The Psychology of Music and the ‘tuneR’ Package

Introduction

This semester I’m TA’ing a course on the Psychology of Music taught by Phil Johnson-Laird. It’s been a great course to teach because (i) so much of the material is new to me and (ii) the study of the psychology of music brings together so many of the intellectual tools I enjoy, including music theory, psychophysics and Fourier analysis.

One topic this semester that was completely new to me was the theory of tuning: I had known about the invention of the well-tempered system of tuning, but had never heard of Pythagorean tuning or just tuning — and certainly was not aware that the well-tempered system Bach celebrated was not identical to our current equal-tempered system of tuning.

As a way of consolidating some of the knowledge I’ve gained, I decided I’d write a blog entry after several months of neglecting this blog. (For that neglect, I’ll blame a combination of grant writing, book writing, ongoing research projects and personal life developments.) In what follows, I’ll give a brief overview of the theory of tuning at a level that should be accessible to anyone who’s familiar with the names of intervals and feels comfortable thinking quantitatively.

After surveying the field, I’ll turn to a discussion of some code I’ve written in R that implements these ideas using the ‘tuneR’ package, which is one of my favorite hidden gems from CRAN. Along the way, I’ll introduce some of the simplest tools from the ‘tuneR’ package that can be used for generating computer music.

Tuning Systems: Pythagorean, Just and 12-Tet

It’s worth noting right at the start that tuning is a misleading name for the topic we’ll be discussing: we’re not talking about how one tunes a fixed instrument so that it sounds in tune, but rather we’re interested in how one defines the very notes that the instrument should be able to produce when it’s perfectly in tune.

To make that clear, let’s assume that we’ve accepted as a given that a frequency of 440 Hz will be called A. Our problem then becomes one of deciding which of the infinitely many frequencies we could produce actually deserves the label of A#, B, C, C#, and so on.

Pythagorean Tuning

The simplest solution to this problem I know of is the Pythagorean tuning system. It’s based on constructing all of the possible notes using a series of perfect fifths. If you remember the Circle of Fifths, you’ll remember that you can reach every chromatic note by ascending fifths: if you start at A, you’ll proceed through E, B, F# and so on.

The Pythagorean system implements the Circle of Fifths directly using repeated multiplication of a base frequency. To do this, you first declare that a perfect fifth is at a frequency 3/2 times your base frequency. For example, this definition implies that the perfect fifth above the A at 440 Hz has to be at a frequency of 3/2 * 440 = 660 Hz. Once you do this, you’ve defined the frequency we’ll call E.

And following on with this logic, you produce a B at 3/2 * 660 = 990 Hz. Of course, this B lies more than an octave above the base A at 440 Hz, so you transpose it down an octave to produce the B you’ll actually use. To do this, you need to assume that an octave is at a frequency 2 times the base frequency. Since we’ve accepted that 990 Hz is a B, we divide 990 by 2 and conclude that 495 Hz should be B.

With these three notes defined, we have the following table of frequency/note pairs:

| Note | Frequency | Ratio with 440 Hz |
| --- | --- | --- |
| A | 440 Hz | 1 |
| E | 660 Hz | 3/2 |
| B | 495 Hz | 9/8 |

If we continue on with this logic, multiplying and dividing by 3/2 and shifting by octaves (factors of 2) so that every note lands between 440 Hz and 880 Hz, we will eventually produce a complete table for all of the notes in the chromatic scale that looks like the following:

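| Note | Frequency (Hz) | Ratio |
| --- | --- | --- |
| A | 440 | 1 |
| A# | 463.5391 | 256/243 |
| B | 495 | 9/8 |
| C | 521.4815 | 32/27 |
| C# | 556.875 | 81/64 |
| D | 586.6667 | 4/3 |
| D# | 626.4844 | 729/512 |
| E | 660 | 3/2 |
| F | 695.3086 | 128/81 |
| F# | 742.5 | 27/16 |
| G | 782.2222 | 16/9 |
| G# | 835.3125 | 243/128 |
| A | 880 | 2 |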
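The table can be reproduced in a few lines of base R. This is only a sketch: the position of each note on the chain of fifths is inferred from the ratios in the table (for example, 4/3 is one fifth below A, folded up an octave), and the variable names are made up for illustration.

```r
base <- 440  # reference A in Hz

# Position of each note on the chain of fifths relative to A:
# positive = fifths above A, negative = fifths below A.
fifths <- c("A" = 0, "A#" = -5, "B" = 2, "C" = -3, "C#" = 4, "D" = -1,
            "D#" = 6, "E" = 1, "F" = -4, "F#" = 3, "G" = -2, "G#" = 5)

# Stack the fifths, then fold each ratio back into the octave [1, 2)
# by dividing out the appropriate power of 2.
raw <- (3/2)^fifths
ratio <- raw / 2^floor(log2(raw))

pythagorean <- data.frame(note = names(fifths),
                          frequency = round(base * ratio, 4),
                          ratio = round(ratio, 6),
                          row.names = NULL)
print(pythagorean)
```

The final A at 880 Hz is left out on purpose: it is defined as exactly twice the base frequency rather than derived from the chain of fifths.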

One thing about this table might strike you as odd if you’re mathematically savvy: the octave, which we’ve defined by fiat as a ratio of 2:1, could never have been produced by successive multiplication by 3/2, since no power of 3 will be evenly divisible by a power of 2. This is the one flub in the Pythagorean system: you can’t really produce the entire chromatic scale using only multiples of 3/2. Here we’ve solved that problem by replacing the note we would have called A with a true octave generated using multiplication by 2. Twelve stacked fifths overshoot seven true octaves by a small amount, and you may hear people refer to this discrepancy as the Pythagorean comma.
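The size of that discrepancy is easy to check numerically. The snippet below is a small base R sketch that compares twelve stacked fifths with seven octaves and converts the gap to cents (1200 cents per octave); the variable names are just for illustration.

```r
twelve_fifths <- (3/2)^12  # climb twelve perfect fifths from the base note
seven_octaves <- 2^7       # the nearest "true" octave, seven octaves up

# The Pythagorean comma as a frequency ratio: 3^12 / 2^19
comma <- twelve_fifths / seven_octaves

# The same gap expressed in cents
comma_cents <- 1200 * log2(comma)

print(comma)
print(comma_cents)
```

The ratio comes out a little above 1 (roughly 1.0136, or about 23.5 cents), which is why the chain of fifths never closes exactly on an octave.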

Just Tuning

Given that we had to cheat a bit to create a proper octave using the Pythagorean tuning system based on multiples of 3/2, it makes sense to ask why we shouldn’t just allow ourselves to use other multipliers than 3/2. Looking at the Pythagorean tuning table, we see some pretty ugly fractions like 729/512. What if we forced these fractions to be simpler by employing ratios like 4/3 and 5/4 to build up the whole system?

The result of allowing ourselves several fractions beyond just those derived from 3/2 is called the just tuning system. Here we assume that perfect fifths occur at a frequency ratio of 3/2 and that perfect fourths occur at a frequency ratio of 4/3. Continuing on with this process, we eventually end up with the following tuning table:

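| Note | Frequency (Hz) | Ratio |
| --- | --- | --- |
| A | 440 | 1 |
| A# | 469.3333 | 16/15 |
| B | 495 | 9/8 |
| C | 528 | 6/5 |
| C# | 550 | 5/4 |
| D | 586.6667 | 4/3 |
| D# | 625.7778 | 64/45 |
| E | 660 | 3/2 |
| F | 704 | 8/5 |
| F# | 733.3333 | 5/3 |
| G | 782.2222 | 16/9 |
| G# | 825 | 15/8 |
| A | 880 | 2 |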
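Since each entry is just the base frequency times a small-integer fraction, the table is straightforward to regenerate. The following base R sketch stores the numerators and denominators from the table separately; the variable names are illustrative only.

```r
base <- 440  # reference A in Hz

notes <- c("A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A")

# Numerators and denominators of the just ratios, copied from the table
num <- c(1, 16, 9, 6, 5, 4, 64, 3, 8, 5, 16, 15, 2)
den <- c(1, 15, 8, 5, 4, 3, 45, 2, 5, 3,  9,  8, 1)

just <- data.frame(note = notes,
                   frequency = round(base * num / den, 4),
                   ratio = paste(num, den, sep = "/"))
print(just)
```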

This is the tuning that early Classical music was written in. Looking at the table you can immediately appreciate the theoretical assertion that the relative dissonance of an interval is determined by the simplicity of the ratio of frequencies between the two notes: perfect fifths are 3/2 and major thirds are 5/4, while minor seconds are 16/15 and major sevenths are 15/8. This is one of the things I most enjoy about the theory of harmony: there’s a match between the aesthetics of fractions and the aesthetics of sounds that, for me, helps to justify my sense that certain fractions are more beautiful than others.

12-Tet / Equal Temperament

Now, if you know the history of Bach’s Well-Tempered Clavier, you know that there is a problem with the just tuning system: it sounds great in the key you used as the base (here A), but it sounds a bit out of tune in other keys. The modern 12-tet system is the most recent approach to solving this problem: you assume that the frequency ratio between any two notes a semitone apart (e.g. A to A# or A# to B) is always exactly the same. Since you’ll repeat this multiplication 12 times before reaching an octave, you can conclude that this ratio must be the 12th root of 2, roughly 1.0595. Building a tuning system using that ratio alone gives us our modern system of tuning, which is shown in the table below using the decimal expansion of the ratios instead of their representation as powers of the 12th root of 2:

| Note | Frequency (Hz) | Ratio |
| --- | --- | --- |
| A | 440 | 1.000000 |
| A# | 466.1638 | 1.059463 |
| B | 493.8833 | 1.122462 |
| C | 523.2511 | 1.189207 |
| C# | 554.3653 | 1.259921 |
| D | 587.3295 | 1.334840 |
| D# | 622.2540 | 1.414214 |
| E | 659.2551 | 1.498307 |
| F | 698.4565 | 1.587401 |
| F# | 739.9888 | 1.681793 |
| G | 783.9909 | 1.781797 |
| G# | 830.6094 | 1.887749 |
| A | 880 | 2.000000 |
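Because every semitone is the same multiplier, the whole table follows from a single expression. Here is a minimal base R sketch (variable names are illustrative) that raises the 12th root of 2 to the powers 0 through 12:

```r
base <- 440  # reference A in Hz

notes <- c("A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A")

# n semitones above the base note corresponds to a ratio of 2^(n/12)
ratio <- 2^((0:12) / 12)

equal <- data.frame(note = notes,
                    frequency = round(base * ratio, 4),
                    ratio = round(ratio, 6))
print(equal)
```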

Listening to the Results

We’ve just described three ways to define the notes used in Western music. But how different do they sound? To answer that, I decided to produce a series of simple sine wave audio samples that were tuned using each of the three tuning systems. To produce those audio samples, I used the ‘tuneR’ package, which I’ll describe now. Before you read on, you should install it from CRAN using the standard install.packages('tuneR') invocation.

A tuneR Tutorial

The tuneR package is an extremely convenient tool for generating audio files from R based on a numeric description of the audio stream. For the purposes of this discussion of tuning systems, we simply need to produce basic sine waves. Thankfully, that’s very easy to do with tuneR. Here’s an example:

```r
library('tuneR')

# One second (the default duration) of a 440 Hz sine wave at 16 bits
sound <- sine(440, bit = 16)

# Save the result as a WAV file in the current working directory
writeWave(sound, '440.wav')
```

Here we’ve loaded the tuneR package, created a one-second snippet of sine wave audio at 16-bit resolution using the sine function, and then written out the audio to a WAV file using writeWave. If you look at your current directory and listen to this file, you’ll hear a sine wave at 440 Hz.

If you want to explore the use of sine, you can easily play with the duration of the sound by changing the duration parameter. If you want to, you can also change the sample rate and the bit depth, but I don’t see any reason to do that while exploring ideas about tuning.
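As a small illustration of the duration parameter, the sketch below asks for two seconds of audio. It relies on passing xunit = "time" so that duration is counted in seconds (by default tuneR counts it in samples); the output file name is just a placeholder.

```r
library(tuneR)

# Two seconds of a 440 Hz sine wave; xunit = "time" makes `duration`
# count seconds instead of samples.
sound <- sine(440, duration = 2, xunit = "time", bit = 16)

# Length in seconds = number of samples / sampling rate
seconds <- length(sound@left) / sound@samp.rate
print(seconds)

writeWave(sound, "440_two_seconds.wav")
```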

More important is knowing that you can superimpose two sine waves using the + operator and that you can concatenate them using the bind function. To show off producing octaves, for example, you might use the following code to hear an A at 440 Hz, then an A an octave above it, and finally the harmony they produce together:

```r
library('tuneR')

# A at 440 Hz, then A at 880 Hz, then both at once (naive sum, see below)
sound <- bind(sine(440, bit = 16),
              sine(880, bit = 16),
              sine(440, bit = 16) + sine(880, bit = 16))

writeWave(sound, 'octaves.wav')
```

Unfortunately, this sample code produces an error because of the naive addition we’ve implemented using the + operator. Adding two full-scale sine waves directly together produces sample values that exceed the range of the 16-bit depth we’re using. To safely perform addition of two sine waves, we need to rescale the results of our summation using the normalize function. This gives us just one more line of code:

```r
library('tuneR')

sound <- bind(sine(440, bit = 16),
              sine(880, bit = 16),
              sine(440, bit = 16) + sine(880, bit = 16))

# Rescale the samples so they fit into the 16-bit range again
sound <- normalize(sound, unit = '16')

writeWave(sound, 'octaves.wav')
```

For reasons that are not clear to me, you have to specify the bit depth to normalize using the unit parameter rather than the bit parameter.
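To see why the normalization step matters, it can help to look at the sample values directly. The sketch below assumes that a mono Wave object keeps its samples in the left slot and that 32767 is the largest magnitude a 16-bit file should contain; the variable names are illustrative.

```r
library(tuneR)

a <- sine(440, bit = 16)
b <- sine(880, bit = 16)

# Naive sum: peaks of the two waves add up and overshoot the 16-bit range
mixed <- a + b
peak_before <- max(abs(mixed@left))

# After normalizing, the loudest sample fits into 16 bits again
fixed <- normalize(mixed, unit = "16")
peak_after <- max(abs(fixed@left))

print(c(before = peak_before, after = peak_after))
```

Normalizing rescales the whole signal, so the relative loudness of the two sine waves is preserved; only the overall level changes.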

Demoing Tuning Systems

Our little octave demo is cute, but we really want to know what more interesting harmonies like major thirds and minor seconds sound like in the various tuning systems we described. To do that, I first wrote a function called interval that spits out the multiplier you need to use to produce a given interval for any of the three tuning systems. That function is in a GitHub repository I’ve set up with code for making these demos. If you download that repository, you could load my interval function using a simple call to source like the one seen below. And using this interval function, we can generate demos of various intervals as follows:

```r
library('tuneR')
source('interval.R')

base <- 440

sound <- sine(base) + sine(interval('minor-second', tuning = 'pythagorean') * base)

sound <- normalize(sound, unit = '16')

writeWave(sound, 'minor_second_pythagorean.wav')
```
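The real interval function lives in the repository and isn’t reproduced here. As a rough idea of how such a lookup could work, here is a hypothetical stand-in in base R. It is not the repository code: the name interval_sketch, the tuning labels 'just' and 'equal', and every interval name other than 'minor-second' are assumptions made for this example. The ratios themselves come from the three tables above.

```r
# Hypothetical stand-in for the repository's interval(); not the original code.
interval_sketch <- function(name, tuning = c("pythagorean", "just", "equal")) {
  tuning <- match.arg(tuning)

  # Number of semitones spanned by each named interval (names are assumed)
  semitones <- c("unison" = 0, "minor-second" = 1, "major-second" = 2,
                 "minor-third" = 3, "major-third" = 4, "perfect-fourth" = 5,
                 "tritone" = 6, "perfect-fifth" = 7, "minor-sixth" = 8,
                 "major-sixth" = 9, "minor-seventh" = 10,
                 "major-seventh" = 11, "octave" = 12)

  # Frequency multipliers for 0 to 12 semitones in each tuning system
  ratios <- list(
    pythagorean = c(1, 256/243, 9/8, 32/27, 81/64, 4/3, 729/512,
                    3/2, 128/81, 27/16, 16/9, 243/128, 2),
    just        = c(1, 16/15, 9/8, 6/5, 5/4, 4/3, 64/45,
                    3/2, 8/5, 5/3, 16/9, 15/8, 2),
    equal       = 2^((0:12) / 12)
  )

  if (!name %in% names(semitones)) stop("unknown interval: ", name)
  ratios[[tuning]][semitones[[name]] + 1]
}

# Frequency of the note a minor second above A in each system
440 * interval_sketch("minor-second", tuning = "pythagorean")
440 * interval_sketch("minor-second", tuning = "just")
440 * interval_sketch("minor-second", tuning = "equal")
```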

On GitHub there’s a file called test_intervals.R that will go through and generate all of the intervals in all three tuning systems. If you run that file, you’ll generate a lot of audio files you can listen to as demos of the three tuning systems we’ve described. For me, these tuning systems all produce intervals that sound surprisingly similar, though at high volumes I find it moderately easy to hear slight differences between the tuning systems. That said, I very much doubt I would pick up on them in a normal musical context.
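One way to put a number on how similar the systems are is to express each interval in cents (hundredths of an equal-tempered semitone, so 1200 cents per octave) and compare against 12-tet. The sketch below uses only base R and the ratios from the three tables above; the interval labels and variable names are just for illustration.

```r
intervals <- c("unison", "minor second", "major second", "minor third",
               "major third", "perfect fourth", "tritone", "perfect fifth",
               "minor sixth", "major sixth", "minor seventh",
               "major seventh", "octave")

pythagorean <- c(1, 256/243, 9/8, 32/27, 81/64, 4/3, 729/512,
                 3/2, 128/81, 27/16, 16/9, 243/128, 2)
just        <- c(1, 16/15, 9/8, 6/5, 5/4, 4/3, 64/45,
                 3/2, 8/5, 5/3, 16/9, 15/8, 2)
equal       <- 2^((0:12) / 12)

# Convert a frequency ratio to cents
cents <- function(ratio) 1200 * log2(ratio)

# How far each interval sits from its equal-tempered counterpart
deviation <- data.frame(
  interval    = intervals,
  pythagorean = round(cents(pythagorean) - cents(equal), 2),
  just        = round(cents(just) - cents(equal), 2)
)
print(deviation)

# Largest gap across both systems, in cents
largest <- max(abs(deviation$pythagorean), abs(deviation$just))
print(largest)
```

With these ratios, no interval differs from its equal-tempered version by more than about 16 cents, which fits the impression that the differences are audible but subtle.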

That’s the end of my little introduction to tuning systems and the use of the tuneR package to explore them. If you’re interested in thinking computationally about music, I highly recommend playing around with tuneR until you feel like you can produce interesting results. I’m already working on trying to build up some interesting timbres to work with.

12 responses to “The Psychology of Music and the ‘tuneR’ Package”

1. > Just Tuning

What just tuning is this, specifically? It is close to Ptolemy’s famous just intonation scale, but slightly different.

> This is the tuning that early Classical music was written in.

Citation needed. Also, you mean Classical as in Classical Greece, right?

2. Don’t use sine waves to compare tunings! Collisions between overtones are a huge factor in the dissonance of a given interval and its overall sound, so you need a timbre with overtones to make comparisons. Since most instruments produce a full set of harmonics, sawtooth or pulse waves are good, but triangle and square waves (like a clarinet) can be interesting too.

Also, the statement that just intonation sounds out of tune in some keys isn’t strictly correct. *If* you limit yourself to 12 notes and use a naively constructed tuning some keys will sound “out of tune” although “differently tuned” is probably a better term. When the number of notes isn’t arbitrarily limited, a flexibly pitched instrument can play any key the same, like barbershop quartets are famous for doing.

3. You may be interested in this tool as well:

http://www.huygens-fokker.org/scala/

“Scala is a powerful software tool for experimentation with musical tunings, such as just intonation scales, equal and historical temperaments, microtonal and macrotonal scales, and non-Western scales”

4. Don’t forget Meantone Temperament! It takes the 5-limit Just scale and tempers out the comma 81/80, or the difference between 4 fifths and 1 major third (3/2*3/2*3/2*3/2*1/5 = 81/80). A good example of meantone is in 19-tone equal temperament, where the perfect fifth is slightly flat, meaning that the major third will be closer to 5/4. We can then reduce the 2 dimensions of just intonation (ratios of 3 and ratios of 5) into a single dimension (~696 cents, or 1.495 as a decimal) with a minimal amount of error.

http://xenharmonic.wikispaces.com/Meantone

5. Unless I’ve misunderstood something, it seems as though you’re saying we jumped straight from 5-limit JI to 12-note well temperaments, such as Werckmeister and such. However, there’s an all-important intermediate step that you’re missing, which is that first we adopted “meantone” temperament: the temperament that vanishes 81/80.

This isn’t just a compromise that we made for the sake of having more harmonies with less notes; it actually created fundamental structural changes to Western music that are difficult to get away from. For example, consider the chord progression Cmaj | Am | Dm | Gmaj | Cmaj. If Cmaj -> Am is motion down by 6/5, and then Am -> Dm and Dm -> Gmaj and Gmaj -> Cmaj are all motion upward by 4/3, then once you transpose back within the octave you’ll end up the JI interval of 81/80 lower than when you started. Since such “JI-violating” chord progressions are so common (they’re technically called “comma pumps”), this temperament is more or less fundamental to the structure of all western music; it was only after adopting quarter-comma meantone that we finally adopted things like Werckmeister and ultimately 12-equal (both of which support meantone temperament).

In general, most of the active development of microtonal theory is taking place in one of these three locations:

Hope you find some of this info useful.

6. You also hurry a bit too fast over well-temperaments there. Well-temperaments, contrary to popular belief, are distinct from equal temperaments. Well-temperaments are temperaments where the commas are unequally distributed in a way that makes all keys playable, but all keys different- some key has, say, a slightly flatter third degree, some other has a slightly sharper seventh, etc. Werckmeister’s different tunings- the ones he developed in collaboration with Bach – actually are such tunings. Essentially all keys get their own characteristics – something that might explain why composers of the classical era considered some keys sadder, some keys more happy, some keys more majestic, etc.

Of course, the size of the semitones in such a well-temperament will average* to 2^(1/12), (but so will the steps in a 12-tone pythagorean scale with a wolf), but the deviations and variances from 2^(1/12) will be smaller. (If the error is cleverly distributed, all intervals will get closer to some 2^(x/12).)

The idea that well-temperaments and equal-temperaments are the same thing appeared in some music theory book during the late 19th or early 20th century, and has since been quite difficult to get rid of.

Meantone, of course, tries to hit 5/4 better than it tries to hit 3/2. 5/4 has a more distant reasonable comma wrt the octave, that is, you need to go something like (5/4)^31 or somesuch to get a reasonably close interval. To get a good 5/4, you average (well, not exactly, but close enough) the two seconds needed to get there – 9/8 and 10/9, and to do this, you need to have a slightly detuned fifth – about half as flat as your new second is flat compared to 9/8. Some archicembalos and organs with up to 36 tones per octave have been built for complete meantone scales – during the renaissance by Nicola Vicentino, and during the 20th century by Adrian Fokker. Meantone on instruments with fewer notes will of course leave some keys unusable.

* not the normal arithmetic average, necessarily, but rather some kind of geometric average.

7. > I actually meant that it was the tuning used prior to the invention of the well-tempered system. But feel free to correct me and provide a citation for a more accurate statement. My understanding is that we don’t quite know which tuning the Greeks used; the Pythagorean tuning may actually be a Medieval invention from what I know.

There were LOTS of tuning systems before the well-tempered ones.

http://www.amazon.com/Tuning-Temperament-Historical-Murray-Barbour/dp/0486434060/

That book includes several ancient Greek systems.

We don’t exactly know a lot about the actual musical practice in ancient Greece, but we have a good number of accounts of their theoretical frameworks, including their ideals of tuning. I guess it’s safe to say reality could be different from those, but it’s hard to tell how different.

The ancient Greek tuning systems were very different from our modern “standard” one. They worked with tetrachords, and those were stacked. The distance between the lowest and highest note was a diatessaron (4:3). That was the fixed part, the interval between the two middle notes was not fixed. There were mainly three types of tetrachords (the difference was on the intervals that subdivided the diatessaron), but there are several accounts of the divisions of those three. Stack several of those and you get different systems.

http://music.dartmouth.edu/~larry/published_articles/divisions_of_the_tetrachord/index.html

http://www.amazon.com/Manual-Harmonics-Nicomachus-Pythagorean/dp/0933999437

http://www.amazon.com/Music-Ancient-Greece-Rome-Landels/dp/0415248434/

http://www.archive.org/details/treatiseonbyzant00hathuoft

8. Hi John! Sorry to revive this topic, but I was looking for some instructions on how to use the tuneR package to handle audio files in R and found your website (which is very cool, by the way).

I’m a total beginner, so I would like to apologize beforehand if I’m asking a silly question:

I’m trying to export a .wav file using the writeWave function from tuneR package in a different folder than the default getwd().
After reading through the manuals of some audio packages I couldn’t figure it out.

I’m picking one 3-hour .wav file and asking the function to take a sample of 1 minute (from minute 100 to minute 101 of the 3-hour file) and saving it in an object “rec”.

Here is what I’m doing:

filename<-"QUm0B24x_SMS05_20100831_052100.wav"
min.ini<-100
min.fin<-101

library(tuneR)
writeWave(rec,"cut1.wav")

It is writing the file in the same directory of the "filename" object. What I want to do is to write the "cut1.wav" file in a different folder.

Any suggestions?

P.S: Sorry for any possible mistakes in my english writing…greetings from Brazil!

9. Hi Humberto,

I don’t have time at the moment to try this out, but there are two obvious potential culprits:

(1) You’re not telling writeWave the absolute path of the file you want it to output
(2) writeWave is broken and won’t write out files to a directory other than the current one

I would bet that (1) is the source of your troubles.

— John