Sound, harmonies, the 12-tone scale, and its alternatives

I have been mulling over how to write this post for several days now. A good friend of mine, and one of the brightest minds I know, confessed recently that he did not understand the mathematics behind the difference between Pythagorean and modern 12-tone scales. I suppose most of my readers don’t know the difference at all, and it seems to me to be worth knowing (though I can’t clearly state why). But it is not an easy thing to describe. Hopefully this post will be more helpful than it is confusing.

Sound

Sound is something many people don’t really understand, so let us start by describing what it is.

I am intentionally ignoring many details here. For example, my definition of pitch should mean that a chord has a pitch lower than any of its constituent notes. I can explain many of these nuances, but this isn’t the post for that. Add in enough “‍approximately‍”s and “‍generally‍”s and what I say is accurate.

Sound is cyclically changing air pressure. The rate at which it cycles is pitch, the difference between high and low pressure is volume, and the pattern of pressures in each cycle is the “‍quality‍” of the sound—the difference between the sound of a violin and a trumpet, for example.

Consider a plucked string. The string will wiggle all around as it moves, but at any given moment any given part of the string will be moving in some direction. In front of the string as it accelerates the air will be pressurized and behind it the air will be depressurized. Mostly those two pressures will cancel out: the air will flow from the high-pressure side of the string to the low-pressure side and that’s the end of it. But some of that high pressure will travel outward from the string and then, as the string reverses its motion, a similar low pressure will travel outward, creating a sound (cyclically changing pressure) near the string. If you put a nice flat surface that can vibrate near the string, such as the sound-board of a guitar, that flat surface will be pushed away in the high pressure and pulled back in the low. Since it’s big and flat, it is much harder for the high-pressure air on one side of the moving surface to rush around to fill in the low-pressure air on the other side, so that sound will travel through the air much farther than it would from a string alone.

There is much more to the physics of sound itself, but that’s enough to work with. As an object moves, air is pressurized and then depressurized in front of it. Because of the momentum of the air itself, both the high- and low-pressure fronts travel through the air in the same direction instead of simply canceling each other out, and that means that the air in our ears oscillates between high and low pressure.

Octaves and Harmonics

It is common to refer to the pitch of sound in Hertz. The Hertz of a sound (or of any other cyclic pattern) is the number of times it performs a complete cycle in one second. 440Hz (the typical A above middle C) means the air goes from high pressure to low pressure and back to high pressure four hundred and forty times every second. The pendulum of a typical grandfather clock swings back and forth at 0.5Hz, ticking twice each cycle for a 1Hz tick. Sunlight cycles at about 0.0000116Hz; seasons cycle at about 0.0000000317Hz.
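For the curious, here is a small sketch of where the sunlight and season numbers come from. It assumes a 24-hour day and a 365.25-day year; the function and constant names are my own.

```python
# Frequency in Hz is simply 1 / (length of one cycle in seconds).
SECONDS_PER_DAY = 24 * 60 * 60               # 86,400 seconds
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # assumption: a 365.25-day year


def hertz(period_seconds):
    """Cycles per second for something that repeats every period_seconds."""
    return 1.0 / period_seconds


day_hz = hertz(SECONDS_PER_DAY)    # one sunlight cycle per day
year_hz = hertz(SECONDS_PER_YEAR)  # one cycle of seasons per year

print(f"sunlight: {day_hz:.3g} Hz")
print(f"seasons:  {year_hz:.3g} Hz")
```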

One sound is said to be one octave higher (or lower) than another if it has exactly twice (or half) the Hz. There are eight A keys on a piano keyboard, being (from lowest to highest) 27.5Hz, 55Hz, 110Hz, 220Hz, 440Hz, 880Hz, 1760Hz, and 3520Hz.
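The list of A keys is just repeated doubling. A minimal sketch, with names of my own choosing:

```python
LOWEST_A_HZ = 27.5  # the lowest A on the piano, from the list above


def octaves_above(base_hz, count):
    """Return `count` pitches, each one octave (x2 Hz) above the previous."""
    return [base_hz * 2 ** k for k in range(count)]


a_keys = octaves_above(LOWEST_A_HZ, 8)
print(a_keys)
# [27.5, 55.0, 110.0, 220.0, 440.0, 880.0, 1760.0, 3520.0]
```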

As an aside, there is a physical reason we can’t hear all pitches. Very high sounds (more than about 20 thousand Hz) switch from high to low pressure so quickly that the ear drum cannot be pushed back and forth by them; instead, vibrations cross through the depth of the ear drum in place. For very low sounds (less than about 20Hz) the natural springiness and damping of the ear drum pulls it back into place before the changing pressure can finish displacing it. In both cases the physical motion of the ear drum that the nerves normally detect is absent. Sounds too low to hear can be felt as vibrations or even shaking, but sounds too high to hear usually cannot be sensed by the body at all.

The whole-number multiples of a given note’s pitch (2×, 3×, 4×, and so on) are called its harmonics; the octaves above the note are among them. Harmonics are important mathematically because every shape of cyclic oscillation can be described as the combination of harmonic sine waves; they are important physically because most vibrating media will contain higher harmonics in lesser volume. They are also how we distinguish between chords and tones. Harmonics are beyond the scope of this post; if you are interested leave a comment and I’ll write them up in a later post.

Harmony

The psychology of what we perceive as pleasant musically is immensely complicated, but one element of it is pretty simple. Two notes will harmonize if the ratio of their pitches is (nearly) a fraction with a small numerator and denominator. Thus 220Hz and 330Hz harmonize well (3/2) while 170Hz and 195Hz are much more dissonant (39/34). This applies likewise to chords of more than two notes. The ear does fudge things a bit, though, so 220Hz and 331Hz harmonize fairly well because they are “close to” a very nice harmony.
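If you want to check the fractions yourself, Python’s standard `fractions` module will reduce them for you. This is only a sketch: the helper names are mine, and the cutoff of 8 for what counts as a “small” denominator is an arbitrary assumption, not a rule of acoustics.

```python
from fractions import Fraction


def pitch_ratio(low_hz, high_hz):
    """Exact ratio of two whole-number pitches, reduced to lowest terms."""
    return Fraction(high_hz, low_hz)


def nearest_simple_ratio(low_hz, high_hz, max_denominator=8):
    """Closest fraction whose denominator is at most max_denominator.

    The default of 8 is an assumed stand-in for "small".
    """
    return pitch_ratio(low_hz, high_hz).limit_denominator(max_denominator)


print(pitch_ratio(220, 330))           # 3/2
print(pitch_ratio(170, 195))           # 39/34
print(nearest_simple_ratio(220, 331))  # 3/2
```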

Just Scales

A scale is a set of pitches used to create music. Scales are very important for instruments like harps, less so for violins and trombones, but needed or not they are an element of every musical tradition of which I am aware.

The obvious thing to do when creating a scale is to pick notes that harmonize with one another. These tunings are called “‍just‍” scales. You pick a note (say 440Hz) and add to it notes that harmonize well (perhaps 660 and 880), and then notes that harmonize with those, and so on.

Just scales are problematic. Let’s take a simple example, where for each note we want the notes one octave up (×2 Hz) and down (÷2 Hz) as well as the “fifths” 2/3 and 3/2. Starting at 1Hz to make the math easy, what is the next higher note in the scale? We clearly have 3/2, but we also have 2/1 × 2/3 = 4/3, and 3/2 × 3/2 × 1/2 = 9/8 too. It doesn’t stop there; we’ll also need 256/243, and 531441/524288, and so on. In fact, these simple rules generate infinitely many notes between any two other notes.
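Here is a rough sketch of that rule in code: stack some number of fifths, then shift by octaves until the result lands between 1 and 2. The function names are my own invention.

```python
from fractions import Fraction


def fold_into_octave(ratio):
    """Halve or double the ratio until it satisfies 1 <= ratio < 2."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def stacked_fifths(k):
    """Go up k fifths (down if k is negative), then fold into one octave."""
    return fold_into_octave(Fraction(3, 2) ** k)


for k in (1, -1, 2, -5, 12):
    print(k, stacked_fifths(k))
# 1 3/2
# -1 4/3
# 2 9/8
# -5 256/243
# 12 531441/524288
```

No power of 3 is ever a power of 2, so every value of k gives a different note, which is why the list never ends.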

The Pythagorean scale is a just scale based on exactly this simple rule, but only adds fifths and octaves a few times. Because of that, it has a different problem: instead of having infinitely many notes it contains many notes that have very few harmonies within the scale. You can’t change key within one Pythagorean scale; a new key requires a completely new scale.

Equally Tempered Scales

The main alternative to just scales is “equally-tempered” scales, where the ratio of any two adjacent notes is constant. The common 12-tone scale of the piano keyboard, for example, has the ratio of the pitch of two keys x apart equal to 2^(x/12).
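As a quick illustration, this sketch computes the pitch of a key a given number of keys away from the A at 440Hz. Anchoring on 440Hz and the function names are my choices.

```python
def equal_tempered_ratio(steps, tones_per_octave=12):
    """Pitch ratio between two notes `steps` apart in an equally tempered scale."""
    return 2 ** (steps / tones_per_octave)


def key_frequency(steps_from_a440):
    """Pitch of the piano key this many keys above (or below) the A at 440Hz."""
    return 440.0 * equal_tempered_ratio(steps_from_a440)


print(key_frequency(12))            # 880.0, one octave up
print(round(key_frequency(-5), 2))  # 329.63, the E five keys below that A
```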

Well-selected equally tempered scales end up almost in tune. 2^(7/12) is almost the “perfect fifth” of 3/2, differing from it only by 0.11%. Errors larger than that can come from moving your head while listening to music. An apprentice piano tuner told me he was being trained to hold his head still enough that he could actually hear the error; pressing the A below middle C and the E above it, the error sounded like a pulse in the chord that repeated every 1.35 seconds (0.74Hz).

This pulse deserves a bit more discussion. A well-tuned A pulses from high to low to high pressure 220 times a second. A well-tuned E does so 329.63 times a second. Sometimes the high pressure from the A and E will both reach your ear together, making a louder sound; other times a high from one and a low from the other will cancel each other out, making a quieter sound. If E were 330Hz, you’d hear this combined pressure pattern repeat at 110Hz: two cycles for the A, three for the E. But since the E isn’t quite 330Hz, it doesn’t quite keep up with the A and the two slowly slip into different arrangements. One moment you’ll have a pattern where two high pressures arrive together 110 times a second, half a second later two low pressures will arrive together 110 times a second, and so on.
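The pulse rate can be estimated from those two pitches. This sketch assumes the pulse is the mismatch between three times the A and twice the E, the two multiples that would coincide exactly if the fifth were a perfect 3/2; the variable names are mine.

```python
A_HZ = 220.0
E_HZ = 329.63  # the equally tempered E quoted above

# With a perfect fifth, 3 x A and 2 x E would both be exactly 660Hz.
# Assumption: the audible pulse is the gap between those two multiples.
pulse_hz = abs(3 * A_HZ - 2 * E_HZ)
pulse_period_s = 1 / pulse_hz

print(f"pulse: {pulse_hz:.2f} Hz, once every {pulse_period_s:.2f} seconds")
# pulse: 0.74 Hz, once every 1.35 seconds
```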

In any case, the 12-tone equally tempered scale is really close to being in tune. In a standard C-E-G major chord we’ve got 2^(4/12) 1% off from 5/4, 2^(3/12) 1% off from 6/5, and 2^(7/12) 0.1% off from 3/2. Not every pair is quite so nice; 2^(1/12) is 0.3% off from 17/16 (notice the large numbers in the fraction) and 2^(6/12) isn’t very close to any reasonable fraction and is quite solidly discordant. It turns out that having the ability to create discord is quite handy, so the 12-tone equally-tempered scale works out very nicely.
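Those percentages are easy to recompute. A small sketch, with a helper name I made up:

```python
from fractions import Fraction


def percent_off(steps, just_ratio, tones_per_octave=12):
    """How far (in percent) the tempered interval is from the just fraction."""
    tempered = 2 ** (steps / tones_per_octave)
    return 100 * abs(tempered - float(just_ratio)) / float(just_ratio)


# Number of keys apart -> the just fraction it approximates.
intervals = {
    4: Fraction(5, 4),
    3: Fraction(6, 5),
    7: Fraction(3, 2),
    1: Fraction(17, 16),
}

for steps, just in intervals.items():
    print(f"2^({steps}/12) vs {just}: {percent_off(steps, just):.2f}% off")
```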

What are we missing?

The job of a scale is to reduce the infinite array of pitches to a manageable finite set. Thus, for any scale we are intentionally excluding almost everything.

Some composers have intentionally used non-standard scales to explore some of these missing elements of the 12-tone scale. The 31-tone equally-tempered scale has approximations of more harmonies, for example, as well as 2^(22/31), which is more discordant than 2^(6/12). An equally-tempered scale based on dividing the golden ratio 1.6180339887… instead of dividing 2 would be in some sense “optimally discordant”.
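To get a feel for the 31-tone scale, this sketch finds the nearest step to a few just fractions in both scales and reports how far off each one is. The choice of fractions (including 7/4, which I have not discussed above) and the helper names are my own assumptions.

```python
import math
from fractions import Fraction


def best_step(just_ratio, tones_per_octave):
    """Number of scale steps that lands closest to the just ratio."""
    return round(tones_per_octave * math.log2(float(just_ratio)))


def percent_error(just_ratio, tones_per_octave):
    """Percent difference between the just ratio and its nearest scale step."""
    steps = best_step(just_ratio, tones_per_octave)
    tempered = 2 ** (steps / tones_per_octave)
    return 100 * abs(tempered - float(just_ratio)) / float(just_ratio)


for just in (Fraction(3, 2), Fraction(5, 4), Fraction(6, 5), Fraction(7, 4)):
    print(
        f"{just}: 12-tone {percent_error(just, 12):.2f}% off, "
        f"31-tone {percent_error(just, 31):.2f}% off"
    )
```

Not every interval improves: the fifth is slightly worse in the 31-tone scale, while 5/4, 6/5, and 7/4 all come out closer.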

For the most part, though, the 12-tone scale is “close enough” to most just scales that we can play the music we want to play. And when it’s not close, we can always switch to fretless strings, trombones, slide whistles, Ozark harps, jugs, kazoos, and the ever-versatile vocal cords.