Sunday, June 15, 2008

A sound theory?

[Here, because it will soon vanish behind a subscriber wall, is my latest Muse for Nature News.]

A new theory suggests a natural basis for our preference for musical consonance. But does such a preference exist at all?

What was avant-garde yesterday is often blandly mainstream today. But this normalization doesn’t seem to have happened to experiments in atonalism in Western music. A century has passed since composer Arnold Schoenberg and his supporters rejected tonal organization, yet Schoenberg’s music is still considered by many to be ‘difficult’ at best, and a cacophony at worst.

Could this be because the dissonances characteristic of Schoenberg’s atonal compositions conflict with some fundamental human preference for consonance, embedded in the very way we perceive musical sound? That’s what his detractors have sometimes implied, and it might be inferred also from a new proposal for the origins of consonance and dissonance advanced in a paper by biomathematicians Inbal Shapira Lots and Lewi Stone of Tel Aviv University in Israel, published in the Journal of the Royal Society Interface [1].

Shapira Lots and Stone suggest that a preference for consonance may be hard-wired into the way we hear music. The reason that we prefer two simultaneous tones separated by a pitch interval of an octave or a fifth (seven semitones — the span from the notes C to G, say) rather than ‘dissonant’ intervals such as a tritone (C to F sharp, for instance) is that in the former cases, the ratio of frequencies of the two tones is a simple one: 1:2 for the octave, 2:3 for the fifth. This, the researchers argue, creates robust, synchronized firing of the neural circuits that register the tones.

One reading of this result (although it is one from which the authors hold back) is that Schoenberg’s programme was doomed from the outset because it contravenes a basic physiological mechanism that makes us crave consonance. The reality, however, is much more complicated, both in ways the authors acknowledge and in ways they do not.

Locked in harmony

Here’s the picture Shapira Lots and Stone propose. At the neural level, our response to different pitches seems to be governed by oscillators — either single neurons or small groups of them — that fire and produce an output signal when stimulated by an oscillatory input signal coming from the ear's cochlea. The frequency of the input is the acoustic frequency of the pitch that excites the cochlea, and firing happens when this matches the neural oscillator’s resonant frequency.

A harmonic interval of two simultaneous notes excites two such oscillators. What if they are coupled so that the activity of one can influence that of the other? By considering a biologically realistic form of coupling in which one oscillator can push the other towards the threshold stimulus needed to trigger firing, the researchers calculate that the two oscillators can become ‘mode-locked’ so that their firing patterns repeat with a fixed ratio of periodicities. When mode-locked, the neural responses reinforce each other, which can be deemed to provoke a stronger response to the acoustic stimulus.

Mode-locked synchronization can occur for any frequency ratio of the input signals, but it is particularly stable – the ratio of output frequencies stays constant over a particularly wide range of input frequencies – when the input frequencies are in a ratio close to one of small whole numbers, such as 1:1, 1:2, 2:3 or 3:4. These are precisely the frequency ratios of intervals deemed to be consonant: the unison, octave, fifth, fourth (C to F), and so on. In other words, neural synchrony is especially easy to establish for these intervals.
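The claim that locking is widest for small-number ratios can be illustrated with a toy model. The sketch below does not use the coupled neural oscillators of Shapira Lots and Stone; it swaps in the sine circle map, a standard textbook stand-in for a periodically driven oscillator. The names `winding_number`, `staircase` and `plateau_width`, the coupling strength `k = 1.0`, the grid size and the tolerance are all assumptions chosen for illustration.

```python
from math import pi, sin

def winding_number(omega, k=1.0, steps=1000, transient=200):
    """Average phase advance per step of the sine circle map
    theta -> theta + omega - (k / 2*pi) * sin(2*pi*theta)."""
    theta = 0.0
    for _ in range(transient):  # let the orbit settle first
        theta += omega - k / (2 * pi) * sin(2 * pi * theta)
    start = theta
    for _ in range(steps):
        theta += omega - k / (2 * pi) * sin(2 * pi * theta)
    return (theta - start) / steps

def staircase(grid=500):
    """Output ratio for each input ratio omega on an even grid over [0, 1]."""
    return [winding_number(i / grid) for i in range(grid + 1)]

def plateau_width(values, p, q, tol=2e-3):
    """Fraction of sampled input ratios whose output is locked to p/q."""
    return sum(abs(w - p / q) < tol for w in values) / len(values)

values = staircase()
for p, q in [(1, 2), (2, 3), (3, 4)]:
    print(f"{p}:{q} plateau width ~ {plateau_width(values, p, q):.3f}")
```

In this toy model the 1:2 plateau comes out wider than the 2:3 plateau, which in turn is wider than the 3:4 plateau: the same ordering of stability that the paper reports for its own, more realistic model.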

In fact, the stability of synchrony, judged this way, mirrors the degree of consonance for all the intervals in the major and minor scales of Western music: the major sixth (C-A), major third (C-E) and minor third (C-E flat) are all slightly less stable than the fourth, and are followed, in decreasing order of stability, by the minor sixth (C-A flat), major second (C-D), major seventh (C-B) and minor seventh (C-B flat). One could interpret this as not only rationalizing conventional Western harmony but also supporting the very choice of note frequency ratios in the Western major and minor scales. Thus, the entire scheme of Western music becomes one with a ‘rational’ basis anchored in the physiology of pitch perception.

Natural music?

This is a very old idea. Pythagoras is credited (on the basis of scant evidence) as being the first to relate musical harmony to mathematics, when he noted that ‘pleasing’ intervals correspond to simple frequency ratios. Galileo echoed this idea when he said that these commensurate ratios are ones that do not “keep the ear drum in perpetual torment”.

However, there were some serious flaws in the tuning scheme derived from Pythagoras’s ratios. For one thing, it generated new notes indefinitely whenever tunes were transposed from one key to another – in essence, Pythagorean tuning assigns a different frequency to sharps and their corresponding flats (F sharp and G flat, say), and the result is a proliferation of finely graded notes. What’s more, the major third interval, which was deemed consonant by Galileo’s time, has a frequency ratio of 64:81, which is not particularly simple at all.
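Both flaws follow from simple arithmetic. Here is a minimal sketch, assuming the usual description of Pythagorean tuning as a chain of pure fifths starting from C; the helper names `cents` and `fold_into_octave` are mine, and ratios are written upper note over lower (81/64 rather than 64:81).

```python
from fractions import Fraction
from math import log2

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(float(ratio))

def fold_into_octave(ratio):
    """Halve or double a ratio until it lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

FIFTH = Fraction(3, 2)

# Major third: four fifths up (C-G-D-A-E), folded back into one octave.
pythagorean_third = fold_into_octave(FIFTH ** 4)

# F sharp is six fifths up from C; G flat is six fifths down from C.
f_sharp = fold_into_octave(FIFTH ** 6)
g_flat = fold_into_octave(FIFTH ** -6)
comma = f_sharp / g_flat

print(pythagorean_third)              # 81/64
print(f_sharp, g_flat)                # 729/512 1024/729
print(comma, round(cents(comma), 2))  # 531441/524288 23.46
```

The two 'same' notes differ by roughly a quarter of a semitone, and each further transposition round the cycle of fifths adds more such near-duplicates.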

The frequency ratios of the various intervals were simplified in the sixteenth century by the Italian composer Giuseppe Zarlino (he defined a major third as having a 4:5 ratio, for example), and the resulting scheme of ‘just intonation’ solved some of the problems with Pythagorean tuning. But the problem of transposition was not fully solved until the introduction of equal temperament, beginning in earnest from around the eighteenth century, which divides the octave into twelve equal pitch steps, called semitones. The differences in frequency ratio between Pythagorean, just and equal-tempered intonation are very small for some intervals, but significant for others (such as the major third). Some people claim that, once you’ve heard the older schemes, equal temperament sounds jarringly off-key.
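To put numbers on 'very small' and 'significant', the three schemes can be compared in cents (hundredths of an equal-tempered semitone). This is a rough sketch: the dictionary layout and function names are my own, and only the fifth and the major third are included.

```python
from math import log2

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

# Semitone counts and ratios (upper/lower frequency) for two intervals.
INTERVALS = {
    "major third": {"semitones": 4, "pythagorean": 81 / 64, "just": 5 / 4},
    "fifth": {"semitones": 7, "pythagorean": 3 / 2, "just": 3 / 2},
}

def equal_tempered(semitones):
    """Equal temperament: each semitone multiplies frequency by 2**(1/12)."""
    return 2 ** (semitones / 12)

def compare(name):
    """Sizes of one interval in cents under the three tuning schemes."""
    info = INTERVALS[name]
    return {
        "equal": round(cents(equal_tempered(info["semitones"])), 2),
        "pythagorean": round(cents(info["pythagorean"]), 2),
        "just": round(cents(info["just"]), 2),
    }

for name in INTERVALS:
    print(name, compare(name))
```

The fifth differs by about 2 cents between equal temperament and the other two schemes, whereas the equal-tempered major third sits nearly 14 cents above the just third and about 8 cents below the Pythagorean one.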

In any event, the mathematical and physiological bases of consonance continued to be debated. In the eighteenth century, the French composer Jean-Philippe Rameau rooted musical harmony instead in the ‘harmonic series’ — the series of overtones, with integer multiples of the fundamental frequency, that sound in notes played on any instrument. And the German physiologist Hermann von Helmholtz argued in the nineteenth century that dissonance is the result of ‘beats’: the interference between two acoustic waves of slightly different frequency. If this difference is very small, beats are heard as a periodic rise and fall in the volume of the sound. But as the frequency difference increases, the beating gets faster, and when it exceeds about 20 hertz it instead creates an unpleasant, rattling sensation called roughness. Because real musical notes are complex mixtures of many overtones, there are several potential pairs of slightly detuned tones for any two-note chord. Helmholtz showed that beat-induced roughness is small for consonant intervals of such complex tones, but larger for dissonant intervals.
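Helmholtz's argument can be caricatured in a few lines. The sketch below is not his roughness calculation: it simply counts pairs of overtones whose difference frequency falls in an assumed 'rough' band. The 20 Hz lower edge comes from the text above; the 100 Hz upper edge, the six harmonics per note and the 264 Hz lower note are assumptions made purely for illustration.

```python
def harmonics(fundamental, count=6):
    """Frequencies of the first `count` harmonics of a note, in hertz."""
    return [fundamental * k for k in range(1, count + 1)]

def beating_pairs(f_low, ratio, count=6, band=(20.0, 100.0)):
    """Harmonic pairs of the two notes whose difference frequency lies
    inside the assumed 'rough' band (hertz)."""
    lo, hi = band
    pairs = []
    for a in harmonics(f_low, count):
        for b in harmonics(f_low * ratio, count):
            diff = abs(a - b)
            if lo <= diff <= hi:
                pairs.append((a, b, diff))
    return pairs

fifth = beating_pairs(264.0, 3 / 2)        # pure fifth above 264 Hz
tritone = beating_pairs(264.0, 2 ** 0.5)   # equal-tempered tritone
print(len(fifth), len(tritone))            # 0 3
```

Dropping the lower note two octaves to 66 Hz makes the same fifth produce several pairs inside the band, because all the difference frequencies shrink in proportion to the fundamentals.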

Shapira Lots and Stone argue rightly that their explanation for consonance can explain some aspects that Helmholtz’s cannot. But the reverse is true too: modern versions of Helmholtz’s theory can account for why the perception of roughness depends on absolute as well as relative pitch frequencies, so that even allegedly consonant intervals sound gruff when played in lower registers.

Good vibrations

There are more important reasons why the new work falls short of providing a full account of consonance and dissonance. For one thing, these terms have more than a single meaning. When Shapira Lots and Stone talk of ‘musical dissonance’, they actually mean what is known in music cognition as ‘sensory dissonance’ – the sensation of roughness. Musical dissonance is something else, and a matter of mere convention. As I say, the major third interval that now seems so pleasing to us was not recognized as consonant until the Renaissance, and only the octave was deemed consonant before the ninth century. And sensory dissonance is itself a poor guide to what people will judge to be pleasing. It's not clear, for example, that the fourth is actually perceived as more consonant than the major third [2]. And the music of Ravel and Debussy is full of ‘dissonant’ sixths, major sevenths and ninths that now seem rather lush and soothing.

But fundamentally, it isn’t clear that we really do have an intrinsic systematic preference for consonance. This is commonly regarded as uncontentious, but that’s far from true. It is certainly the case, as Shapira Lots and Stone say, that the musical systems of most cultures are based around the octave, and that intervals of a fifth are widespread too. But it’s hard to generalize beyond this. The slendro scale of Indonesian gamelan music, for instance, divides the octave into five roughly equal and somewhat variable pitch steps, with none of the resulting intervals corresponding to small-number frequency ratios.

Claims that infants prefer consonant intervals over dissonant ones [3] are complicated by the possibility of cultural conditioning. Babies can hear and respond to sound even in the womb, and they have a phenomenal capacity to assimilate patterns and regularities in their environment. A sceptical reading of experiments on infants and primates might acknowledge some evidence that both the octave and the fifth are privileged, but nothing more [4]. My guess is that the ‘neural synchrony’ argument, of which Shapira Lots and Stone offer the latest instalment, is on to something, but that harmony in Western music will turn out to lean more heavily on nurture than on nature.


1. Shapira Lots, I. and Stone, L. J. R. Soc. Interface doi:10.1098/rsif.2008.0143
2. Krumhansl, C. L. Cognitive Foundations of Musical Pitch (Oxford University Press, 1990).
3. Schellenberg, E. G. and Trehub, S. E. Psychol. Sci. 7, 272–277 (1996).
4. Patel, A. Music, Language, and the Brain (Oxford University Press, 2008).


Philip Dorrell said...

Philip, are you aware of ‘The Statistical Structure of Human Speech Sounds Predicts Musical Universals’? This paper suggests that our perception of consonance is a function of our exposure to the intervals between harmonics of speech vowel sounds.

This is consistent with my own theory of consonance which states that the purpose of consonance perception is to calibrate relative pitch perception, i.e. our ability to perceive a four-way relationship between pairs of notes that are separated by the same interval.

(Schwartz, Howe and Purves ignore this possibility, because they assume relative pitch perception before performing their statistical analysis, i.e. they analyse frequency distributions of intervals as frequency ratios, rather than intervals as pairs of frequencies.)

Unknown said...

When the slendro scale is conceptualized as 5-EDO, it does actually contain some simple freq. ratios. Tone I to tone II would be about 8:7, I to III 4:3 (our P4), I to IV 3:2 (our P5), and I to V 7:4.

These approximations have errors of only 3.7%, 7.5%, 7.5%, and 3.7% with respect to the difference between consecutive tones (the equivalent of 3.7 or 7.5 cents in 12-EDO).

We can compare this to 12-EDO, which offers very nice ratios in the P5/P4 and okay ratios in M2/m7 (9:8, 16:9)...but its other intervals have no small ratio approximations.

Arguably, slendro is inherently less dissonant than the Western scale!
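The percentages quoted in the comment above can be checked directly. This is a small sketch that takes the commenter's idealization of slendro as five exactly equal steps per octave (5-EDO) at face value; `edo_error` is a hypothetical helper name, and real gamelan tunings vary from this idealization.

```python
from math import log2

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

def edo_error(steps, divisions, ratio):
    """Mismatch between `steps` steps of an equal division of the octave
    and a target ratio: (error in cents, error as % of one step)."""
    step_size = 1200 / divisions
    error_cents = abs(steps * step_size - cents(ratio))
    return error_cents, 100 * error_cents / step_size

targets = [(1, (8, 7)), (2, (4, 3)), (3, (3, 2)), (4, (7, 4))]
for steps, (num, den) in targets:
    err, pct = edo_error(steps, 5, num / den)
    print(f"{steps} step(s) vs {num}:{den}: {err:.1f} cents, {pct:.1f}% of a step")
```

The output reproduces the 3.7% and 7.5% figures. The same function gives `edo_error(7, 12, 3 / 2)` as about 2 cents for the 12-EDO fifth, so in absolute terms the 5-EDO fourth and fifth (about 18 cents off) are approximated considerably less closely than their Western counterparts.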