Hearing Lecture Notes (5): Binaural hearing and
localization
Possible cues to localization of a sound:
(only applies to azimuth, i.e. localization in the horizontal plane)
The time/intensity trade is shown by titrating a phase difference in one direction against an intensity difference in the other direction.
Binaural cues are inherently ambiguous. The same differences can be produced
by a sound anywhere on the surface of an imaginary cone whose tip is in the ear.
For pure tones this ambiguity can only be resolved by head movements. But
for complex tones the ambiguity can be resolved by the effects of the
pinna.
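The ambiguity can be illustrated with a minimal geometric sketch. It assumes a deliberately crude model that is not part of these notes: the ears are two points with no head between them, the ear separation is 0.18 m, and the speed of sound is 343 m/s. Rotating a source around the line joining the ears leaves both path lengths unchanged, so the interaural time difference is the same in front, overhead and behind.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed value for air at room temperature
EAR_SEPARATION = 0.18    # m, assumed (hypothetical) distance between the ears

def interaural_time_difference(source):
    """Arrival time at the left ear minus arrival time at the right ear (seconds).

    The ears are modelled as two points on the x axis with no head between
    them, so only path length matters. Positive means the right ear leads.
    """
    left_ear = (-EAR_SEPARATION / 2, 0.0, 0.0)
    right_ear = (EAR_SEPARATION / 2, 0.0, 0.0)
    path_difference = math.dist(source, left_ear) - math.dist(source, right_ear)
    return path_difference / SPEED_OF_SOUND

def rotate_about_interaural_axis(source, angle_deg):
    """Rotate a source position around the x axis (the line through both ears)."""
    x, y, z = source
    a = math.radians(angle_deg)
    return (x, y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a))

# A source to the front right (x is rightward, y is forward, z is upward).
front_right = (1.0, 1.0, 0.0)

# The same source swung to the front (0), overhead (90) and behind (180).
itds_us = [
    interaural_time_difference(rotate_about_interaural_axis(front_right, angle)) * 1e6
    for angle in (0, 90, 180)
]
print(itds_us)  # three equal values: the timing cue cannot separate these positions
```

A real head also shadows and diffracts sound, so this sketch only shows why timing alone cannot separate these positions.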
As with pure tones, onset time cues are important for (particularly short)
complex tones. But the use of other timing cues is different since high
frequency complex tones can change in localization with an ongoing
timing difference. The next diagram shows the output of an auditory filter at
1600 Hz to a complex tone with a fundamental of 200 Hz. The 1400, 1600 and 1800
Hz components of the complex pass through the filter and add together to give
the complex wave shown in the diagram. The complex wave has an envelope
that repeats at 200 Hz. Phase differences would not change the localization of
any of those tones if they were heard individually, but we can localize sounds
by the relative timing of the envelopes in the two ears (provided that the
fundamental frequency (envelope frequency) is less than about 400 Hz).
Output of a 1600 Hz filter to a complex tone with a 200 Hz fundamental
frequency; right ear leading by 500 µs.
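The envelope argument can be checked numerically. The sketch below assumes three equal-amplitude components in cosine phase and a 48 kHz sampling grid (neither is stated in these notes). The sum of the 1400, 1600 and 1800 Hz components equals a 1600 Hz carrier multiplied by a slow factor, 1 + 2cos(2π·200t), so the envelope repeats every 5 ms, and a 500 µs lead at the right ear moves the envelope peak by the same 500 µs.

```python
import math

COMPONENTS_HZ = (1400, 1600, 1800)   # harmonics passed by the 1600 Hz filter
FUNDAMENTAL_HZ = 200
LEAD_S = 500e-6                      # right ear leads by 500 microseconds
SAMPLE_RATE = 48000                  # Hz, assumed for this sketch

def filter_output(t):
    """Sum of the three equal-amplitude components (cosine phase assumed)."""
    return sum(math.cos(2 * math.pi * f * t) for f in COMPONENTS_HZ)

def envelope(t):
    """Envelope at the left ear: magnitude of the slow factor 1 + 2cos(2*pi*200*t)."""
    return abs(1 + 2 * math.cos(2 * math.pi * FUNDAMENTAL_HZ * t))

def right_ear_envelope(t):
    """The right ear leads, so it receives every feature LEAD_S earlier."""
    return envelope(t + LEAD_S)

period_s = 1 / FUNDAMENTAL_HZ                         # 5 ms
samples_per_period = SAMPLE_RATE // FUNDAMENTAL_HZ    # 240 samples
one_period = [n / SAMPLE_RATE for n in range(samples_per_period)]

# Time of the envelope maximum within one period, for each ear.
left_peak_s = max(one_period, key=envelope)
right_peak_s = max(one_period, key=right_ear_envelope)

print(left_peak_s, right_peak_s)
```

The left-ear peak falls at 0 ms and the right-ear peak at 4.5 ms, that is 500 µs before the next left-ear peak at 5 ms. The fine structure at 1600 Hz is also shifted, but as the text notes, it is the envelope timing that can be used at this frequency.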

(mainly median plane, i.e. from front to back via overhead)
The pinna
reflects high frequency sound (wavelength less than the dimensions of the outer
ear) with echoes whose latency varies with direction (Batteau). These echoes
interfere with each other and with the direct sound to give spectral
peaks and notches. The frequencies of the peaks and notches vary with the
direction of the sound and are used to indicate direction in the median plane.
One of the cell types in the Dorsal Cochlear Nucleus (DCN) may be specialised
for detecting the spectral notches created by the pinna.
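How an echo's latency sets the notch frequencies can be sketched with the simplest possible case: direct sound plus a single echo. The echo gain of 0.5 and the delays of 60 and 80 µs are hypothetical illustration values, not measurements of a real pinna. The echo cancels the direct sound most strongly where it arrives an odd number of half-cycles late, at f = (2k+1)/(2·delay), and reinforces it at f = k/delay.

```python
import math

def comb_gain_db(freq_hz, delay_s, echo_gain=0.5):
    """Level in dB of direct sound plus one echo, relative to the direct sound alone."""
    phase = 2 * math.pi * freq_hz * delay_s
    real = 1 + echo_gain * math.cos(phase)
    imag = -echo_gain * math.sin(phase)
    return 20 * math.log10(math.hypot(real, imag))

def notch_frequencies(delay_s, max_hz=20000.0):
    """Frequencies up to max_hz at which the echo is exactly out of phase."""
    notches = []
    k = 0
    while (2 * k + 1) / (2 * delay_s) <= max_hz:
        notches.append((2 * k + 1) / (2 * delay_s))
        k += 1
    return notches

# Hypothetical echo delays standing in for two different source directions.
for delay_us in (60, 80):
    delay_s = delay_us * 1e-6
    print(delay_us, "us:", [round(f) for f in notch_frequencies(delay_s)])
```

A longer delay moves the notches to lower frequencies, which is the sense in which notch frequency can signal direction. A real pinna produces several reflections, so its spectrum is more complicated than this single-echo comb.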
There is considerable interest at the moment regarding ways to improve the
stereo imaging of audio reproductions. The following recordings (from the
Sennheiser laboratories) were made by placing a microphone in each ear of an
artificial head. This technique allows the modifications produced by the pinna
(or external ear) to be recorded. The pinnae are very important in helping us to
localise sounds in the median plane. When these recordings are presented in
stereo over headphones the sounds seem more "external" and realistic than
conventional recordings heard under these conditions.
Listen to two recordings made with a dummy head (doesn't work from PCs):
a plane flying overhead
people talking
2.3 Head movements
Head movements can resolve the ambiguity of front-back confusions. But of
course they don't work well for short sounds!
A distant sound will be quieter, have less energy at high frequencies and have relatively more reverberation than a close sound. Increasing the proportion of reverberant sound leads to greater apparent distance. Lowpass filtering also leads to greater apparent distance; high frequencies are absorbed more by water vapour in air (by up to about 3 dB/100 ft). If you know what a sound is, then you can use its actual timbre to tell its distance relative to another sound. If you don't know what a sound is, you can use the change in loudness as you walk towards it to judge its distance (provided it is not too far away).
Seen location easily dominates over heard location when the two are in conflict (the ventriloquism effect).
In an echoic environment the first wavefront to reach a listener indicates
the direction of the source. The brain suppresses directional information from
similar, immediately subsequent sounds (which are likely to be
echoes).
Since echoes come from different directions than the main sound,
they may be ignored more easily with two ears.
A number of psychoacoustic phenomena demonstrate that we are only binaurally sensitive to the phase of a pure tone if its frequency is less than about 1.5 kHz. These are:
Binaural beats: fluctuations in intensity and/or localisation are heard when
two different tones are presented, one to each ear (e.g. 500 + 504 Hz gives a
beat at 4 Hz). This only works for low frequency tones (< 1.5 kHz) because
phase-locking is necessary for the beats to be heard.
Compare binaural beats, where the sounds only come together in the brain
and so the beating arises neurally, with monaural beats, where the sounds mix
physically before entering the ear. Monaural beats are heard at all frequencies,
binaural beats only at low frequencies.
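The 4 Hz beat rate for the monaural case follows from simple arithmetic, sketched below for the 500 + 504 Hz example (equal amplitudes and sine phase are assumptions of the sketch). By the sum-to-product identity the physical mixture is a 502 Hz tone whose amplitude is 2|cos(2π·2t)|, which has a maximum every 0.25 s.

```python
import math

F1_HZ, F2_HZ = 500.0, 504.0

def monaural_mix(t):
    """Both tones added physically, as when they enter the same ear."""
    return math.sin(2 * math.pi * F1_HZ * t) + math.sin(2 * math.pi * F2_HZ * t)

def beat_envelope(t):
    """Slow amplitude envelope of the mixture, from the sum-to-product identity."""
    return abs(2 * math.cos(math.pi * (F2_HZ - F1_HZ) * t))

beat_rate_hz = abs(F2_HZ - F1_HZ)   # envelope maxima per second

# Half a second sampled at an assumed 8 kHz rate.
times = [n / 8000 for n in range(4000)]
stays_inside_envelope = all(abs(monaural_mix(t)) <= beat_envelope(t) + 1e-9 for t in times)

print(beat_rate_hz, stays_inside_envelope)
print(beat_envelope(0.0), beat_envelope(0.125), beat_envelope(0.25))
```

In the binaural case no such physical mixture exists at either ear: each ear receives a steady tone, so any beat must be computed neurally from the phase-locked responses.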
Binaural masking level difference (BMLD): when the same tone in noise is
played to both ears, the tone is harder to detect than when one ear either does
not get the tone, or has the tone at a different phase. The magnitude of the
effect declines above about 1 kHz, as phase-locking breaks down. The effect is
explained by Durlach's Equalization and Cancellation model.
Here is a demonstration of the
Binaural Masking Level Difference:
You will hear a 500-Hz signal that
lasts 100ms played against a background of white noise. The signal is played ten
times getting 10 dB quieter each time.
In the first sound, the
two ears each get identical signals (giving you a single image in the middle of
your head).
Count how many sounds you can hear.
You should hear about 4, unless the room is very
noisy.
In the second sound, the noise remains the same, but the
phase of the signal in one ear has been changed by 180 degrees. So when the
signal is positive in one ear, it is negative in the other.
Count how many
sounds you can hear.
You should hear more than with the first sound, and you
may find that, although the noise stays in the middle of the head, the signal
appears to come more from one side.
The Durlach model can explain this
result by assuming that the brain can subtract the signals at the two ears:
With the first sound, subtracting (or adding) the signals at the two ears is no help in separating the signal from the noise:
subtraction: (N+S) - (N+S) = 0; addition: (N+S) + (N+S) = 2(N+S)
But with the second sound, the noise is identical at the two ears so it cancels out, whereas the signal is +sine in one ear and -sine in the other, so subtraction gives double the signal amplitude:
subtraction: (N+S) - (N-S) = 2S
The process does not work perfectly, since there is internal noise, and some failure in phase-locking (which fails more at higher signal frequencies, so reducing the effect).
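The subtraction step can be sketched with synthetic waveforms. This is an idealised illustration of the cancellation arithmetic only, not an implementation of Durlach's full model: it has no internal noise and no equalization stage. The 8 kHz sampling rate, the signal amplitude of 0.1 and the Gaussian noise are assumptions of the sketch.

```python
import math
import random

random.seed(0)        # fixed seed so the sketch is repeatable
SAMPLE_RATE = 8000    # Hz, assumed
N_SAMPLES = 800       # 100 ms, the signal duration used in the demonstration

noise = [random.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]
signal = [0.1 * math.sin(2 * math.pi * 500 * n / SAMPLE_RATE) for n in range(N_SAMPLES)]

def rms(samples):
    """Root-mean-square level of a waveform."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def subtract_ears(left, right):
    """Idealised cancellation: sample-by-sample difference of the two ears."""
    return [l - r for l, r in zip(left, right)]

# First sound: noise and signal identical at the two ears.
left = [n + s for n, s in zip(noise, signal)]
right_same = list(left)

# Second sound: same noise, signal inverted (180 degrees) in one ear.
right_inverted = [n - s for n, s in zip(noise, signal)]

residual_same = subtract_ears(left, right_same)          # (N+S) - (N+S) = 0
residual_inverted = subtract_ears(left, right_inverted)  # (N+S) - (N-S) = 2S

print(rms(residual_same), rms(residual_inverted), 2 * rms(signal))
```

With identical ears the subtraction removes the signal along with the noise; with the inverted signal it leaves the signal at twice its amplitude and no noise at all, which is why a real listener, limited by internal noise, still gains a large but finite advantage.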
If noise is fed to one ear and the same noise to the other ear but with the
phase changed in a narrow band of frequencies, subjects hear a pitch sensation
at the frequency of the band. The pitch gets rapidly less clear above 1500 Hz.
(NB This can be explained by models of the BMLD effect if you think of the
phase-shifted band as the 'tone'.)
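A stimulus of this kind can be sketched by building the noise from sinusoids with random phases and shifting the phase of a few of them in one ear. Everything specific here is an assumption made for illustration: components every 20 Hz from 100 to 2000 Hz, a band of 580 to 620 Hz, and a flat 180 degree shift across the band. Subtracting the ears, as in the BMLD account, removes everything except the shifted band.

```python
import math
import random

random.seed(1)                 # fixed seed so the sketch is repeatable
SAMPLE_RATE = 8000             # Hz, assumed
N_SAMPLES = 400                # 50 ms
BAND_HZ = (580.0, 620.0)       # hypothetical narrow band around 600 Hz

# "Noise" built from components every 20 Hz between 100 Hz and 2000 Hz.
frequencies = [20.0 * k for k in range(5, 101)]
phases = [random.uniform(0.0, 2 * math.pi) for _ in frequencies]

def in_band(freq_hz):
    return BAND_HZ[0] <= freq_hz <= BAND_HZ[1]

def synthesize(shift_band, band_only=False):
    """Sum the components; optionally shift the band by pi, or keep only the band."""
    samples = []
    for n in range(N_SAMPLES):
        t = n / SAMPLE_RATE
        total = 0.0
        for f, p in zip(frequencies, phases):
            if band_only and not in_band(f):
                continue
            extra = math.pi if (shift_band and in_band(f)) else 0.0
            total += math.cos(2 * math.pi * f * t + p + extra)
        samples.append(total)
    return samples

left = synthesize(shift_band=False)    # plain noise
right = synthesize(shift_band=True)    # same noise, band phase-shifted
band = synthesize(shift_band=False, band_only=True)

# Interaural difference compared with twice the band on its own.
difference = [l - r for l, r in zip(left, right)]
largest_mismatch = max(abs(d - 2 * b) for d, b in zip(difference, band))
print(largest_mismatch)
```

Each ear alone receives noise with the same amplitude spectrum, so the band is only detectable by comparing the two ears, which is the sense in which the phase-shifted band plays the role of the 'tone'.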