Hearing
What is it about?
When you are trying to listen to someone in a crowded room, your brain has to perform a number of quite demanding tasks:
- The outputs of the early frequency analyses performed in the two inner ears must be "sorted", so that the frequency components arising from the target voice are grouped together
- The target voice must be tracked over time
- Decisions must be made on how to interpret "missing data", such as when part of the speech is masked by an extraneous noise
- Attentional mechanisms must select the target voice for further processing
- Linguistic analyses must be performed on the selected voice.
At the CBU we study all of these processes, with the focus on how they interact with each other. To do so, we combine traditional behavioural methods, such as those derived from psychophysics, with brain imaging using fMRI and EEG. Our research combines the expertise of scientists trained in auditory science, speech perception, neuro-imaging, and attention. A topic of particular interest concerns the ability of patients fitted with a cochlear implant to understand speech in noisy situations.
Click on the following links to learn more about some of our recent projects, and to hear some sound demos:
- Separating sounds using pitch differences
- Auditory streaming and attention
- Listening through a cochlear implant
- The continuity illusion
- Software
Separating sounds using pitch differences
One of the most important cues that we use to separate competing sounds is the difference in pitch between talkers. For example, in the following sentence one can listen either to the (low-pitched) male talker or to the (high-pitched) female talker. [click to listen]
Note that you can perform this task even though the recording is in mono. Hence, although it helps if the two talkers are in different locations, this is not necessary - fortunately for those of us who listen to the radio or TV without the benefit of a hi-fi system.
The sentences in the above example tap into two processes.
- Because the two voices do sometimes overlap in time, the auditory system must decide which frequency components of the input at any given time belong to the same voice. This can be illustrated in another example, where we manipulate a perfectly harmonic complex by mistuning one of its components, causing it to be heard out as a separate "whistle".
- The target voice must be pulled out into a separate "auditory stream".
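The mistuned-harmonic demonstration can be sketched in a few lines of Python. This is only an illustration, not the code used to make the demo: the function name `harmonic_complex`, the 200 Hz fundamental, the choice of the 4th harmonic and the 8% mistuning are all assumptions made for the example.

```python
import math

def harmonic_complex(f0=200.0, n_harmonics=10, mistuned=None, mistuning=0.0,
                     duration=0.5, sample_rate=16000):
    """Sum of equal-amplitude sine harmonics of f0.

    If `mistuned` is a harmonic number, that component's frequency is shifted
    by the fraction `mistuning` (0.08 means 8% above its harmonic value).
    Returns the component frequencies and the waveform samples.
    """
    freqs = []
    for h in range(1, n_harmonics + 1):
        f = h * f0
        if h == mistuned:
            f *= 1.0 + mistuning
        freqs.append(f)
    n_samples = round(duration * sample_rate)
    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        # Divide by the number of components so the sum stays within [-1, 1].
        signal.append(sum(math.sin(2 * math.pi * f * t) for f in freqs) / n_harmonics)
    return freqs, signal

# Perfectly harmonic complex: all components are integer multiples of f0.
freqs_in_tune, tone_in_tune = harmonic_complex()

# Same complex with the 4th harmonic shifted upwards (assumed values).
freqs_mistuned, tone_mistuned = harmonic_complex(mistuned=4, mistuning=0.08)
```

In the second signal, nine components still share a common fundamental while one does not, which is the kind of inconsistency that can make a component stand out from the rest.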
Our research in this area consists of three themes:
- We study basic processes of pitch perception. For example, we have shown that the mechanism responsible for encoding the pitch of high-numbered harmonics is "sluggish", being very poor at tracking even quite slow pitch changes [1-3]. We have also argued that this mechanism is fundamentally different from that used for lower-numbered harmonics.
- We are using fMRI and behavioural measures to examine whether different neural pathways encode pitch and location.
- We are studying how cochlear implant users perceive pitch, and how they might use pitch differences to segregate competing voices.
Auditory streaming and attention
One relatively simple situation occurs when the sounds to be separated are interleaved in time. For example, in polyphonic music, two different melodies can be played concurrently, either by interleaving notes from two different instruments, or even from the same instrument playing in two different pitch ranges. Under such circumstances, the auditory system breaks the two melodies into separate "streams". Auditory streaming is also useful when understanding two competing voices, as, fortunately, the words spoken by two competing talkers do not completely overlap in time.
We use a rather simple stimulus developed by van Noorden, consisting of a repeating triplet of tones in the pattern "A-B-A", where "A" and "B" are tones of different frequencies.
When all the tones have similar frequencies, you hear a single stream of sound with a galloping rhythm.

But when the "A" and "B" tones are far apart in frequency, you hear two streams and the gallop disappears.
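A triplet sequence of this kind is easy to generate. The sketch below assumes the commonly used "A-B-A-silence" form of the pattern; the function name `aba_triplets`, the 500 Hz "A" tone, the 100 ms tone duration and the two frequency separations are illustrative assumptions rather than the values used in our experiments.

```python
import math

def aba_triplets(freq_a=500.0, semitone_gap=2, n_triplets=10,
                 tone_dur=0.1, sample_rate=16000):
    """Build a repeating A-B-A-(silence) tone sequence.

    `semitone_gap` sets how far the B tone lies above the A tone.
    Returns the B frequency in Hz and the waveform samples.
    """
    freq_b = freq_a * 2 ** (semitone_gap / 12)
    n_tone = round(tone_dur * sample_rate)
    signal = []
    for _ in range(n_triplets):
        # None marks the silent slot that completes each triplet.
        for freq in (freq_a, freq_b, freq_a, None):
            for n in range(n_tone):
                if freq is None:
                    signal.append(0.0)
                else:
                    # Hann window on each tone to avoid clicks at onset and offset.
                    ramp = 0.5 * (1 - math.cos(2 * math.pi * n / (n_tone - 1)))
                    signal.append(ramp * math.sin(2 * math.pi * freq * n / sample_rate))
    return freq_b, signal

# Small separation: the kind of sequence that tends to be heard as one galloping stream.
freq_b_small, gallop = aba_triplets(semitone_gap=1)

# Large separation: the kind of sequence that tends to split into two streams.
freq_b_large, two_streams = aba_triplets(semitone_gap=10)
```

Only `semitone_gap` differs between the two sequences, so any change in what is heard can be attributed to the frequency separation alone.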
Although researchers agree on what differences can cause sounds to stream apart, there is fierce debate about where in the brain this streaming occurs. We have shown that streaming depends on attention [4], and must therefore involve cortical mechanisms. We are currently using EEG to determine how early in neural processing the effect occurs.
Listening through a cochlear implant
The cochlear implant is the world's most successful sensory prosthesis. The details of the devices differ somewhat between manufacturers, but all modern implants share a number of common features, which are shown schematically in the next figure:

Top left: The components of a cochlear implant. In modern devices, the speech processor is housed in a smaller casing worn behind the ear. Top right: The electrode array inserted in the inner ear. The device shown is manufactured by the Cochlear Corporation, but the same basic components are found in most other devices.
- Sound is picked up by a microphone worn behind the ear, and the analogue waveform passed to an external speech processor.
- The processor determines the pattern of stimulation to be applied to the electrodes in the inner ear.
- The output of the speech processor modulates a radio-frequency carrier, which is transmitted across the skin using a coil, and decoded by a receiver-stimulator implanted under the skin.
- The receiver-stimulator sends electric signals to each of several electrodes implanted in the cochlea. The electrodes are arranged along the length of the basilar membrane. The higher frequency bands of the input waveform control stimulation of electrodes near the base of the cochlea, which would normally respond most to high-frequency sounds. The lower-frequency bands are conveyed by electrodes nearer the apex, which normally responds to lower-frequency sounds.
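The relation between place along the basilar membrane and the frequency it normally responds to can be approximated numerically. The sketch below uses Greenwood's frequency-position function with parameter values commonly quoted for the human cochlea; treating these values as appropriate here is an assumption, and the electrode positions are purely hypothetical rather than taken from any real device.

```python
def greenwood_frequency(position, scale=165.4, slope=2.1, offset=0.88):
    """Approximate characteristic frequency (Hz) at a place on the basilar membrane.

    `position` is the proportion of the membrane's length measured from the
    apex (0.0) to the base (1.0). The default parameters are commonly quoted
    human values and are an assumption of this sketch.
    """
    return scale * (10 ** (slope * position) - offset)

# Hypothetical electrode positions, ordered from the most apical to the most basal.
electrode_positions = [0.45, 0.55, 0.65, 0.75, 0.85]

for pos in electrode_positions:
    print(f"position {pos:.2f} -> about {greenwood_frequency(pos):.0f} Hz")
```

The function rises steadily from apex to base, which matches the arrangement described above: electrodes near the base carry the higher frequency bands, and those nearer the apex carry the lower ones.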
Cochlear implant users face at least two problems when trying to listen to one voice in the presence of another:
One problem is that existing speech processors do a poor job of encoding the pitch of sounds. This can be illustrated by listening to a simulation of speech as heard through a cochlear implant. The simulation mimics the processing carried out by one popular type of speech processor, and presents the output of the processor in acoustic form. First, listen to this single sentence, spoken normally [click to listen], and then passed through the simulator [click to listen]. Although the simulation sounds unnatural, one can learn to pick out what is being said. In normal hearing, it is possible to use pitch differences to listen to one voice in a mixture, such as the male voice in this example [click to listen]. However, when the mixture is processed using the simulation [click to listen], the task becomes almost impossible. We are studying the basic processes of pitch perception by cochlear implant users, and are investigating ways in which the separation of competing sounds could be improved in this form of hearing.
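One widely used way of simulating implant processing acoustically is a noise vocoder, and a minimal version is sketched below. It is not necessarily the simulator used for the demos above: the function names, the six log-spaced channels between 100 Hz and 5 kHz, the simple second-order filters and the 50 Hz envelope cutoff are all assumptions made for illustration.

```python
import math
import random

def bandpass(signal, low, high, sample_rate):
    """Second-order (biquad) band-pass filter passing roughly `low`..`high` Hz."""
    centre = math.sqrt(low * high)
    q = centre / (high - low)
    w0 = 2 * math.pi * centre / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope(signal, cutoff, sample_rate):
    """Slowly varying envelope: full-wave rectification then one-pole low-pass."""
    coeff = math.exp(-2 * math.pi * cutoff / sample_rate)
    env, state = [], 0.0
    for x in signal:
        state = (1 - coeff) * abs(x) + coeff * state
        env.append(state)
    return env

def noise_vocoder(signal, sample_rate=16000, n_channels=6,
                  low=100.0, high=5000.0, env_cutoff=50.0, seed=0):
    """Replace the fine structure in each band with envelope-modulated noise."""
    rng = random.Random(seed)
    # Logarithmically spaced band edges (an assumed, simplified channel layout).
    edges = [low * (high / low) ** (i / n_channels) for i in range(n_channels + 1)]
    out = [0.0] * len(signal)
    for lo_edge, hi_edge in zip(edges[:-1], edges[1:]):
        band = bandpass(signal, lo_edge, hi_edge, sample_rate)
        env = envelope(band, env_cutoff, sample_rate)
        noise = [rng.uniform(-1, 1) for _ in signal]
        carrier = bandpass(noise, lo_edge, hi_edge, sample_rate)
        for i, (e, c) in enumerate(zip(env, carrier)):
            out[i] += e * c
    return out

# A synthetic voiced sound (harmonics of 120 Hz) stands in for recorded speech.
sample_rate = 16000
voiced = [sum(math.sin(2 * math.pi * 120 * h * n / sample_rate) / h
              for h in range(1, 20)) for n in range(4000)]
vocoded = noise_vocoder(voiced, sample_rate)
```

In this sketch each channel keeps only the slow envelope of its band and discards the harmonic fine structure, which gives some intuition for why pitch differences between voices are hard to hear in processed speech.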
A second problem is that most users only receive one implant. Normally hearing listeners can take advantage of differences in location between competing talkers. One way they can do this is by attending to the ear which receives the better signal-to-noise ratio (e.g. if the attended talker is on the left, one can attend to the left ear). Recent trials in which patients receive two implants show that they can take advantage of this cue. However, the normal auditory system also uses another cue, based on differences in the time of arrival of sounds at the two ears. The neural mechanism that processes this cue depends crucially on input coming from "matched" sets of fibres, which innervate the same part of the basilar membrane in each ear. In contrast, bilateral implants are usually fitted independently to each ear, so the same frequency region of speech may be transmitted by electrodes situated at quite different parts of the basilar membrane. We are investigating new ways of matching the electrodes in the two ears, and also of improving the coding of fine timing information by cochlear implant speech processors.
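The time-of-arrival cue can be made concrete with a small signal-processing example. The sketch below estimates the interaural delay by cross-correlating the signals at the two ears; it is a generic engineering approach, not a model of the neural mechanism, and the function name `estimate_itd`, the 48 kHz sampling rate and the 0.5 ms delay are assumptions chosen for the example.

```python
import random

def estimate_itd(left, right, sample_rate, max_lag):
    """Estimate the interaural time difference in seconds by cross-correlation.

    Searches lags from -max_lag to +max_lag samples and returns the one at
    which `right` best matches `left`. A positive value means the sound
    reached the left ear first.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate

# Simulate a noise burst that reaches the right ear 24 samples (0.5 ms) late.
rng = random.Random(1)
sample_rate = 48000
delay = 24
left = [rng.uniform(-1, 1) for _ in range(2000)]
right = [0.0] * delay + left[:-delay]

itd = estimate_itd(left, right, sample_rate, max_lag=40)
print(f"estimated ITD: {itd * 1000:.2f} ms")
```

The estimate works here because the two inputs carry the same signal, differing only in delay. This loosely parallels the point above that the comparison depends on "matched" inputs from the two ears.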
The continuity illusion
When a sound is turned off briefly and then on again, it can be perceived as continuous when the silent gap is filled by an "inducing" sound, such as a burst of noise. This "continuity illusion" only occurs if the frequency content and timing of the noise are such that it could plausibly have masked the sound had it remained uninterrupted. It is important for the perception of stimuli such as speech in noisy environments; for example, if a sound such as /i/ is masked mid-way through by a brief sound, it would not make sense to interpret it as two separate phonemes separated by a gap.
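A basic continuity-illusion stimulus can be sketched as a tone with a gap that is either left silent or filled with a more intense noise burst. The function name `interrupted_tone`, the 1 kHz tone, the durations and the noise level are illustrative assumptions; broadband noise is used here simply because it overlaps the tone in frequency.

```python
import math
import random

def interrupted_tone(freq=1000.0, tone_dur=0.3, gap_dur=0.1, fill_gap=True,
                     noise_gain=3.0, sample_rate=16000, seed=0):
    """Tone, then a gap, then the tone again.

    With `fill_gap=True` the gap contains broadband noise whose peak level is
    `noise_gain` times that of the tone; otherwise the gap is silent.
    """
    rng = random.Random(seed)
    n_tone = round(tone_dur * sample_rate)
    n_gap = round(gap_dur * sample_rate)
    signal = []
    for n in range(2 * n_tone + n_gap):
        in_gap = n_tone <= n < n_tone + n_gap
        if not in_gap:
            signal.append(math.sin(2 * math.pi * freq * n / sample_rate))
        elif fill_gap:
            signal.append(noise_gain * rng.uniform(-1, 1))
        else:
            signal.append(0.0)
    return signal

# Silent gap: the kind of stimulus in which the interruption is clearly heard.
with_silence = interrupted_tone(fill_gap=False)

# Noise-filled gap: the kind of stimulus in which the tone may seem to continue.
with_noise = interrupted_tone(fill_gap=True)
```

The two stimuli are identical outside the gap, so any difference in perceived continuity comes from the inducing noise alone.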
One way in which we are using the continuity illusion is to shed light on other auditory processes. For example, we have shown, using two-formant vowels [5], that when the formants alternate in time, vowel identification is very poor.

Left: schematic spectrogram of a two-formant vowel, approximating /i/ (as in "feet"). The vowel is pulsed on and off.
Right: The same vowel, but with the two formants alternating in time
However, when filtered noise is inserted into the silent gaps, the stimulus sounds continuous and identification is easier. This means that it is not necessary for formants to be physically simultaneous in order for them to be integrated into a vowel percept. It also suggests that the continuity illusion observed here occurs at or before the stage of neural processing responsible for vowel identification. This prediction was recently confirmed by our finding that the middle temporal gyrus, which responds preferentially to speech sounds, is also preferentially activated by our "illusory" vowel stimuli [6].
In addition, we are investigating the neurophysiological basis for the illusion. We have shown [6] that the illusion can be reflected in an EEG wave known as the mismatch negativity (MMN). Because the MMN occurs with a latency of less than 200 ms, and originates from auditory cortex, this demonstrates that the illusion occurs at a fairly early stage of processing. In our study, subjects were instructed to ignore the tones and to watch a silent video. Although this does not show that the continuity illusion is completely unaffected by attention, we can conclude that it occurs when subjects are not focussing on the sounds.
Basilar Membrane. A membrane in the inner ear that vibrates in response to sound. The basal end is thin and stiff and vibrates most to high-frequency sounds. The apex is wide and less stiff, and vibrates most to low-frequency sounds.
References
[1] Micheyl, C. and R.P. Carlyon, Effects of temporal fringes on fundamental-frequency discrimination. J. Acoust. Soc. Am., 1998. 104: p. 3006-3018.
[2] Carlyon, R.P., B.C.J. Moore, and C. Micheyl, The effect of modulation rate on the detection of frequency modulation and mistuning of complex tones. J. Acoust. Soc. Am., 2000. 108: p. 304-315.
[3] Plack, C.J. and R.P. Carlyon, Differences in frequency modulation detection and fundamental frequency discrimination between complex tones consisting of resolved and unresolved harmonics. J. Acoust. Soc. Am., 1995. 98: p. 1355-1364.
[4] Carlyon, R.P., et al., Effects of attention and unilateral neglect on auditory stream segregation. Journal of Experimental Psychology: Human Perception and Performance, 2001. 27: p. 115-127.
[5] Carlyon, R.P., et al., The Continuity Illusion and Vowel Identification. Acta Acustica united with Acustica, 2002. 88: p. 408-415.
[6] Micheyl, C., et al., Neurophysiological correlates of a perceptual illusion: A Mismatch Negativity Study. J. Cog. Neurosci., 2003. 15: p. 747-758.

