Differential representation of speech sounds in the human cerebral hemispheres



Various methods in auditory neuroscience have been used to gain knowledge about the structure and function of the human auditory cortical system. Regardless of method, hemispheric differences are evident in the normal processing of speech sounds. This review article, augmented by the authors' own work, provides evidence that asymmetries exist in both cortical and subcortical structures of the human auditory system. Asymmetries are affected by stimulus type; for example, hemispheric activation patterns have been shown to change from left to right cortex as stimuli change from speech to nonspeech. In addition, the presence of noise has differential effects on the contribution of the two hemispheres. Modifications of typical asymmetric cortical patterns occur when pathology is present, as in hearing loss or tinnitus. We show that in response to speech sounds, individuals with unilateral hearing loss lose the normal asymmetric pattern due to both a decrease in contralateral hemispheric activity and an increase in ipsilateral hemispheric activity. These studies demonstrate the utility of modern neuroimaging techniques in functional investigations of the human auditory system. Neuroimaging techniques may provide additional insight as to how the cortical auditory pathways change with experience, including sound deprivation (e.g., hearing loss) and sound experience (e.g., training). Such investigations may explain why some populations, such as children with learning problems and the elderly, appear to be more vulnerable to changes in hemispheric symmetry. Anat Rec Part A, 2006. © 2006 Wiley-Liss, Inc.

Sound can be characterized by three physical parameters: frequency, starting phase, and amplitude. Thus, acoustic stimuli provide spectral, temporal, and intensity cues that can be used for communication (e.g., speech), safety (e.g., a car horn), and pleasure (e.g., listening to music). In the auditory cortex, these cues are represented by cortical neural activity and ultimately linked to perceptual performance (Phillips,1993). Although the perception of speech sounds can be assessed with behavioral measures in some populations, much less is known about the neurophysiology underlying speech encoding in the central auditory system. It is generally accepted that the primary auditory cortex lies deep within the lateral Sylvian fissure on the transverse gyrus of Heschl (Brodmann's area 41) (Brodmann,1909) and that it is involved with speech processing. The secondary auditory cortex, or association cortex, lies in surrounding anatomic regions of the superior temporal gyrus (Brodmann's areas 21, 22, 42, and 52) (Brodmann,1909; Celesia,1976; Talairach and Tournoux,1988) and is also implicated in the processing of sound. Speech perception occurs through an anatomical network that consists of the temporal lobe, including planum polare, transverse temporal gyrus, and planum temporale (Fig. 1). However, each region's precise contribution to the process is not completely understood.

Figure 1.

Superior aspect of left temporal gyrus: planum polare (green), anterior and posterior transverse temporal gyri (purple), planum temporale (red). Adapted with permission from Duvernoy (1999).

Input to the central auditory system comes from both ears. At all levels central to where the cochlear nerve enters the brainstem, speech cues are represented bilaterally. The superior olivary complex plays a major role in binaural hearing and is the first site at which information arriving from the cochlear nuclei is combined (Brugge,1992). From this point onward, there are crossed and uncrossed fibers that extend between the nuclei of the superior olivary complex, trapezoid body, lateral lemnisci, and inferior colliculi that contribute to the redistribution of auditory information (Clark,1975).

The complexity of the parallel and crossed fiber tracts, ascending and descending pathways, multiple subcortical nuclei, and primary and secondary auditory cortices makes following the central representation of auditory signals difficult, particularly in humans. Much of our knowledge about structure and function has been inferred from studies of the central auditory system in animals, which bears some similarities to the central auditory pathway in humans. Recently, results from neuroimaging studies have provided additional insight into the organization and function of the human auditory cortex and its relation to the processing of speech stimuli. This review will highlight some of the relationships between regional brain activity and auditory function illustrated with various methods in auditory neuroscience, including recent neuroimaging techniques. In particular, this review will address right and left hemispheric activation patterns and the structural and functional implications of cortical asymmetry, bearing in mind the binaural nature of the normal auditory system.


Our present knowledge of the structure and function of the central auditory system in humans stems from a combination of studies that employ behavioral, anatomical (e.g., cytoarchitecture), electrophysiologic, and neuroimaging measures. Early researchers measured brain activity through direct recording of electrical events from the auditory nerve (Wever and Bray,1930) and the brain (Woolsey and Walzl,1942). Single neuron recordings using microelectrode techniques (Galambos and Davis,1943) broadened the understanding of neuronal encoding mechanisms, but were limited to certain experimental conditions, primarily in animal models. Subsequent studies using noninvasive electrophysiologic recordings (e.g., brainstem to cortical responses) via surface electrodes were applied in humans. In the past decade, however, there have been further advancements in methodologies to study the cortical auditory system in vivo in the human brain using multielectrode electrophysiology and neuroimaging techniques (Kwong et al.,1992; Ogawa et al.,1992). Although it is beyond the scope of this article to describe in detail the methods used in auditory neuroscience, a brief summary is provided to facilitate understanding of the research studies discussed. In addition, the subset of studies reviewed includes those with an emphasis on the acoustic features of the speech stimulus rather than linguistic aspects. Finally, it is essential to bear in mind that neuroimaging techniques, such as functional magnetic resonance imaging (fMRI), are still evolving. As these techniques are refined, each will impact the design of research questions in the field of auditory neuroscience in humans and the eventual application of findings to clinical populations.


Electrophysiologic methods, such as electroencephalography (EEG, i.e., recording electrical potentials from the scalp) and evoked potentials (i.e., electrical potentials evoked by an external stimulus), allow the evaluation of responses that arise from a population of individual neurons. The ability to record these evoked responses is inherently dependent on neural synchrony and is influenced greatly by stimulus parameters (Picton et al.,1974). Magnetoencephalography (MEG) measures the magnetic field activity that is associated with intracellular ionic current flow in the brain (Hari,1993; Lounasmaa et al.,1993). An advantage of EEG and MEG is the excellent temporal resolution, which is on the order of milliseconds. A limitation of surface-recorded electrophysiologic responses, however, is the inability to differentiate specific anatomical structures. This makes it difficult to know the exact generator sites or the extent of cortical areas involved. In addition, magnetic field recordings are insensitive to currents that are oriented in the radial direction and therefore reflect only tangentially directed currents. Although modeling methods (e.g., dipole source reconstruction) can be used to analyze and derive the underlying sources of surface electrode recordings, questions remain as to the exact modeling assumptions to be used in these analyses (Scherg and Picton,1991; Pascual-Marqui,1999).

PET and fMRI

Neuroimaging techniques are based on the principle that neuronal activity requires energy. Increased energy demands are reflected in increased blood flow and metabolism, and the resulting changes can be visualized in response to a task-dependent activity. With the introduction of positron emission tomography (PET), it became possible to conduct in vivo experiments in human subjects (Frackowiak et al.,1980; Phelps et al.,1981; Raichle et al.,1983). PET techniques rely on the distribution of positron-emitting radioactive isotopes that enter, and therefore trace, physiological processes. Changes in blood flow and glucose metabolism associated with the delivery of auditory stimuli can be reconstructed based on the decay of radioisotopes. PET can be conducted in quiet test environments and is not susceptible to the scanner noise that occurs in the MRI environment. A disadvantage of PET is the need for radioactive injections, which are considered invasive and therefore not acceptable for certain populations (e.g., children).

Functional magnetic resonance imaging (fMRI) detects increased blood oxygenation levels in response to the stimulus or test parameter. The blood oxygenation level-dependent (BOLD) response was originally described in rat experiments by Ogawa et al. (1990) and subsequently used in human studies of the brain (Kwong et al.,1992; Ogawa et al.,1992). fMRI has become more commonly used than PET for experimental purposes because of its widespread availability and because it does not require exposure to ionizing radiation or radiopharmaceuticals. In studies that employ fMRI with auditory stimuli, one complicating factor is the intensity of the scanner noise that occurs as the magnet changes gradient fields (Bandettini et al.,1998; Ulmer et al.,1998). The noise reaching the listener can be reduced by the choice of earphones and, more importantly, by synchronizing the stimuli with the MR pulse sequences so that the auditory stimuli are presented during a quiet period (Edmister et al.,1999; Hall et al.,1999).

Both PET and fMRI responses are considered indirect measures of brain activity. The two techniques provide good spatial resolution, with the best spatial resolution obtained with fMRI (approximately 3–5, 2, and 1 mm for 1.5, 3.0, and 7.0 Tesla, respectively) compared to the resolution with PET (approximately 6–10 mm). However, the temporal resolution is poor with both methods because of the slow time course of the hemodynamic response itself, which takes seconds rather than milliseconds. Anatomical images can be easily obtained and combined with neuroimaging data to provide detail of the auditory cortex in each subject. In this way, it is possible to localize maximally activated regions and to determine how these patterns correlate with anatomy and function. Table 1 provides a summary of the temporal and spatial resolution characteristics of PET, fMRI, MEG, and EEG.

Table 1. Methods of study in auditory neuroscience
Method | Response mechanism | Temporal resolution | Spatial resolution
EEG | Electrical potentials measured by multiple surface electrodes | < 1 ms | 8–14 mm (depending on number of channels)
MEG | Electrical currents measured by superconducting coils | < 1 ms | 8–14 mm (depending on number of channels)
PET | Cerebral metabolic changes measured by radioactive tracers | > 2 min | 6–10 mm
fMRI | Cerebral metabolic changes measured by changes in magnetic properties depending on blood oxygen levels | 0.25–3 sec | 3–5 mm (1.5T); 2 mm (3T); < 1 mm (7T)

Improved knowledge of auditory cortex organization and function in humans can be achieved by combining electrophysiologic, neuroimaging, and behavioral responses obtained from the same experimental subject. This would allow for the acquisition and analysis of temporal and spatial information from the same set of neural responses. Such simultaneous recordings have been made in the visual system (Bonmasser et al.,1999) and only recently in the auditory system (Liebenthal et al.,2003; Scarff et al.,2004) in humans. In order to combine temporal resolution with spatial activation, new methods will be required to merge data that rely on different experimental designs, data analyses, and physiologic mechanisms (e.g., instantaneous neural activity but poor localizing power compared to slow hemodynamic changes with high spatial power) (Wagner and Fuchs,2001).

In the following sections, data collected by the authors using fMRI methods with either normal-hearing subjects or subjects with unilateral hearing loss are presented. In these experiments, auditory stimuli were tailored for each study and delivered with Avotech electrostatic headphones. A sparse sampling paradigm (Hall et al.,1999) was employed to record cortical responses with a 9.3-sec quiet period between MR acquisitions and a 30-sec on-off paradigm for a total duration of 5.5 min. Data were collected on a 1.5 Tesla GE Signa CV MR scanner using a standard quadrature head coil. Ten slices of T1-weighted anatomical images (TE = 4.2 msec, TR = 265 msec, flip angle = 80°) were acquired as a template for the functional MR images in the axial direction with a thickness of 5 mm (FOV = 240 × 240 mm, resolution 256 × 256 points), including the primary auditory cortex and surrounding area. Functional MRI data were collected as gradient echo EPI sequences (TE = 40 msec, TR = 700 msec, flip angle = 90°) collecting 34 volumes (resolution 64 × 64 points). At the end of the scanning session, a high-resolution anatomic T1-weighted SPGR image was recorded (TE = 3 msec, TR = 25 msec, flip angle = 30°) with 124 slices in the sagittal direction with a thickness of 1.3 mm (FOV = 240 × 240 mm, resolution 256 × 256 points) covering the whole head. Data analysis was completed using AFNI (Cox,1996) and included motion correction, cross-correlation analysis, and spatial clustering algorithms. Ideal cluster size, minimum voxel distance, and correlation threshold (correlation coefficient > 0.40) were determined by Monte-Carlo analysis for a given region of interest (ROI; specifically voxels within transverse temporal gyrus and planum temporale for the left and right hemisphere), size, and spatial correlation.
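The cross-correlation step can be illustrated with a minimal Python sketch. It is not the AFNI implementation. It assumes one volume every 10 sec (the 9.3-sec quiet period plus a 0.7-sec acquisition), a simple 30-sec on-off boxcar reference with no hemodynamic delay, and synthetic voxel time series invented for illustration. All function and variable names are hypothetical; only the 34 volumes, the 30-sec blocks, and the 0.40 threshold come from the text.

```python
import math

N_VOLUMES = 34            # from the text
BLOCK_S = 30.0            # from the text: 30-sec on-off paradigm
R_THRESHOLD = 0.40        # from the text: correlation coefficient > 0.40
VOLUME_SPACING_S = 10.0   # assumption: 9.3-sec quiet period + 0.7-sec acquisition


def boxcar_reference(n_volumes, spacing_s, block_s):
    """Return 0 for 'off' blocks and 1 for 'on' blocks (assumes 'off' comes first)."""
    return [int((i * spacing_s) // block_s) % 2 for i in range(n_volumes)]


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def is_active(voxel_ts, reference, threshold=R_THRESHOLD):
    """A voxel counts as active when its correlation with the reference exceeds the threshold."""
    return pearson_r(voxel_ts, reference) > threshold


reference = boxcar_reference(N_VOLUMES, VOLUME_SPACING_S, BLOCK_S)

# Synthetic voxels (invented for illustration, not measured data).
# The sine term is a deterministic stand-in for noise.
responsive = [100.0 + 2.0 * ref + 0.5 * math.sin(1.7 * i)
              for i, ref in enumerate(reference)]
flat = [100.0] * N_VOLUMES

print("responsive voxel active:", is_active(responsive, reference))
print("flat voxel active:", is_active(flat, reference))
```

In practice the reference waveform is usually shifted or smoothed to account for the delay of the hemodynamic response, and suprathreshold voxels are then passed to the spatial clustering step; both are omitted here for brevity.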


Acoustic stimuli that are used to evoke either neural or vascular responses can be described in terms of frequency, level or intensity, and time. There can be complex interactions among the stimulus factors themselves; for example, the duration of a stimulus is closely related to its frequency and presentation rate. Changes in stimulus parameters are known to influence electrophysiologic responses. For example, an increase in stimulus intensity typically results in an increase in the magnitude of the electrophysiologic response. Responses of the central auditory system also are affected by the listening condition, such as whether the stimuli are presented monaurally or binaurally. In addition, properties of the stimulus can interact with subject characteristics, such as age and auditory pathology. The effects of acoustic stimulus properties on auditory cortical activation using neuroimaging techniques have not been thoroughly explored. Likewise, the optimal properties of acoustic stimuli to evoke spatially specific responses are not fully known.

Stimulus Frequency

The auditory system, from the basilar membrane of the cochlea to the auditory cortex, is tonotopically organized. Neurons that are most sensitive to similar frequencies tend to be located near each other so that there is an orderly spatial representation of neurons with varying best frequencies throughout the auditory pathways. Human hemodynamic responses to binaurally presented 500 and 4000 Hz tonal stimuli at 100 dB SPL showed that high-frequency stimuli were more effective in activating centers in more frontal and medial locations within the temporal lobe than low-frequency stimuli (Bilecen et al.,1998). For both high- and low-frequency stimulation, the activation was also reported to be greater in the left compared to the right hemisphere. A comparison of responses to 1000 and 4000 Hz tonal stimuli presented to the right ear of subjects resulted in greater activation laterally in the transverse temporal gyrus for the 1000 Hz stimulus, whereas the 4000 Hz stimulus activated the medial location (Strainer et al.,1997). These findings are generally consistent with earlier reports in humans, such as those by Pantev et al. (1995), who used MEG to compare responses to 500, 1000, and 4000 Hz at 60 dB nHL (normative hearing level), and both Lauter et al. (1985) and Lockwood et al. (1999b), who used PET to compare responses to 500 and 4000 Hz tone bursts. In Figure 2, a comparison of fMRI activation in response to low- and high-frequency stimuli in one of our normal-hearing subjects is shown.

Figure 2.

Each row of axial slices displays fMRI activation for a pure tone (1.5% warble, 1.5 Hz pulsed) at 80 dB HL to the left ear at two locations through the auditory cortex overlaid onto a T2-weighted MR image. Top row shows responses for a 500 Hz stimulus and bottom row for a 4000 Hz stimulus. The center of activation for 500 Hz is more lateral, whereas the center of activation for 4000 Hz is more medial along the transverse temporal gyrus. Axial slices shown in radiological orientation (right hemisphere to the left, left hemisphere to the right). The green cross-hairs approximate the center of mass for the response on the right hemisphere.

A microelectrode placed in the right side of Heschl's gyrus of an epileptic patient provided a unique opportunity to record single unit data from a human using tones of 24 different frequencies and an intensity level of 75 dB SPL presented monaurally in the left ear (Howard et al.,1996). The best-frequency responses of cortical units indicated that responses to higher best frequencies (i.e., 3360 Hz) were located more posterior-medial and responses to lower best frequencies (i.e., 1489 Hz) more anterior-lateral. (Although these auditory cortex units responded in a frequency-specific manner, complex temporal processing was occurring in parallel.)

A neuroimaging study in humans by Talavage et al. (2000) has identified multiple frequency response regions in the auditory cortex. Six subjects were stimulated binaurally with lower (i.e., less than 660 Hz) and higher (i.e., greater than 2490 Hz) frequency pairs of narrowband stimuli. Each pair was separated by two octaves so that the spatial representations of the two members of the pair were approximately 6 mm apart and therefore differentiable with neuroimaging methods. The majority of stimuli were presented 35 dB above behavioral threshold, which was determined for each subject while in the scanner but in the absence of scanner noise. (The behavioral thresholds obtained in the scanner for the subjects were not reported.) Eight frequency-dependent response regions were identified on the superior temporal lobe, four of which were greater for high-frequency signals and four for low-frequency signals. In a previous study by the same researchers (Talavage et al.,1997), which used frequency sweeps (i.e., center frequency of a narrow band noise swept from low to high or high to low) as stimuli, the progression of cortical activation complemented the frequency specificity in the later study (Talavage et al.,2000) by connecting seven of the eight identified frequency response regions. These studies suggest that multiple tonotopic activation patterns exist across the auditory cortex in humans, similar to those documented in animal models (Reale and Imig,1980; Rauschecker et al.,1997).
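The octave spacing mentioned above follows from the standard definition of an octave as a doubling of frequency. The short Python sketch below is illustrative only: the function name is hypothetical, and 660 and 2490 Hz are the band limits quoted in the text, not the center frequencies actually used by Talavage et al. (2000).

```python
import math


def octave_separation(f_low_hz, f_high_hz):
    """Number of octaves between two frequencies (one octave = a doubling)."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return math.log2(f_high_hz / f_low_hz)


# Band limits quoted in the text (upper edge of the low band, lower edge of the high band).
print(round(octave_separation(660, 2490), 2))

# The 500 and 4000 Hz tones used in several of the studies reviewed above.
print(octave_separation(500, 4000))
```

The quoted band limits are just under two octaves apart, so any low-high pair drawn from below 660 Hz and above 2490 Hz can satisfy a two-octave separation; the 500 and 4000 Hz tones are three octaves apart.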

Little is known about the impact that hearing loss has on tonotopic organization in the human auditory cortex. Animal data indicate that tonotopic organization is disturbed in the presence of hearing impairment (Kaas et al.,1983; Robertson and Irvine,1989; Harrison et al.,1991). Restricted cochlear damage results in significant reorganization of the representation of frequency in the auditory cortex and an expansion of frequency representation into the regions located near the deprived area (Robertson and Irvine,1989; Rajan et al.,1993). Although peripheral damage can alter the cortical spatial representation of frequency in animals, corresponding studies in human auditory cortex have not been conducted. At this time, the functional significance of tonotopic organization is not completely understood.

Stimulus Level

The effects of stimulus level or intensity of the acoustic signal on cortical activation using neuroimaging have been investigated in a few studies with variable results. Jäncke et al. (1998) used pure tones and consonant-vowel-consonant speech stimuli presented binaurally at 75, 85, and 95 dB SPL. The results suggested that activation was greater in the left hemisphere compared to the right in Brodmann's area 22 for speech stimuli at the higher levels of 85 and 95 dB SPL compared to 75 dB SPL. Strainer et al. (1997) evaluated two intensity levels, 20 and 50 dB SL (i.e., sensation level), and showed an increase in the volume of activation at the higher intensity when imaging the primary auditory cortex. These studies demonstrate an increase in the cortical response with increase in intensity level (Fig. 3), although the specifics of the loudness growth function have not been fully addressed.

Figure 3.

fMRI activation for two stimulus levels presented above auditory threshold for a 1 kHz pure tone stimulus. The top row shows responses for 20 dB above threshold at two coronal locations through the cortex, whereas the bottom row demonstrates responses at a sound level of 50 dB above threshold. Note the larger area of activation and the higher (red) correlation magnitude of the response at the higher stimulus level. Adapted with permission from Lasota et al. (2003).

In a more extensive investigation of the effects of signal intensity, 6 dB steps were used between 42 and 96 dB SPL (10 total intensity levels) for a 300 Hz tone presented monaurally to the left ear of subjects (Hart et al.,2002). Three regions of analysis included a primary area on Heschl's gyrus and two nonprimary areas: one area lateral to Heschl's gyrus and the posterior part of the auditory cortex, that of planum temporale. As signal intensity was increased, there was a nonlinear increase in the extent and magnitude of cortical activation. Specifically, Heschl's gyrus showed more sensitivity to increases in level compared to the two nonprimary areas. In a follow-up study, Hart et al. (2003) examined a similar intensity range (i.e., 42–96 dB SPL) in Heschl's gyrus using a low- (300 Hz) and high-frequency (4750 Hz) tonal stimulus presented monaurally to the left ear. Analysis of the number of activated voxels suggested that the 4750 Hz stimulus elicited a growth in activation that was steady across levels, whereas the 300 Hz stimulus showed smaller changes below 66 dB SPL followed by a sharp increase up to the highest intensity tested. This difference in the growth of activation was not as pronounced when analyzing the mean percentage of signal changes, in which case the difference in growth functions between frequencies was significant at the highest intensity only, that of 96 dB SPL. Additional studies are needed to determine the effects of intensity relative to frequency on the hemodynamic response in normal-hearing individuals. This information is required prior to the application of fMRI in the hearing-impaired population, where detection thresholds and growth of loudness will vary across frequencies and subjects as well as within subjects (i.e., between ears).
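The ten levels in the Hart et al. (2002) design follow directly from the 6 dB step size, and each 6 dB step corresponds to roughly a doubling of sound pressure. The sketch below uses the standard definition of dB SPL (20 log10 of the pressure ratio re 20 µPa); the helper names are hypothetical and are not taken from the studies reviewed.

```python
P_REF_PA = 20e-6  # standard reference pressure for dB SPL in air (20 micropascals)


def levels_db(start_db, stop_db, step_db):
    """Stimulus levels from start to stop (inclusive) in fixed dB steps."""
    return list(range(start_db, stop_db + 1, step_db))


def spl_to_pressure_pa(level_db):
    """Convert a level in dB SPL to RMS sound pressure in pascals."""
    return P_REF_PA * 10 ** (level_db / 20)


def pressure_ratio(step_db):
    """Sound-pressure ratio corresponding to a level difference in dB."""
    return 10 ** (step_db / 20)


levels = levels_db(42, 96, 6)
print(len(levels), levels)
print("pressure ratio per 6 dB step:", round(pressure_ratio(6), 3))
for level in (levels[0], levels[-1]):
    print(level, "dB SPL ->", spl_to_pressure_pa(level), "Pa")
```

Because level is a logarithmic measure, equal dB steps are not equal steps in sound pressure, which is worth keeping in mind when reading the growth functions described above.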

Stimulus Rate

Given that the hemodynamic response stems from neural activity, and every neural event, such as an action potential or a postsynaptic membrane potential, is followed by a refractory period during which time the unit may not respond, stimulus rate could influence the magnitude of the BOLD response. In addition, it is uncertain whether auditory responses measured with different techniques would have the same apparent response to a stimulus because the physiologic responses on which the technique is based are different. For example, in an fMRI study using rates of 0.17 to 2.5 Hz for speech syllables, the percent signal change of activated areas increased with increasing rate and was monotonic and nonlinear (Binder et al.,1994). Similarly, a single-subject case study was employed to evaluate the effects of stimulus presentation rate using fMRI and also showed the response to be nonlinear (Rees et al.,1997). This same study assessed the cerebral blood flow response using PET to the same stimuli and, in contrast, the results demonstrated a linear response. The differing results between fMRI and PET suggest a more complex relationship between neural activity, cerebral blood flow, and changes in oxygenation. There were also methodological differences between the two studies that could account for some of the differences, such as passive listening to nouns in a single subject (Rees et al.,1997) compared to an active discrimination task of phonemes performed by five subjects (Binder et al.,1994).

Monaural Compared to Binaural Stimulation

In the auditory system, the pathway from each ear to the contralateral cortical hemisphere comprises more nerve fibers than the pathway from each ear to the ipsilateral hemisphere. Evoked responses to monaural stimulation are stronger in the contralateral compared to the ipsilateral hemisphere (Wolpaw and Penry,1977). In studies of binaural stimulation, however, EEG and MEG recordings have shown greater responses in both hemispheres and more bilateral cortical activation patterns (Loveless et al.,1994). Similar findings have been reported using fMRI and hemodynamic responses when tonal and speech (consonant-vowel syllables) stimuli were presented in the monaural and binaural conditions (Scheffler et al.,1998; Jäncke et al.,2002). That is, there was stronger contralateral activation for monaural compared to binaural stimulation, and the strongest responses were evoked by binaural presentations. (An example of the effects of monaural and binaural stimulation using fMRI in one of our normal-hearing subjects is shown in Fig. 4.) Therefore, it is imperative that the listening condition with respect to monaural and binaural stimulation be taken into consideration when interpreting the activation patterns in left and right cortical hemispheres.

Figure 4.

fMRI activations for speech stimulus /ba/ presented in the right ear (RE), left ear (LE), and binaural (Bin) at 80 dB HL in normal-hearing subjects. Cortical activation of the transverse temporal gyrus and planum temporale is shown for each slice. Axial slices displayed in radiological orientation (right hemisphere to the left, left hemisphere to the right). Note the more contralateral activation for the monaural presentations and the nearly symmetric activation in response to the binaural stimulus.
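One common way to summarize left-right activation patterns of this kind is a laterality index, LI = (L − R)/(L + R), where L and R are activation measures (for example, counts of suprathreshold voxels) in the left and right regions of interest. The index is not taken from the studies reviewed here; the Python sketch below is offered only as an illustration, the function name is hypothetical, and the voxel counts are invented for demonstration.

```python
def laterality_index(left, right):
    """(L - R) / (L + R): +1 is fully left-lateralized, -1 fully right, 0 symmetric."""
    if left < 0 or right < 0:
        raise ValueError("activation measures must be non-negative")
    total = left + right
    if total == 0:
        return 0.0  # no activation in either hemisphere; treated as symmetric here
    return (left - right) / total


# Invented (left, right) suprathreshold voxel counts, for illustration only.
examples = {
    "right ear": (120, 60),
    "left ear": (55, 115),
    "binaural": (130, 125),
}

for condition, (left, right) in examples.items():
    print(condition, round(laterality_index(left, right), 2))
```

With counts like these, monaural presentation yields an index whose sign points to the hemisphere contralateral to the stimulated ear, whereas binaural presentation yields a value near zero.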

Stimuli Presented in Quiet Compared to Noise

In animal studies, single auditory nerve fibers (Kiang,1965) and cochlear nucleus units (Burkard and Palmer,1997) show a decrease in response magnitude for click stimuli in the presence of broadband noise. Electrophysiologic studies in humans of the effects of noise with click-evoked brainstem responses indicate that the magnitude of the response diminishes and latency increases as the level of the noise is increased (Burkard and Hecox,1983). In a study of children with diagnosed learning problems, brainstem and cortical neurophysiologic responses showed abnormalities when compared to normal controls, but only when the stimuli were presented in background noise (Cunningham et al.,2001; Kraus,2001). In neuroimaging studies, the effects of the scanner background noise have been the focus of attention rather than the effects of noise as a direct stimulus masker during auditory presentations through headphones. Because background noise challenges the auditory system, its presence may result in changes to the functional asymmetry of speech and sound processing.

In summary, patterns of cortical activation evoked with auditory signals are dependent on the stimulus parameters (e.g., frequency, stimulus level, rate), listening condition (e.g., monaural or binaural), and complexity (e.g., tonal, speech, presence of background noise). In order to compare results of studies that have used different methodologies (i.e., electrophysiology, MEG, PET, or fMRI), it will be important to understand the neural mechanisms at play with each technique. Careful consideration of the stimulus variables associated with each experimental design is necessary for comparison and interpretation of data across studies using different methods. The establishment of normative data for an identified set of stimulus parameters, recording procedures, and data analyses for a given technique would provide an avenue for comparison of outcomes.


Hemispheric differences are evident in the normal processing of speech sounds (Phillips and Farmer,1990). The term “laterality” also implies hemispheric differences and refers to the dominance of one hemisphere with regard to a particular function. The lateralization of auditory language function to the left hemisphere was reported in early studies (Kimura,1961; Geschwind and Levitsky,1968; Geschwind,1972). However, more recent studies suggest that lateralization may be related more to the nature of rapidly changing acoustic cues than to whether an acoustic signal is speech or nonspeech (Phillips and Farmer,1990; Zatorre et al.,1992; Tallal et al.,1994). Even in infancy, evoked potentials elicited with strings of syllables show significantly larger responses over the left hemisphere compared to the right, suggesting a possible functional asymmetry for processing short syllables in the left hemisphere (Dehaene-Lambertz and Dehaene,1994). Dichotic listening tasks, in which a different auditory stimulus is presented to each ear at the same time, indicate that subjects are more accurate in their recognition of stimuli presented to the right ear than to the left ear. This right-ear advantage supported the theory that the left hemisphere, contralateral to the right ear, was specialized for language (Kimura,1967).

Current studies support the notion that the left auditory cortex responds to temporal changes, whereas the right auditory cortex responds to frequencies or spectral content (Liégeois-Chauvel et al.,1999,2001; Zatorre et al.,2002). Even when nonlinguistic stimuli (i.e., pure tones with frequency glides of either short or long duration) have been presented to normal-hearing subjects, PET scan results demonstrate blood flow changes in left cortical areas and right cerebellum, supporting left hemisphere processing of acoustic transients (Johnsrude et al.,1997). In a study by Belin et al. (1998), PET data indicated that the right auditory cortex responded only to a slow formant transition of 200 msec, whereas the left auditory cortex responded to transitions of either 40 or 200 msec. These data suggested that the left cortex had an enhanced ability to respond to fast formant transitions. Taken together, these studies challenge the notion that the left hemisphere is specialized for speech and the right hemisphere is specialized for music. Instead, the data indicate that the processing of fast temporal cues, albeit critical for speech processing, occurs best in left auditory cortex and that tonal or spectral information is more efficiently processed in the right auditory cortex (Zatorre et al.,2002).

At the same time, studies of linguistic processing in tonal languages (Gandour et al.,1998,2004) and left hemisphere processing of visual sign languages by individuals with profound hearing impairment (Petitto et al.,2000; Finney et al.,2001) suggest that the left hemisphere may have a specialized role in the processing of language and communication. When subjects listened to a vowel that existed in their native language, larger electrophysiologic responses were recorded in the left compared to the right hemisphere (Näätänen et al.,1997). In contrast to this asymmetry, subject responses were similar in magnitude in the two hemispheres when presented with a nonprototype of the vowel. Therefore, the linguistic or acoustic nature of the stimulus may dictate the involvement of each cortical hemisphere. In a recent review, Zatorre et al. (2002) proposed that perhaps hemispheric asymmetries exist to meet the need of optimizing both temporal and spectral processing during everyday listening and challenging communication demands. Therefore, with two systems, one in each cortical hemisphere, temporal and spectral processing demands can be serviced by the partnership of the two systems relative to the listening environment.


There are asymmetries in the anatomical structures of the left and right auditory cortex in humans. For example, Penhune et al. (1996) studied the location and extent of the area of the primary auditory cortex in humans and found that the left primary auditory cortex was larger than the right, and that white matter volume was greater in the left Heschl's gyrus than the right. Differences in the cell organization of the right and left hemispheres also have been observed. The left auditory cortex has larger layer III pyramidal cells, wider cell columns, and contact with a greater number of afferent fibers compared to the right hemisphere (Seldon,1981a,1981b,1982; Hutsler and Gazzaniga,1996).

Planum temporale, posterior to Heschl's gyrus, is thought to be a key site in communication processing in humans. Measurements of the surface area of planum temporale (Geschwind and Levitsky,1968) and cytoarchitectonic studies (Galaburda et al.,1978) in humans give evidence that planum temporale is larger on the left than on the right. More recent findings suggest there is also asymmetry in the volume of white matter and in the extent of myelination in axons (Anderson et al.,1999) between left and right planum temporale. The fact that myelination and axon number are greater in the left compared to the right planum temporale and auditory cortex region suggests that the left hemisphere may contribute to faster transmission and better temporal resolution, ideal for the transmission of rapidly changing speech cues (Zatorre et al.,2002; Hutsler and Galuske,2003).

The corpus callosum, the largest fiber tract in the brain and the neural pathway that connects the right and left hemispheres, is thought to contribute to hemispheric asymmetries by virtue of either inhibitory or excitatory functions (for review, see Bloom and Hynd,2005). Enhancement of the contralateral hemisphere and suppression of the ipsilateral hemisphere may occur across the corpus callosum, resulting in dominance of one hemisphere for a particular stimulus or function. At this time, the role of interhemispheric connections and the transfer of auditory information between hemispheres remain uncertain.


Although anatomical hemispheric asymmetries were originally thought to exist only in humans, subsequent studies support asymmetries in animals as well. For example, Gannon et al. (1998) showed left hemisphere dominance of planum temporale in chimpanzees. Of the 18 subjects, 17 (94%) had a larger left than right planum temporale. The significance of this finding with respect to the evolution of humans and the role of planum temporale in communication is not entirely clear. Further evidence of laterality in nonhuman primates was demonstrated by Petersen et al. (1978) in Japanese macaques, in which a right ear advantage was noted during analysis of communicatively relevant acoustic dimensions (i.e., peak fundamental frequency in the primate call). In a study of avian song perception (Cynx et al.,1992), hemispheric dominance was assessed by lesioning the ipsilateral auditory nucleus of the thalamus, which interrupted the input to either the right or left hemisphere. In a behavioral song discrimination task, the birds demonstrated a left-side task-specific dominance, suggesting that the right and left hemispheres process sounds differently. In the mouse brain (Ehret,1987), the left hemispheres of maternal mice showed preferential recognition of communication calls of their young offspring. These results support a right ear and therefore left hemisphere advantage for the processing of sounds involved in communication in mice.


There is evidence that asymmetric patterns of activation occur at a subcortical level. King et al. (1999) studied differences in neural representations by recording within the right and left medial geniculate bodies of anesthetized guinea pigs. Stimuli were 2000 Hz tone bursts, clicks, and speech (i.e., synthesized /da/) stimuli presented at 85 dB SPL to the right, left, and both ears. Onset response amplitudes were measured and were larger in the left compared to the right auditory thalamus in 10 of 12 animals, suggesting some degree of asymmetry at a subcortical level.

Asymmetry has been suggested at the level of the cochlea in a study of infants using either transient-evoked otoacoustic emissions (TEOAE; elicited with clicks) or distortion product otoacoustic emissions (DPOAE; elicited with continuous tones) (Sininger and Cone-Wesson,2004). In infants, a significant effect of OAE type and ear of stimulation was found. (Click TEOAEs were larger when evoked in right ears and tonal DPOAEs were larger when evoked in left ears.) Since otoacoustic emission measures reflect activity of the outer hair cells, these findings suggest that some differentiation in acoustic stimulus processing may occur at peripheral levels, thereby facilitating higher-level hemispheric sound processing.


Asymmetric response patterns are affected, however, by the type of stimulus (e.g., tones, clicks, speech). For example, the degree of asymmetry measured with evoked responses when recording from the medial geniculate bodies in guinea pigs was significantly different for a synthesized speech stimulus /da/ compared to a 2000 Hz tone or a click (King et al.,1999). In this study, there appeared to be a continuum with respect to the amount of asymmetry, with the greatest asymmetry noted for the speech stimulus followed by the click and then the tone, which showed no asymmetry. Using a novel continuum of auditory signals in humans, Rinne et al. (1999) tested the hypothesis that hemispheric activation changes would occur from right to left hemisphere as stimuli changed from nonspeech to speech. Electrophysiologic responses showed that activation was greater in the right cortical hemisphere for a tonal stimulus with a shift in activation to the left hemisphere as stimuli became more phonetic in nature. Taken together, these results indicate that asymmetric patterns are stimulus-dependent, and that right and left hemispheres have different roles in the processing of acoustically complex signals.

Hemispheric asymmetries have been reported to differ at varied stimulus intensities. Hart et al. (2002) reported that for monaural presentation of tones to the left ear and at low stimulus levels (i.e., below 72 dB SPL), the extent of activation in right and left Heschl's gyrus (HG) was similar. At higher levels such as 92 and 96 dB SPL, contralateral hemispheric activation in HG was significantly greater compared to the ipsilateral hemisphere. This result was not observed for two other regions studied, an area lateral to HG and planum temporale, the posterior part of the auditory cortex. The authors noted that these findings suggest that HG may have a greater role in the processing of intensity levels than the nonprimary areas of interest and that intensity may affect hemispheric asymmetric response patterns.

The effects of background noise on hemispheric asymmetry were assessed using the magnetic equivalent (MMNm) of the mismatch negativity response in human subjects (Shtyrov et al.,1998). Using an odd-ball paradigm, speech stimuli /pa/ (standard stimulus) and /ka/ (deviant stimulus) were presented binaurally at 60 dB above threshold to normal-hearing subjects. The MMNm response is obtained by subtracting the responses to the standard stimuli from those elicited by the deviant stimuli. Three stimulus conditions were evaluated: speech in quiet and speech in two white noise conditions (+15 and +10 signal-to-noise ratios). MMNm peak amplitudes and dipole moments indicated that responses were larger in the left hemisphere in the quiet condition. In the noise conditions, the hemispheric asymmetry decreased and the responses in the right hemisphere increased. These results suggest that noise may disrupt the more typical hemispheric asymmetry with redistribution to the right cortex.
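The subtraction that defines the MMNm can be illustrated with a minimal sketch. It assumes that each response has already been averaged across trials and sampled at the same time points; the function names and the numeric values are hypothetical and are not taken from Shtyrov et al. (1998).

```python
def difference_wave(deviant, standard):
    """Subtract the averaged standard response from the averaged
    deviant response, sample by sample."""
    if len(deviant) != len(standard):
        raise ValueError("responses must have the same number of samples")
    return [d - s for d, s in zip(deviant, standard)]


def peak_amplitude(wave):
    """Return the sample with the largest absolute value, sign preserved."""
    return max(wave, key=abs)


# Hypothetical averaged responses in arbitrary units (illustrative only,
# not data from the study).
standard_avg = [0.0, 1.0, 2.0, 1.0, 0.0]
deviant_avg = [0.0, 1.5, 4.0, 2.5, 0.5]

mmnm = difference_wave(deviant_avg, standard_avg)
mmnm_peak = peak_amplitude(mmnm)
```

Computing such a difference wave separately for sensors over each hemisphere would allow the peak amplitudes to be compared between left and right, which is the kind of comparison described above.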


Animal studies indicate that unilateral hearing loss modifies the asymmetric cortical response patterns of the auditory cortex (Kitzes,1984; Reale et al.,1987). In addition, the amount of asymmetry in humans has been shown to differ between normal-hearing (NH) and unilateral hearing loss (UHL) subjects in response to tone bursts using magnetoencephalographic responses (Vasama and Makela,1995), click stimuli using auditory-evoked potentials (Ponton et al.,2001), and pulsed tonal stimuli using fMRI (Scheffler et al.,1998). In a study of hemispheric activation, measures obtained ipsilateral and contralateral to the ear of stimulation using clicks in hearing loss subjects showed hemispheric ear-dependent differences (Khosla et al.,2003).

In a current study (Firszt et al.,2005), the effects of unilateral profound hearing loss on hemispheric activation patterns using speech stimuli were investigated. Normal-hearing subjects (n = 9) and subjects with profound hearing loss in the left ear and normal hearing in the right ear (n = 7) were presented with a speech stimulus /ba/ in the right ear at 80 dB HL. Hemispheric activation measures were obtained using fMRI with a 1.5 T magnet and a sparse sampling method (Hall et al.,1999). In normal-hearing subjects, stimulation in the right ear resulted in greater activation in the contralateral (left) hemisphere compared to the ipsilateral (right) hemisphere. For the unilateral hearing loss subjects, a decrease was seen in the contralateral hemisphere and an increase in the ipsilateral hemisphere compared to normal-hearing subjects. An example from a normal-hearing and a unilateral hearing loss subject is displayed in Figure 5 and group data are shown in Figure 6. Comparison of the contralateral and ipsilateral hemispheres for the subjects indicates that cortical asymmetry is strong in response to a speech stimulus for the normal-hearing subjects, and a notable decrease in asymmetry occurs for the unilateral hearing loss subjects. For this group, the asymmetry reduction is a result of both a decrease in the left (contralateral) hemisphere and an increase in the right (ipsilateral) hemisphere. These findings are consistent with those of previous reports in humans (Scheffler et al.,1998). This study documents such findings with the use of a speech stimulus and a neuroimaging technique. Further experiments are underway to explore the functional significance of changes in hemispheric asymmetry in individuals with unilateral profound hearing loss using complex stimuli. These directions should provide insight into the reorganization of the auditory system when sound deprivation occurs.

Figure 5.

Comparison of a normal-hearing (NH; left image) and unilateral hearing loss (UHL; right image) subject. Speech stimulus /ba/ is presented at 80 dB HL in the right ear of both subjects. The UHL subject has normal hearing thresholds in the right ear and profound hearing loss in the left ear. Axial slices are shown in radiological orientation (right hemisphere to the left, left hemisphere to the right). The NH subject shows greater asymmetry between hemispheres compared to the UHL subject, for whom activation is more balanced.

Figure 6.

Averaged hemispheric activation (voxel number in transverse temporal gyrus and planum temporale) for the left and right hemispheres for normal-hearing (NH) and unilateral hearing loss (UHL) subjects displayed for speech stimulus /ba/ presented to the right ear at 80 dB HL. In NH subjects, stimulation results in greater activation in the contralateral (left) hemisphere compared to the ipsilateral (right) hemisphere. For UHL subjects and right ear stimulation (left ear deafness), a decrease is seen in the contralateral hemisphere and increase in the ipsilateral side, resulting in less asymmetry compared to NH subjects.
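One way to express the degree of asymmetry in voxel counts such as these as a single number is a laterality index. The sketch below uses the commonly applied ratio (left − right) / (left + right); this index is an assumption introduced here for illustration and is not stated to be the measure used by the authors. The voxel counts are hypothetical and are not the values plotted in Figure 6.

```python
def laterality_index(left, right):
    """Return (left - right) / (left + right).

    +1 means activation only in the left hemisphere, -1 only in the
    right hemisphere, and 0 means symmetric activation.
    """
    total = left + right
    if total == 0:
        raise ValueError("no activation in either hemisphere")
    return (left - right) / total


# Hypothetical voxel counts (illustrative only, not study data).
nh_index = laterality_index(left=300, right=100)   # strongly left-lateralized
uhl_index = laterality_index(left=220, right=180)  # closer to symmetric
```

With these illustrative numbers, a decrease in left (contralateral) activation combined with an increase in right (ipsilateral) activation moves the index toward zero, which is the pattern of reduced asymmetry described for the UHL group.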


Changes in hemispheric asymmetry may have a negative impact on the ability to process fast acoustic transitions such as those that are necessary to perceive speech. In addition, it may be that some clinical populations or subject characteristics are more vulnerable to changes than others. Individuals with left hemisphere cortical damage show speech perception difficulties (Auerbach et al.,1982). Studies using electrical stimulation mapping of epileptic patients show left hemisphere specialization for language (Ojemann,1983). In children with learning problems, atypical hemispheric specialization has been reported behaviorally and at the neural level (Dawson et al.,1989; Mattson et al.,1992). Neuroimaging studies have found a loss of asymmetric characteristics between the left and right hemispheres for language-impaired populations, including those with dyslexia (Leonard et al.,1993; Galaburda et al.,1994).

An electrophysiologic study in the elderly indicated age-related changes in cortical hemispheric patterns (Bellis et al.,2000). Synthetic speech syllables were used to elicit the neurophysiologic P1-N1 response in children, young adults, and older adults (over 55 years of age). Stimuli were presented at 75 dB SPL monaurally to the right ear. The cortical response under study was asymmetric in the two younger groups favoring the left hemisphere, whereas responses were symmetric in the elderly group and attributable to an increase in right hemispheric activation. The older subjects also demonstrated significantly poorer abilities to discriminate rapid spectrotemporal changes in speech compared to the two younger groups, a finding consistent with many reports of speech perception difficulties in the elderly (Jerger et al.,1994; Fitzgibbons and Gordon-Salant,2001).

Neuroimaging techniques have been used to study the physiologic mechanism of tinnitus and have shown that the loudness of tinnitus is reflected in increases and decreases in cortical responses (Cacace et al.,1996; Giraud et al.,1999; Lockwood et al.,1999a) and that asymmetric patterns are disturbed (Melcher et al.,1999). In subjects with normal hearing and monaural (i.e., lateralized) tinnitus, greater asymmetric responses in the inferior colliculi have been noted compared to control subjects (Levine et al.,1998; Melcher et al.,1999). Taken together, these findings, in addition to previous results, suggest that there may be functional implications for atypical hemispheric patterns and that the loss of normal asymmetries (i.e., symmetric responses or exaggerated asymmetric responses) should be explored further. Neuroimaging tools may ultimately provide insight into the functionality of cortical hemispheric differences in those with clinically significant impairments.


Auditory hemispheric patterns have been shown to change or reorganize with sound deprivation (e.g., unilateral hearing loss) and may therefore also change with sound experience or training specific to the auditory system. Behavioral studies indicate that auditory training improves speech recognition in individuals with hearing loss and is a critical component for the development of communication in children with substantial hearing impairment. If auditory training results in greater and more precise neural activity, changes in cortical activation may follow.

There are few studies of neurophysiologic changes recorded before and after training of the auditory pathway. In normal-hearing subjects, cortical electrophysiologic responses (i.e., mismatch negativity response, or MMN) evoked with nonnative speech syllables were initially symmetric in human subjects. However, responses were larger for the left hemisphere compared to the right following training (Tremblay et al.,1997). In children with learning problems who received training, some cortical responses improved in morphology and response areas were shifted more to the left rather than the right hemisphere (King et al.,2002).

A number of recent studies have looked at training effects in other sensory regions using neuroimaging measures, such as in the visual cortex (Kourtzi et al.,2005; Sigman et al.,2005) and motor cortex (Nyberg et al.,2005; Puttemans et al.,2005). For example, neural changes in response to category learning were assessed with fMRI by monitoring regions of activation before and after training (Little et al.,2004). This comparison showed that early training resulted in both behavioral increases in accuracy and response time and increases in volume of activation in regions known to be involved in visual and spatial processing. As training progressed, activation subsequently decreased, suggesting that there is variation in the time course of regional activation that is dependent on the course of learning.

Additional experiments using neuroimaging techniques to study the effects of auditory training and plasticity on hemispheric activation patterns in human subjects may provide further insight into how the auditory system changes with experience. If behavioral changes are supported by neural changes that result in more efficient neural connections in a specific auditory region, increased activation in the specified region may result. Behavioral improvements may also manifest as a recruitment of additional regions, therefore redefining the region and corresponding volume of activation.


Knowledge about the structure and function of the human auditory cortex, including the study of hemispheric asymmetry, has been facilitated by various methods in auditory neuroscience, including recent neuroimaging techniques. Anatomic studies in humans and animals give evidence that asymmetries exist in both cortical and subcortical structures. Hemispheric asymmetries are affected by a number of variables, including stimulus type (e.g., tones compared to speech, speech in quiet compared to speech in noise), presence of pathology (e.g., hearing loss, tinnitus), and subject characteristics (e.g., children with learning problems, dyslexia). Although it appears that the right and left hemispheres are not identical in structure or function, and that there is a correlation between anatomic, structural, and functional asymmetries, there is much to be learned about the differential representation of speech sounds in human cerebral hemispheres. Neuroimaging techniques, such as fMRI, will accelerate our understanding of human auditory cortex and the relationship between structure and function.