• Open Access

Assessment of tonotopically organised subdivisions in human auditory cortex using volumetric and surface-based cortical alignments


  • Dave R.M. Langers

    Corresponding author
    1. National Institute for Health Research Nottingham Hearing Biomedical Research Unit, School of Clinical Sciences, University of Nottingham, Queen's Medical Centre, Nottingham, United Kingdom
    2. Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
    • NIHR Nottingham Hearing Biomedical Research Unit, Ropewalk House, 113 The Ropewalk, Nottingham NG1 5DU, United Kingdom. E-mail: davey.langers@nottingham.ac.uk



Although orderly representations of sound frequency in the brain play a guiding role in the investigation of auditory processing, a rigorous statistical evaluation of cortical tonotopic maps has so far hardly been attempted. In this report, the group-level significance of local tonotopic gradients was assessed using mass-multivariate statistics. The existence of multiple fields on the superior surface of the temporal lobe in both hemispheres was shown. These fields were distinguishable on the basis of tonotopic gradient direction and can likely be identified with the human homologues of the core areas AI and R in primates. Moreover, an objective comparison was made between volumetric and surface-based registration methods. Although the surface-based method resulted in a better registration across subjects of the grey matter segment as a whole, the alignment of functional subdivisions within the cortical sheet did not appear to improve over volumetric methods. This suggests that the variable relationship between the structural and the functional characteristics of auditory cortex is a limiting factor that cannot be overcome by morphology-based registration techniques alone. Finally, to illustrate how the proposed approach may be used in clinical practice, the method was used to test for focal differences regarding the tonotopic arrangements in healthy controls and tinnitus patients. No significant differences were observed, suggesting that tinnitus does not necessarily require tonotopic reorganisation to occur. Hum Brain Mapp 35:1544–1561, 2014. © 2013 Wiley Periodicals, Inc.


Topographic representations prominently occur in the mammalian brain and are postulated to be important, if not fundamental, for the processing of sensory information [Kaas, 1997; Weinberg, 1997]. Except for olfaction [Stettler and Axel, 2009], all five classical senses feature representations in primary sensory cortex that mirror the spatial arrangement of receptors in the sensory epithelia: visual cortex contains retinotopic maps representing retinal location [Engel et al., 1997], sensorimotor cortex contains somatotopic maps of the body [Mattay and Weinberger, 1999], auditory cortex contains tonotopic—a.k.a. cochleotopic—maps reflecting the cochlear frequency representation [Kaas and Hackett, 2000] and gustatory cortex contains gustotopic maps of the oral taste fields [Chen et al., 2011]. Brain researchers often rely on these topographic maps as a tool to discriminate between meaningful cortical subdivisions. For example, reversals between multiple topographic maps allow distinctions to be made between visual areas V1, V2 and V3 [Wandell and Winawer, 2011], primary somatosensory Brodmann areas 1, 2, 3a and 3b [Sanchez-Panchuelo et al., 2012], and core auditory fields AI, R and RT [Bendor and Wang, 2005]. The ability to accurately delineate cortical fields makes it possible to identify and study other distinguishing functional characteristics of such subdivisions as well. This has resulted in significant progress concerning our understanding of the hierarchical processing that occurs in various central sensory systems.

In the neuroimaging literature, considerable effort has been devoted to the development of solid statistical methods to assess stimulus-evoked brain activation. With this in mind, it is highly surprising that the consistency of topographic maps across subjects is typically assessed merely by visual inspection. Statistical assessments have so far been limited to the testing for differential responses to distinct stimuli. This may reveal and localise stimulus selectivity, but it does not prove the existence of a systematic and gradual spatial variation in response characteristics that would be required for a topographic map to exist. In particular, statistical parametric maps that report the significance of a local topographic gradient never seem to have been generated. This may be related to the fact that, unlike activation levels, topographic progressions cannot simply be quantified by a single scalar number. Instead, they are characterised by a multidimensional vector entity that indicates the local direction and magnitude of steepest change in sensitivity with regard to the stimulus characteristic of interest, or—equivalently—the local orientation and density of the corresponding isocontours. Instead of the common statistics that underlie mass-univariate t- and F-tests, their assessment therefore requires the use of mass-multivariate generalisations.

This report will focus on audition exclusively. Tonotopic maps are comparatively simple in the sense that they map one stimulus parameter only: sound frequency. Other acoustic parameters (e.g., pitch, bandwidth, modulation rate and intensity) have been suggested to be represented in auditory cortex as well [Barton et al., 2012; Bilecen et al., 2002; Hall et al., 2006; Moerel et al., 2012; Overath et al., 2012; Woods et al., 2010], but evidence that these representations might be topographic is weak. Moreover, the peripheral representation in the cochlea of the inner ear is essentially one-dimensional. In contrast, retinotopy and somatotopy inherently appear two-dimensional, whereas gustotopy has a discrete nature. Therefore, the tonotopic organisation of the central auditory system seems an ideal test case to explore first. In addition, the statistical assessment of tonotopic gradient maps in the brain is particularly pertinent for a number of practical reasons.

First, the tonotopic organisation of auditory cortex has long remained elusive and has only recently been clarified in humans. Early reports suggested the existence of tonotopy, but lacked the resolution to distinguish between multiple fields in the same hemisphere [Lauter et al., 1985; Romani et al., 1982; Wessinger et al., 1997]. About one decade ago, a couple of seminal studies identified more detailed tonotopic maps that involved two to six progressions per hemisphere [Formisano et al., 2003; Talavage et al., 2004], but their interpretations were hard to reconcile. Recent studies have resulted in a remarkably consistent picture regarding the frequency representation in human auditory cortex. Nevertheless, it remains ambiguous whether core auditory fields are aligned more or less along Heschl's gyrus (HG) [Moerel et al., 2012] or whether they fold perpendicularly across [Humphries et al., 2010]. Furthermore, additional tonotopic progressions possibly exist in surrounding areas [Striem-Amit et al., 2011], but this remains to be established. A means to objectively assess the existence of tonotopic gradients would be highly valuable to assist in the resolution of these issues.

Second, there has been some debate regarding the optimal method to account for differences in cortical morphology. Across individuals and hemispheres, HG, which hosts primary auditory cortex, can occur as a single gyrus, or it may be forked or duplicated [Leonard et al., 1998]. At the same time, the relationship between cortical structure and function is unclear [Rademacher et al., 2001]. Some authors have successfully employed volumetric normalisation to construct detailed tonotopic maps [Langers and van Dijk, 2012], whereas the results of others suggest that the use of surface-based registration techniques should improve performance [Da Costa et al., 2011]. It has already been demonstrated that sophisticated non-linear co-registration can improve sensitivity over simpler affine methods in primary auditory regions, and that surface-based techniques may perform better still [Desai et al., 2005; Tahmasebi et al., 2012; Tucholka et al., 2012]. However, the latter studies did not specifically attempt to assess the co-registration of functional subdivisions within auditory cortex. This study aims to compare the consistency of the volumetric and surface-based localisation of tonotopic fields.

Third, various clinical applications exist that may benefit from a rigorous assessment of topographic maps. In particular, tinnitus is a prevalent disorder characterised by the perception of a phantom sound in the absence of an acoustic sound source. Theories regarding the pathophysiology of tinnitus assign an important causal role to tonotopic remapping [Bartels et al., 2007; Herraiz et al., 2009; Roberts et al., 2010]. It is hypothesised that peripheral hearing loss in tinnitus patients induces deprivation in neurons that are tuned to the affected frequencies. This encourages plastic shifts in characteristic frequency to occur towards the intact frequency regions, resulting in a distorted tonotopic map. The ensuing over-representation of the edge frequency of hearing loss would presumably alter the levels of spontaneous activity and synchronicity, which could be perceived as tinnitus. Thus, a measurable neural correlate of tinnitus might be an abnormal tonotopic organisation. Animal studies have indeed revealed substantial changes in tonotopic maps in the presence of tinnitus, either induced by acoustic trauma or by salicylate administration [Stolzberg et al., 2011; Yang et al., 2011]. Behavioural signs of tinnitus disappeared when these changes were reversed, further suggesting a direct causal relationship [Engineer et al., 2011]. Some neuroimaging data in humans also indicate that tonotopic reorganisation occurs in tinnitus patients [Mühlnickel et al., 1998; Wienbruch et al., 2006]. However, in a recent high-resolution functional magnetic resonance imaging (fMRI) study tonotopic representations in the auditory cortex were not judged to differ between tinnitus patients and matched controls, leading the authors to conclude that macroscopic tonotopic reorganisation is not required for tinnitus to arise [Langers et al., 2012]. The methodology that is described in this article was developed in an effort to investigate the tonotopic organisation in more detail and with better precision.

In this study, tonotopic maps were determined in 80 hemispheres. Results will be described and discussed in relation to the three open issues listed above.



Forty subjects were included in this fMRI study on the basis of written informed consent and in approved accordance with the requirements of the medical ethical committee at the University Medical Center Groningen in the Netherlands. Subjects reported no history of neurological or psychiatric disorders. Twenty control subjects had normal hearing (CON: gender 4♂, 16♀; handedness 17R, 3L; age 33 ± 13, range 21–60 years), whereas the other 20 subjects suffered from chronic subjective tinnitus (TIN: gender 8♂, 12♀; handedness 19R, 1L; age 46 ± 11, range 26–60 years). These same groups were reported earlier [Langers and van Dijk, 2012; Langers et al., 2012], but the previous analyses differed from those in this study.

Hearing thresholds were determined by means of tone audiometry for the left and right ear separately. Averages over both ears are shown in Figure 1. Pure tone average thresholds across the six frequencies that were used in the experiment (0.25–8.00 kHz) equalled 5 ± 5 dB HL in the CON group and 8 ± 5 dB HL in the TIN group (mean ± SD); the groups did not differ significantly (P > 0.1).

Figure 1.

Hearing thresholds were measured at all octave frequencies from 0.25 to 8.00 kHz. Results were averaged over both ears and shown by means of box plots (showing inter-quartile ranges for both subject groups). Stimuli were presented at two intensity levels that differed by 20 dB, approximately indicated by the horizontal grey lines.

Imaging Paradigm

During the imaging session, subjects were placed supinely in the bore of a 3.0-T MR system (Philips Intera, Best, The Netherlands) that was equipped with an eight-channel phased-array transmit/receive head coil. The functional imaging session included three 8-min runs, each consisting of a dynamic series of 40 identical high-resolution T2*-sensitive gradient-echo echo-planar imaging (EPI) volume acquisitions (TR, 12.0 s; TA, 2.0 s; TE, 22 ms; FA, 90°; EPI-factor, 37; SENSE-factor, 2.7; matrix, 128 × 128 × 40; resolution, 1.5 × 1.5 × 1.5 mm³; interleaved slice order, no slice gap). A sparse, clustered-volume sequence was employed to avoid interference from acoustic scanner noise [Edmister et al., 1999; Hall et al., 1999]. The scanner coolant pump and fan were turned off during imaging and subjects wore foam ear plugs to further diminish ambient noise levels. The acquisition volume was positioned in an oblique axial orientation, tilted forward parallel to the Sylvian fissure and approximately centred on the superior temporal sulci. Preparation scans were acquired to achieve stable image contrast and to trigger the start of stimulus delivery, but these were not included in the analysis.

During the 10 s of silence between functional acquisitions, a sequence of 50 identical 100-ms tone stimuli was presented at a rate of 5 Hz by means of MR-compatible electrodynamic headphones (MR Confon GmbH, Magdeburg, Germany) [Baumgart et al., 1998]. The frequencies of the tones equalled 0.25, 0.50, 1.00, 2.00, 4.00 or 8.00 kHz, and stimuli were presented at either approximately 30 or 50 dB SPL. An additional silent condition was included. For each tone sequence independently, the stimulus condition (i.e., frequency and intensity, or silence) was randomly chosen.

In an attempt to control their state of attention, subjects performed an engaging visual/emotional task that was unrelated to the sound stimuli. Subjects were instructed to empathise with series of pictures [Lang et al., 2008], decide whether the affective valence of the depicted scenes was positive, negative or neutral, and respond by means of button presses. This paradigm was described in more detail previously [Langers and van Dijk, 2012].


Data were pre-processed using the SPM8b software package (Wellcome Department of Imaging Neuroscience, http://www.fil.ion.ucl.ac.uk/spm/) [Friston et al., 2007]. Contrast differences between odd and even slices due to the interleaved slice order were eliminated by interpolating between pairs of adjacent slices, shifting the imaging grid over half the slice thickness. Functional imaging volumes were corrected for motion effects using rigid body transformations, and the anatomical images were co-registered to the functional volumes. All images were normalised into Montreal Neurological Institute (MNI) stereotaxic space using a default 25-mm non-linear frequency cut-off, and re-sampled at a 2-mm resolution. Finally, a logarithmic transformation was carried out in order to express all derived voxel signal measures in natural units of percentage signal change relative to the mean.

Individual cortical surface meshes were generated using the standard processing pipeline of the FreeSurfer v5.1.0 software package (Martinos Center for Biomedical Imaging, http://surfer.nmr.mgh.harvard.edu/) [Dale et al., 1999]. Vertex coordinates were transformed to MNI space using the SPM8b normalisation parameters. Per subject, the resulting white matter and pial surfaces that bound the grey matter on the interior and exterior side were interpolated midway to obtain one surface per hemisphere that ran centrally through the cortical grey matter. Proper alignment was verified through visual inspection by overlaying the extracted surface on top of the matching anatomical image. Meshes were mapped onto a sphere and registered across subjects by a warp that optimised the match with the cortical gyration pattern of a standard morphological template [Fischl et al., 1999]. All individual meshes were then re-sampled to the same icosahedral grid, resulting in a one-to-one correspondence of vertices across all subjects. The average vertex density equalled 138 vertices/cm² on the cortical surface in MNI space, indicating that neighbouring vertices were typically spaced at ∼1 mm. Functional data were assigned to all vertices by trilinear interpolation of the normalised functional volumes.
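
As a rough check on the quoted spacing, the vertex density can be converted into a typical neighbour distance. The worked arithmetic below assumes a regular triangular tessellation in which every vertex owns a hexagonal cell of area (√3/2)·d²; this idealisation and the symbols A_vertex and d are illustrative assumptions, not quantities reported in the text.

```latex
A_{\text{vertex}} = \frac{100\ \mathrm{mm^2}}{138} \approx 0.725\ \mathrm{mm^2},
\qquad
d = \sqrt{\frac{2\,A_{\text{vertex}}}{\sqrt{3}}}
  \approx \sqrt{0.837\ \mathrm{mm^2}}
  \approx 0.91\ \mathrm{mm}
```

Under this assumption the estimate lands close to the ∼1 mm spacing stated above.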

Finally, the three-dimensional functional image volumes were smoothed by convolution with a 5-mm full-width at half-maximum (FWHM) Gaussian kernel. In parallel, the two-dimensional functional surface data were smoothed by repeatedly averaging all vertices with their neighbours in the grid until an average equivalent FWHM of 5 mm was reached; in a supplementary analysis, surface-based smoothing was increased to an FWHM of 8 mm.
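
For readers who think of Gaussian kernels in terms of their standard deviation, the standard FWHM-to-σ relation gives the following values for the two kernel sizes mentioned above. The symbol σ is introduced here for illustration only and is not reported in the text.

```latex
\sigma = \frac{\mathrm{FWHM}}{2\sqrt{2\ln 2}} \approx \frac{\mathrm{FWHM}}{2.355},
\qquad
\mathrm{FWHM} = 5\ \mathrm{mm} \;\Rightarrow\; \sigma \approx 2.12\ \mathrm{mm},
\qquad
\mathrm{FWHM} = 8\ \mathrm{mm} \;\Rightarrow\; \sigma \approx 3.40\ \mathrm{mm}
```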

Frequency Gradients

Per subject, a general linear regression model (GLM) was constructed that included 12 regressors of interest, modelling the 6 frequencies × 2 intensity levels, relative to silence. Additional variables were included to account for task effects (two regressors modelling the reported positive or negative affective valences, relative to neutral), residual motion effects (six regressors containing translation and rotation parameters) and baseline and drift effects (12 regressors modelling a 3rd-degree polynomial for each of the three runs separately). The GLM was applied to the smoothed data of all voxels and vertices, and the extracted responses were averaged across the two presentation levels to obtain a single-response level β per stimulus frequency. In addition, the average response b to all stimulus conditions was determined.
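
The regressor bookkeeping above can be illustrated with a small sketch that builds only the baseline and drift part of the design matrix and counts the total number of columns. It assumes plain polynomial terms (1, t, t², t³) on a scaled time axis; the actual basis used in the study is not specified, and the names `drift_block` and `n_columns` are hypothetical.

```python
import numpy as np

# From the text: three runs of 40 volumes, 3rd-degree polynomial per run.
N_RUNS, N_VOLS, DEGREE = 3, 40, 3

def drift_block(n_runs=N_RUNS, n_vols=N_VOLS, degree=DEGREE):
    """Block-diagonal baseline/drift regressors, one polynomial set per run.

    Assumption: simple powers of a time axis scaled to [-1, 1].
    """
    t = np.linspace(-1.0, 1.0, n_vols)                # scaled time within one run
    poly = np.vander(t, degree + 1, increasing=True)  # columns: 1, t, t^2, t^3
    block = np.zeros((n_runs * n_vols, n_runs * (degree + 1)))
    for r in range(n_runs):
        rows = slice(r * n_vols, (r + 1) * n_vols)
        cols = slice(r * (degree + 1), (r + 1) * (degree + 1))
        block[rows, cols] = poly                      # each run gets its own terms
    return block

drift = drift_block()
# 12 tone regressors + 2 task + 6 motion + drift columns
n_columns = 12 + 2 + 6 + drift.shape[1]
```

With four polynomial terms per run, the drift block contributes the 12 columns mentioned in the text, so the full model would contain 32 regressors for 120 volumes.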

To quantify frequency tuning, the responses to all six frequencies were combined according to the following equation:

$$f = \frac{\sum_{k=-2}^{+3} k\,\beta_k^2}{\sum_{k=-2}^{+3} \beta_k^2} \tag{1}$$

The obtained frequency index f constitutes an average of the presentation frequencies (expressed as k = −2…+3 octaves relative to 1 kHz), weighted by the corresponding squared response magnitudes (βk²), and can therefore be regarded as the centre of mass of the “response power spectrum” on a logarithmic frequency scale.
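
The centre-of-mass definition can be sketched in a few lines. The example below is a minimal illustration for a single voxel or vertex; the function names and the response values are made up, and returning NaN for an all-zero response is an assumption about how such cases might be handled.

```python
import numpy as np

# Octave positions of the six tones relative to 1 kHz (0.25 ... 8 kHz).
OCTAVES = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])

def frequency_index(beta):
    """Centre of mass of the squared responses on a log2 frequency axis."""
    power = np.asarray(beta, dtype=float) ** 2
    total = power.sum()
    if total == 0.0:
        return float("nan")  # assumption: index is undefined without any response
    return float((OCTAVES * power).sum() / total)

def index_to_khz(f):
    """Convert an octave index back to kHz (0 octaves corresponds to 1 kHz)."""
    return 2.0 ** f

flat = frequency_index([1, 1, 1, 1, 1, 1])  # equal responses to all tones
low = frequency_index([1, 0, 0, 0, 0, 0])   # only the 0.25-kHz tone responds
```

Because the responses are squared, negative and positive response estimates of equal magnitude contribute equally to the index.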

The strength and direction of the local tonotopic organisation, as expressed by the gradient of the frequency index g = ∇f, was subsequently determined on the basis of the (i) volumetric and (ii) surface-based subject alignments.

  1. Volumetric: This analysis was based on the volumetric whole-brain alignment (as determined by SPM8b) and was carried out voxel by voxel. A three-dimensional tonotopic gradient vector was calculated by taking into consideration the frequency indices (fk) and rectilinear MNI-coordinates (xk, yk, zk) of a central voxel (k = 0) and all its face-neighbours (Fig. 2a). This typically comprised the data of seven voxels (but possibly fewer for voxels at the edge of the brain). The frequency indices were fitted with respect to the three coordinates by means of least-squares regression using a linear equation
    $$f_k = c + g_x x_k + g_y y_k + g_z z_k + \varepsilon_k \tag{2a}$$
    with intercept c and residuals εk. The resulting regression coefficients formed an estimate of the local gradient g ≡ (∂f/∂x, ∂f/∂y, ∂f/∂z) = (gx, gy, gz).
  2. Surface-based: This analysis was based on the surface-based cortical alignment (as determined by FreeSurfer) and was carried out vertex by vertex. A two-dimensional tonotopic gradient vector was calculated by taking into consideration the frequency indices (fk) and curvilinear spherical coordinates (uk, vk) of a central vertex (k = 0) and all its edge-neighbours (Fig. 2b). Again, this typically comprised the data of seven vertices (but possibly fewer for vertices in 12 icosahedral positions per hemisphere). The coordinates u and v were measured along perpendicular axes that were locally tangent to the sphere in the latitudinal and longitudinal directions, respectively; this coordinate system was therefore defined separately for each vertex. The frequency indices were fitted with respect to the two coordinates by means of least-squares regression using a linear equation
    $$f_k = c + g_u u_k + g_v v_k + \varepsilon_k \tag{2b}$$
    with intercept c and residuals εk. The resulting regression coefficients formed an estimate of the local gradient g ≡ (∂f/∂u, ∂f/∂v) = (gu, gv).
Figure 2.

(a) The anatomical image volumes of subjects were registered by normalising them to a standard stereotaxic space. In the resulting cubic grid, each voxel typically has six face-neighbours that are taken into account to determine three-dimensional gradients regarding stimulus preference. (b) In parallel, cortical surfaces were determined for all subjects and registered on the basis of their gyration patterns. In the resulting triangular mesh, each vertex typically has six edge-neighbours that are taken into account to determine two-dimensional gradients regarding stimulus preference. (c,d) Flow charts summarising the volumetric and surface-based analyses, respectively. In the volumetric approach, analyses are carried out voxel-by-voxel, and the results are projected on a group cortical surface for visualisation purposes only; in the surface-based approach, data are sampled on individual cortical surfaces in an early stage, and analyses are carried out vertex-by-vertex.
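
The local gradient fits described for the volumetric and surface-based analyses can be sketched with an ordinary least-squares solve. This is a minimal illustration rather than the study's actual code: the function name `local_gradient`, the explicit intercept column and the example values are assumptions.

```python
import numpy as np

def local_gradient(coords, f):
    """Least-squares fit of f_k = c + g . x_k; returns the gradient estimate g.

    coords: (K, D) neighbour coordinates (D = 3 volumetric, D = 2 surface-based).
    f:      (K,) frequency indices at those positions.
    """
    coords = np.asarray(coords, dtype=float)
    f = np.asarray(f, dtype=float)
    design = np.column_stack([np.ones(len(f)), coords])  # intercept + coordinates
    coef, *_ = np.linalg.lstsq(design, f, rcond=None)
    return coef[1:]  # discard the intercept, keep the slopes

# Central voxel plus its six face-neighbours on a 2-mm grid (illustrative).
offsets = 2.0 * np.array([[0, 0, 0], [1, 0, 0], [-1, 0, 0],
                          [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]])
true_g = np.array([0.10, -0.05, 0.02])   # octaves per mm, made-up values
f_values = 0.3 + offsets @ true_g        # noiseless linear frequency field
g_hat = local_gradient(offsets, f_values)
```

For a noiseless linear field the fit recovers the gradient exactly. The same function handles the surface-based case when it is given two-column (u, v) coordinates.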

Statistical Assessment

To assess whether a voxel's or vertex's mean response b significantly differed from zero at the group level, a t-value was calculated and tested against a Student's t-distribution with N − 1 degrees of freedom

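$$t = \frac{\bar{b}}{\sqrt{s^2 / N}} \sim t_{N-1} \tag{3}$$

using sample mean $\bar{b} = \frac{1}{N}\sum_{n=1}^{N} b_n$ and variance $s^2 = \frac{1}{N-1}\sum_{n=1}^{N} (b_n - \bar{b})^2$. Equivalently, the square of the t-value can be tested against a Fisher–Snedecor F-distribution

$$t^2 = N\,\bar{b}\,(s^2)^{-1}\,\bar{b} \sim F_{1,\,N-1} \tag{4}$$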

Similarly, to assess whether the mean frequency indices significantly differed between the two groups of subjects, independent samples t-tests were performed

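$$t = \frac{\bar{f}_1 - \bar{f}_2}{\sqrt{s_p^2 \left( \frac{1}{N_1} + \frac{1}{N_2} \right)}} \sim t_{N_1 + N_2 - 2} \tag{5}$$

using the pooled variance estimate $s_p^2 = \frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{N_1 + N_2 - 2}$, where the subscripts 1 and 2 denote the two subject groups.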

These formulas, stated here in full, allow a straightforward generalisation to be made to test multivariate vectors instead of univariate scalars against the null hypothesis. Namely, to test whether the observed D-dimensional tonotopic gradient vectors gn significantly differ from the null vector (where D = 3 for volumetric gradients and D = 2 for surface-based gradients), a T²-statistic may be compared against Hotelling's T²-distribution [Hotelling, 1931]

$$T^2 = N\,\bar{\mathbf{g}}^{\mathsf{T}}\,\mathbf{S}^{-1}\,\bar{\mathbf{g}} \sim T^2_{D,\,N-1} \tag{6}$$

using sample mean $\bar{\mathbf{g}} = \frac{1}{N}\sum_{n=1}^{N} \mathbf{g}_n$ and covariance matrix $\mathbf{S} = \frac{1}{N-1}\sum_{n=1}^{N} (\mathbf{g}_n - \bar{\mathbf{g}})(\mathbf{g}_n - \bar{\mathbf{g}})^{\mathsf{T}}$.
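
A one-sample Hotelling test of this kind can be sketched as follows. The function name and the demonstration data are hypothetical, and the conversion to an F-statistic uses the standard relation between Hotelling's T² and the F-distribution; p-values are omitted to keep the example dependent on NumPy only.

```python
import numpy as np

def hotelling_one_sample(g):
    """T^2 for H0: mean vector equals the null vector.

    g: (N, D) array with one D-dimensional gradient vector per subject.
    Returns (T2, F, df1, df2), where F follows F(df1, df2) under H0.
    """
    g = np.asarray(g, dtype=float)
    n, d = g.shape
    mean = g.mean(axis=0)
    cov = np.atleast_2d(np.cov(g, rowvar=False, ddof=1))  # unbiased covariance
    t2 = float(n * mean @ np.linalg.solve(cov, mean))
    f = (n - d) / (d * (n - 1)) * t2                       # standard T^2-to-F scaling
    return t2, f, d, n - d

# Four made-up two-dimensional gradient vectors.
g_demo = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
t2_demo, f_demo, df1_demo, df2_demo = hotelling_one_sample(g_demo)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse covariance matrix explicitly, which is usually the numerically safer choice.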

To test whether observations differ between subgroups, an analogous multivariate generalisation of the independent samples t-test can be used

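$$T^2 = \frac{N_1 N_2}{N_1 + N_2}\,(\bar{\mathbf{g}}_1 - \bar{\mathbf{g}}_2)^{\mathsf{T}}\,\mathbf{S}_p^{-1}\,(\bar{\mathbf{g}}_1 - \bar{\mathbf{g}}_2) \sim T^2_{D,\,N_1 + N_2 - 2} \tag{7}$$

using the pooled covariance matrix estimate $\mathbf{S}_p = \frac{(N_1 - 1)\,\mathbf{S}_1 + (N_2 - 1)\,\mathbf{S}_2}{N_1 + N_2 - 2}$, where the subscripts 1 and 2 denote the two subject groups.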
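
The two-sample variant can be sketched in the same way. As before, the function name and the example data are hypothetical, and the F conversion follows the standard relation for Hotelling's T² with N1 + N2 − 2 degrees of freedom.

```python
import numpy as np

def hotelling_two_sample(g1, g2):
    """Two-sample Hotelling T^2 with a pooled covariance estimate.

    g1, g2: (N1, D) and (N2, D) arrays of gradient vectors per group.
    Returns (T2, F, df1, df2), where F follows F(df1, df2) under H0.
    """
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    n1, d = g1.shape
    n2 = g2.shape[0]
    diff = g1.mean(axis=0) - g2.mean(axis=0)
    s1 = np.atleast_2d(np.cov(g1, rowvar=False, ddof=1))
    s2 = np.atleast_2d(np.cov(g2, rowvar=False, ddof=1))
    nu = n1 + n2 - 2
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / nu
    t2 = float(n1 * n2 / (n1 + n2) * diff @ np.linalg.solve(pooled, diff))
    f = (nu - d + 1) / (d * nu) * t2
    return t2, f, d, nu - d + 1

# Made-up groups: the second is the first shifted by one unit along the first axis.
group_a = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
group_b = group_a + np.array([1.0, 0.0])
t2_ab, f_ab, df1_ab, df2_ab = hotelling_two_sample(group_a, group_b)
```

Pooling assumes that both groups share a common covariance structure, which mirrors the equal-variance assumption of the univariate independent samples t-test.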

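Hotelling's T²-distribution can be related to the well-characterised and perhaps more familiar Fisher–Snedecor F-distribution through the correspondence

$$\frac{\nu - D + 1}{D\,\nu}\,T^2_{D,\,\nu} \sim F_{D,\,\nu - D + 1} \tag{8}$$

where ν denotes the degrees of freedom (N − 1 for the one-sample test and N1 + N2 − 2 for the two-sample test).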

For D = 1, the multivariate statistics straightforwardly reduce to their univariate counterparts (compare Eqs. (4) and (6), or Eqs. (5) and (7)).
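
The reduction to the univariate case can be checked numerically. The sketch below uses made-up scalar responses and computes the same quantity along both routes; the variable names are illustrative only.

```python
import numpy as np

b = np.array([0.8, 1.1, 0.4, 1.5, 0.9])  # made-up scalar responses
n = b.size

# Univariate route: t = mean / sqrt(s^2 / N)
t = b.mean() / np.sqrt(b.var(ddof=1) / n)

# Multivariate route with D = 1: T^2 = N * mean' S^-1 mean
g = b.reshape(-1, 1)
mean = g.mean(axis=0)
cov = np.atleast_2d(np.cov(g, rowvar=False, ddof=1))
t2 = float(n * mean @ np.linalg.solve(cov, mean))

# For D = 1 the T^2-to-F scaling factor (nu - D + 1) / (D * nu) equals 1,
# so T^2 is referred to F(1, N - 1), exactly like t^2.
nu, d = n - 1, 1
scale = (nu - d + 1) / (d * nu)
```

Both routes yield the same number up to rounding, which is the sense in which the multivariate statistic generalises the squared t-value.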

An important property of T² is that it is invariant under all (invertible) linear transformations of the observations gn, just like a univariate t-value is insensitive to the scaling of all observations by any (non-zero) factor. This means that the outcome does not depend on the coordinate system that the observations happen to be expressed in. In this context, as long as the cortical surface is finely and smoothly tessellated, the derived statistics will be independent of the particular parametrisation and embedding of the employed mesh.
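
The invariance property can be demonstrated on simulated data. In the sketch below, the gradient vectors, the random seed and the transformation matrix A are arbitrary illustrative choices.

```python
import numpy as np

def t2_stat(g):
    """One-sample Hotelling T^2 against the null vector."""
    n = g.shape[0]
    mean = g.mean(axis=0)
    cov = np.cov(g, rowvar=False, ddof=1)
    return float(n * mean @ np.linalg.solve(cov, mean))

rng = np.random.default_rng(0)
g = rng.normal(size=(40, 2)) + np.array([0.3, -0.1])  # simulated 2-D gradients

A = np.array([[2.0, 1.0],
              [0.5, 3.0]])          # invertible (determinant 5.5)
g_transformed = g @ A.T             # apply A to every observation

t2_original = t2_stat(g)
t2_transformed = t2_stat(g_transformed)
```

The mean transforms to A·ḡ and the covariance to A·S·Aᵀ, so the factors of A cancel in the quadratic form and the two statistics agree up to floating-point error.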

For visualisation purposes, and to facilitate the comparison with surface-based outcomes, the end results of the volumetric analyses were finally re-sampled at the vertex locations of a single group-average cortical surface that was computed by averaging the MNI-coordinates of corresponding vertices across all subjects. Figure 2c,d shows the conceptual workflow for the employed volumetric and surface-based analyses, respectively.


Tonotopic maps were constructed for all individual subjects based on Eq. (1). Brain regions that were responsive to sound were determined by testing the significance of the mean activation level b across all six stimulus frequencies using a lenient criterion P < 0.05, uncorrected for multiple comparisons. Figure 3 shows the frequency indices f across all suprathreshold vertices, projected on an inflated section of cortex corresponding with the superior surface of the temporal lobe; insets show a semi-inflated view of the entire hemisphere. A representative sample is shown, consisting of the first and the last included subjects in the CON and TIN groups; similar figures for all 40 subjects are shown in Supporting Information Figures 1 and 2 for the volumetric and surface-based analyses, respectively. In spite of some inter-subject variability and ‘missing data’ where activation failed to reach significance, the emerging tonotopic patterns were reasonably consistent. Low frequencies were virtually always observed to be expressed on the lateral side of the superior temporal gyrus (STG), adjacent to the superior temporal sulcus (STS). These low-frequency representations typically extended further inward into the Sylvian fissure at a number of sites, in particular near HG and often again on the planum temporale (PT). At the same time, high frequencies were found more medially on the superior surface of the temporal lobe, in the depth of the Sylvian fissure. Often, two regions appeared to occur, one anterior to HG in the circular sulcus (CiS) and one posterior to HG in medial Heschl's sulcus (HS), separated by representations of more moderate frequencies on medial HG itself. In some subjects (e.g., TIN-4502, in the bottom panels of Fig. 3), this led to a striking alternation of high-to-low-to-high-to-low frequencies when traversing the auditory cortex from the planum polare (PP) to the temporoparietal junction (TPJ), with the most pronounced low-frequency representations typically being offset towards the lateral side, and high-frequency representations being offset medially. In other subjects, this inter-digitated pattern was partially recognisable.

Figure 3.

Individual tonotopic maps depicting the spatial distribution of frequency indices f (converted to kHz) in four individual subjects according to the volumetric (left) and surface-based (right) analyses. The top two subjects belonged to the control group (CON); the bottom two are tinnitus patients (TIN). Supporting Information Figures 1 and 2 show similar maps for all 40 subjects according to the volumetric and surface-based analyses, respectively. Results from all vertices that showed significant sound-evoked responses (P < 0.05, uncorrected) are colour-mapped on flattened cortical representations, showing the individual gyral morphology (light, gyri; dark, sulci). Insets show a complete semi-inflated hemisphere. Despite some inter-individual variations, low frequencies tend to be represented in lateral HG and superior temporal cortex, whereas high frequencies tend to occur more medially, rostral and caudal to HG. Note that although individual cortical features are shown in all panels to facilitate comparisons, the volumetric results were actually obtained using a fixed group-average surface.

Group-level analyses were subsequently performed. First, the significance of the mean response level b across all subjects was assessed, and a fixed region of interest (ROI) was defined containing all vertices that satisfied a lenient uncorrected criterion P < 0.05 according to the volumetric and surface-based analyses simultaneously. A total of 19,750 vertices were identified, comprising two large clusters that covered bilateral auditory cortex, plus a number of much smaller foci elsewhere. This ROI was used as a mask in all further analyses. Figure 4 shows the volumetric and surface-based group-level results side by side.

Figure 4.

Group-level outcomes based on volumetric (left) and surface-based (right) registration. The results of a supplementary surface-based analysis with 8-mm smoothing are shown in Supporting Information Figure 3. (a) The mean activation level b was determined by averaging the responses to all six stimulus frequencies, expressed as a percentage signal change relative to the silent baseline. The strongest responses were found on the caudal side of HG. (b) The group-level significance of sound-evoked responses was assessed using conventional mass-univariate statistics. (c) Frequency tuning as expressed by the frequency index f (converted to kHz) gradually varied across the sound-activated regions of auditory cortex. Low frequencies were represented laterally on HG and STG, whereas high frequencies were represented medially on the rostral and caudal banks of HG and PT. (d) Tonotopic gradients were determined in appropriate coordinate systems in the axial plane (x, y) for the volumetric analysis or in a plane locally tangent to the sphere (u, v) for the surface-based analysis, and their direction was displayed by means of a colour-code. Homogeneous strips of cortex were found to exist, separated by relatively sharp transitions in gradient direction. (e) The group-level significance of the local tonotopic gradient was tested against the null-vector using mass-multivariate statistics. Multiple coherent tonotopic patches could be distinguished.

Figure 4a shows the magnitude of the sound-evoked activation as quantified by the mean activation level b. Group-level activation was found to peak in HS or on the posterior flank of HG. Secondary maxima were present on anterior HG and on PT. Although the general spatial distribution of the activation was similar across hemispheres and methods, the surface-based analysis was found to result in notably higher overall activation levels than the volumetric analysis. This is further shown in Figure 5a which plots the activation levels according to both methods against each other. At the same time, the corresponding statistical significances were quite comparable, as shown in similar fashion in Figures 4b and 5b.

Figure 5.

The outcomes of the volumetric and surface-based analyses (Figure 4) were directly compared. Each point corresponds with the data from a single vertex on the cortical surface. The panels show: (a) mean activation levels (Fig. 4a); (b) the significance of sound-evoked activation (Fig. 4b); (c) frequency indices (Fig. 4c); and, (d) the significance of local tonotopic gradients (Fig. 4e).

In Figure 4c, the frequency indices f are plotted. A smoothly varying frequency tuning was observed, perhaps even more so for the volumetric than for the surface-based method. In all images, low-frequency tuning occurred near the lateral crest of HG and to a lesser extent posteriorly along STG; high-frequency tuning was found rostrally in CiS as well as caudally in medial PT. The volumetric and surface-based results were in good qualitative agreement. Quantitatively, the most pronounced frequency tuning appeared to be visible in the low- and high-frequency endpoints of the surface-based outcomes, but a pairwise comparison (Fig. 5c) showed that both methods resulted in comparable results overall.

Figure 4d shows the direction of the tonotopic gradient vector, expressed as a polar angle φ. For the surface-based data, the direction on the registered sphere (corresponding with the frequency map in Fig. 4c) was colour-coded. For the volumetric analysis, the direction of the gradient's component in the axial plane was used. Furthermore, it is important to note that a fixed colour-code was used for the left and right hemisphere, whereas structurally these form more or less each other's mirror image (roughly flipping the volumetric x-axis or the surface-based v-axis). For these reasons, colours in these various images are non-trivially related, and pairwise comparisons were not made. Yet, in spite of some local variations, which occurred especially in the surface-based analysis and towards the edge of the cluster, a reliable large-scale pattern could be discerned in both hemispheres for both methods. A more or less homogeneous gradient that pointed in a posterolateral-to-anteromedial direction (for low-to-high frequencies) was observable in a strip of cortex consisting of the rostral two-thirds of HG and the adjacent part of CiS. Based on the gradient direction, this was sharply demarcated from a second strip on the caudal side of HG, extending into HS and PT, in which the gradient pointed in an anterior-to-posterior direction. One or two more regions separated by sharp gradient reversals existed towards TPJ, but these could be less confidently identified. The surface-based analysis suggested the existence of another area that bordered these strips laterally, in which the tonotopic gradient curved across STG from a medial-to-lateral to a superior-to-inferior direction, ending at STS.

The significance of the deviation of the mean tonotopic gradient over all subjects from the null-vector was assessed using Hotelling's T²-statistic. Results were thresholded at P < 0.01 and are shown in Figure 4e. Coherent significant clusters were found especially using the volumetric method. On the basis of a false discovery rate (FDR) criterion q < 0.05 [Benjamini and Hochberg, 1995], the corrected thresholds for significance equalled P < 0.0078 and P < 0.0017 for the volumetric and surface-based analyses, respectively; these were satisfied by 3,082 and 670 vertices in the ROI. Typically, one large cluster was observed to extend along a substantial part of HG, positioned more or less on its crest or slightly on its anterior flank. A second cluster was found in parallel in HS, immediately posterior to HG. The exact location of these clusters corresponded only approximately between the volumetric and the surface-based methods, as became evident from the paired comparison in Figure 5d, which shows a relatively poor correlation. Still, the number and general location of these clusters agreed with the strips of cortex that were described in relation to the gradient direction shown in Figure 4d. Some further maxima were found on left PT and near the border between right STG and STS, but these were less significant and appeared more dispersed.
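As an illustration of the test described above, the following minimal Python sketch computes a one-sample Hotelling's T² statistic and its P-value for the per-subject gradient vectors at a single vertex. It assumes two-dimensional gradient vectors (as on the registered sphere or in the axial plane) and multivariate normality; the function name and the example data are hypothetical and are not taken from the analysis code of this study.

```python
def hotelling_t2_one_sample(vectors):
    """One-sample Hotelling's T^2 test of 2-D vectors against the null vector.

    vectors: list of (gx, gy) tuples, one per subject (at least 3 subjects).
    Returns (t2, p_value).
    """
    n = len(vectors)
    if n < 3:
        raise ValueError("need at least 3 subjects for 2-D vectors")
    mx = sum(v[0] for v in vectors) / n
    my = sum(v[1] for v in vectors) / n
    # Unbiased sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((v[0] - mx) ** 2 for v in vectors) / (n - 1)
    syy = sum((v[1] - my) ** 2 for v in vectors) / (n - 1)
    sxy = sum((v[0] - mx) * (v[1] - my) for v in vectors) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    # T^2 = n * mean' * inv(S) * mean, with the 2x2 inverse written out.
    t2 = n * (syy * mx * mx - 2.0 * sxy * mx * my + sxx * my * my) / det
    # For 2-D vectors, F = (n - 2) / (2 (n - 1)) * T^2 follows F(2, n - 2)
    # under the null hypothesis; its upper tail has this closed form.
    p_value = (1.0 + t2 / (n - 1)) ** (-(n - 2) / 2.0)
    return t2, p_value


# Hypothetical gradient vectors of four subjects at one vertex.
subjects = [(1.0, 0.0), (1.0, 1.0), (2.0, 0.0), (2.0, 1.0)]
t2, p = hotelling_t2_one_sample(subjects)
print(t2, p)
```

Because the null distribution for two-dimensional vectors reduces to F(2, n − 2), the P-value can be written in closed form and no statistics library is needed. For three-dimensional vectors the general F(p, n − p) distribution would be required instead.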

The surface-based results had a noisier appearance than the volumetric ones, in particular for the frequency-related outcomes shown in Figure 4c–e. To explore whether this could be attributed to the fact that a two-dimensional disc-shaped smoothing kernel has a smaller measure than a three-dimensional spherical smoothing kernel of the same diameter, the same analysis was repeated with an increased kernel size (8-mm FWHM) for the surface-based analysis only. The results are shown in Supporting Information Figure 3. As compared to the surface-based analysis in Figure 4, the mean activation moderately decreased, but remained notably stronger than that in the volumetric analysis. At the same time, the corresponding significance levels increased, now typically exceeding those of the volumetric analysis. Apart from a slight reduction in the apparent amount of noise in the frequency tuning maps and the gradient direction maps, the overall tonotopic organisation remained the same. The most striking differences were observed for the significance of the tonotopic gradients, which increased to a comparable level and developed into a similarly coherent pattern as for the volumetric analysis (2,393 vertices satisfied the FDR-corrected threshold P < 0.0062).
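The FDR-corrected thresholds reported in this section follow from the step-up procedure of Benjamini and Hochberg [1995]. A minimal sketch is given below, assuming a plain list of uncorrected P-values with one entry per vertex or voxel in the ROI; the function name and the example values are hypothetical.

```python
def bh_threshold(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns (threshold, count): the largest sorted P-value p_(k) that
    satisfies p_(k) <= q * k / m, and the number k of tests at or below it.
    Returns (0.0, 0) when no test survives.
    """
    m = len(p_values)
    threshold, count = 0.0, 0
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= q * k / m:
            threshold, count = p, k
    return threshold, count


# Hypothetical uncorrected P-values for a ten-vertex ROI.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(bh_threshold(example, q=0.05))
```

Because the resulting threshold lies close to q·k/m, with k the number of surviving tests and m the size of the ROI, the ratio of threshold to surviving vertex count should be similar for analyses of equally sized ROIs. The reported pairs (0.0078 and 3,082; 0.0017 and 670; 0.0062 and 2,393) all give roughly 2.5 to 2.6 × 10⁻⁶, which is consistent with a common ROI size.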

Finally, differences between the tinnitus patients and the control subjects were assessed as shown in Figure 6. These included differences with regard to the mean activation level b (Fig. 6a), the frequency index f (Fig. 6b) and the tonotopic gradient g (Fig. 6c). Although weak scattered foci were observed that exceeded the imposed visualisation threshold of P < 0.01, neither the volumetric nor the surface-based analyses revealed significant effects after FDR-correction for multiple comparisons. (No effects survived according to the supplementary 8-mm FWHM surface-based analysis either.)
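For the comparison of gradient vectors between two groups, a natural multivariate analogue of the two-sample t-test is the two-sample Hotelling's T² test. The text above does not spell out which test was applied, so the sketch below is an assumption; it further assumes two-dimensional gradient vectors, a covariance matrix shared by both groups, and hypothetical function names and data.

```python
def _mean_and_scatter(vectors):
    """Return the mean vector and the sums of squares and cross-products."""
    n = len(vectors)
    mx = sum(v[0] for v in vectors) / n
    my = sum(v[1] for v in vectors) / n
    sxx = sum((v[0] - mx) ** 2 for v in vectors)
    syy = sum((v[1] - my) ** 2 for v in vectors)
    sxy = sum((v[0] - mx) * (v[1] - my) for v in vectors)
    return (mx, my), (sxx, sxy, syy)


def hotelling_t2_two_sample(group_a, group_b):
    """Two-sample Hotelling's T^2 test for 2-D vectors (pooled covariance).

    Returns (t2, p_value).
    """
    n1, n2 = len(group_a), len(group_b)
    dof = n1 + n2 - 2
    if n1 < 1 or n2 < 1 or dof < 2:
        raise ValueError("need at least 4 subjects in total")
    (ax, ay), (a_xx, a_xy, a_yy) = _mean_and_scatter(group_a)
    (bx, by), (b_xx, b_xy, b_yy) = _mean_and_scatter(group_b)
    # Pooled covariance matrix of the two groups.
    sxx = (a_xx + b_xx) / dof
    syy = (a_yy + b_yy) / dof
    sxy = (a_xy + b_xy) / dof
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("pooled covariance matrix is singular")
    dx, dy = ax - bx, ay - by
    scale = n1 * n2 / (n1 + n2)
    t2 = scale * (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    # For 2-D vectors, F = (dof - 1) / (2 dof) * T^2 follows F(2, dof - 1)
    # under the null hypothesis; its upper tail has this closed form.
    p_value = (1.0 + t2 / dof) ** (-(dof - 1) / 2.0)
    return t2, p_value


# Hypothetical gradient vectors at one vertex for controls and patients.
controls = [(1.0, 0.0), (1.0, 1.0), (2.0, 0.0), (2.0, 1.0)]
patients = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print(hotelling_t2_two_sample(controls, patients))
```

Applied at every vertex, the resulting P-values can be corrected for multiple comparisons in the same way as the single-group maps.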

Figure 6.

The results from the normal hearing controls (CON) and tinnitus patients (TIN) were compared and statistically assessed at the group level. Panels show the significance of group differences regarding (a) mean activation levels b, (b) frequency indices f and (c) gradient vectors g. Except for isolated focal effects that were not significant after FDR correction for multiple comparisons, no differences were observed. (A supplementary surface-based analysis with 8-mm smoothing (data not shown) similarly did not reveal any significant effects.)

DISCUSSION


In this article, the frequency tuning in human auditory cortex was studied by generating maps that depict the frequency preference of cortical sites in a large number of individual subjects. Group-level averages were constructed on the basis of volumetric and surface-based cortical registrations, and the tonotopic organisation was further explored through maps depicting gradient direction. The statistical significance of activation levels and gradient vectors was assessed and subsequently compared between subgroups of normal hearing subjects and tinnitus patients. In addition, local frequency tuning was compared directly between these groups. Using both volumetric and surface-based analyses, similar evidence for multiple tonotopically organised auditory fields was found, but no differences between tinnitus patients and controls could be detected.

Frequency Tuning Measures

Tonotopic mapping crucially relies on the identification of a ‘best frequency’. In invasive electrophysiological studies, the characteristic frequency is employed, which is defined as that frequency at which a neuron displays the most sensitive threshold. Although this definition serves as a gold standard, it is poorly applicable to non-invasive neuroimaging studies for several reasons.

In fMRI, it is typically not feasible to change stimulus intensities adaptively based on whether a significant suprathreshold response is observed or not. Although real-time fMRI could, in principle, be applied [deCharms, 2008], convergence to a threshold is slow, especially if numerous frequencies need to be assessed using sparse imaging. To overcome this, a best frequency is often defined as that frequency at which a cortical site shows the strongest response to a fixed stimulus intensity. Though both measures must necessarily converge at low stimulation levels, tuning curves typically show an asymmetric broadening at the higher levels that are used in fMRI. Also, complex response behaviours (e.g., bimodal or non-monotonic responses) complicate the interpretation of best-frequency maps [Sutter, 2000]. These problems are further aggravated by the limited sensitivity of fMRI, combined with the noisy scanner environment, which necessitate the use of moderate to high sound levels in order to generate a reliably detectable response. Nevertheless, relative tuning would be expected to be reasonably conserved; that is, sites with higher best frequencies would contain neurons with higher characteristic frequencies on average. In this study, relatively low-level stimuli were employed in an effort to keep the broadening of tuning curves and complex non-linear response behaviour to a minimum.

Furthermore, as the number of different stimuli that can be incorporated in a conventional block- or event-related fMRI design is limited, the frequency resolution that can be achieved remains crude. Although a spacing in the order of a few octaves suffices to identify low- and high-frequency tonotopic endpoints, this may not resolve plastic shifts in tuning frequency, for example. The computation of a tonotopic gradient in particular is impractical for discrete frequency measures, because it comprises a derivative. Various solutions have been proposed that mostly rely on some parametric response model. These comprise the non-linear fitting of unimodal (e.g., Gaussian) functions [Moerel et al., 2012], Fourier analysis [Striem-Amit et al., 2011; Talavage et al., 2004] or principal component analysis [Langers and van Dijk, 2012]. An advantage of these methods is that a continuous correlate of frequency tuning is obtained instead of a discrete best frequency. Additionally, these measures can be more robust by relying on the measured responses to many frequencies simultaneously, whereas a best frequency measure disregards informative differences between all frequencies except the one with the strongest response.

The frequency index f that was introduced here can be interpreted as an average across the presentation frequencies, weighted by a function of the corresponding responses. A related approach was previously employed by Humphries et al. [2010]. Equation (1) comprises the ratio between two contrasts involving response magnitudes. Unlike traditional linear fMRI contrasts, response levels were squared to avoid negative weights. As a result, the index is a well-behaved (i.e., continuous, differentiable) function of the response levels βk. This is a desirable property for the computation of gradients that is not equally satisfied for most other measures. It is worth pointing out that for large exponents of βk the outcome converges to the best frequency exactly, whereas for a unit exponent a measure is obtained that effectively resembles what was previously derived in data-driven fashion [Langers and van Dijk, 2012]; the present use of squares can be regarded as a compromise between these two extremes. It is important to note that the employed measure has a tendency to shift best frequencies towards the middle of the employed frequency range when the signal-to-noise ratio is poor. Other measures may tend towards an unpredictable random outcome under such circumstances; this would result in a comparable shift towards moderate frequencies when averaging across subjects.
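The behaviour of such a response-weighted index can be illustrated with a short Python sketch. Equation (1) is not reproduced in this section, so the form used below, f = Σ k·βk^p / Σ βk^p with p = 2 and k the rank of the stimulus frequency, is an assumption based on the verbal description above; the six response values are hypothetical.

```python
def frequency_index(betas, exponent=2):
    """Response-weighted mean of the stimulus ranks 1..K.

    betas: response levels beta_k, ordered from lowest to highest frequency.
    exponent: power applied to each response; even exponents guarantee
    non-negative weights.
    """
    weights = [b ** exponent for b in betas]
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero; index is undefined")
    return sum(k * w for k, w in enumerate(weights, start=1)) / total


# Hypothetical responses of one site to six stimulus frequencies.
tuned = [1.0, 2.0, 4.0, 2.0, 1.0, 0.5]
flat = [1.0] * 6  # an unselective site

print(frequency_index(tuned, exponent=1))   # linear weighting
print(frequency_index(tuned, exponent=2))   # squared weighting
print(frequency_index(tuned, exponent=50))  # approaches the best frequency
print(frequency_index(flat))                # middle of the frequency range
```

With a large exponent the index approaches the rank of the strongest response (here the third frequency), whereas an unselective response profile yields the midpoint of the range. The latter mirrors the tendency towards moderate frequencies at poor signal-to-noise ratios that is noted above.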

Frequency gradients were determined, and their direction visualised by means of a circular colour-code. For the surface-based analysis, a natural coordinate system on the sphere was used. The idea to map gradient direction on the cortical surface is not new, although existing studies derived discrete maps by labelling gradients either as parallel or as anti-parallel to a predefined axis of interest [Formisano et al., 2003; Moerel et al., 2012]. The current approach has the advantage that no arbitrary reference direction is required. In the volumetric analysis, gradients were restricted to the axial plane; any component in the z-direction was therefore disregarded. Although this introduces a similarly arbitrary reference plane, this choice avoids the need for a complex multidimensional colour-code that would be difficult to interpret. The axial plane was chosen because the superior temporal surface extends mostly in the x- and y-directions. Similar direction maps were generated earlier [Langers and van Dijk, 2012]. A subtle difference was that this previous report mapped the gradient of a single group-level frequency map, whereas currently the gradients were determined in individuals and subsequently averaged. The current approach has the advantage that mass-multivariate statistics can be performed on the gradient vectors. Until now, this has never been attempted for any topographic map in the brain. Importantly, tonotopic gradient statistics do not assess the local frequency tuning itself, but test the existence of a systematic spatial pattern therein. These are complementary outcomes: a significant difference in frequency tuning does not imply a significant difference in tonotopic gradient, and vice versa.
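The order of operations described above (gradients per individual, then averaging, then conversion to a direction) can be sketched as follows. This is a minimal illustration on a regular two-dimensional grid using central differences; the grid layout, the function names and the toy maps are hypothetical and do not reproduce the spherical or volumetric implementation used in the study.

```python
import math


def gradient_at(fmap, i, j, spacing=1.0):
    """Central-difference gradient (d/dx, d/dy) at interior point (row i, col j).

    fmap: 2-D list of frequency indices; x runs along columns, y along rows.
    """
    gx = (fmap[i][j + 1] - fmap[i][j - 1]) / (2.0 * spacing)
    gy = (fmap[i + 1][j] - fmap[i - 1][j]) / (2.0 * spacing)
    return gx, gy


def mean_vector(vectors):
    """Component-wise average of a list of 2-D vectors."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def polar_angle(vector):
    """Direction of a 2-D vector in radians, in the range (-pi, pi]."""
    return math.atan2(vector[1], vector[0])


# Toy frequency maps of two subjects: one increases along x, one along y.
subject_a = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
subject_b = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]

gradients = [gradient_at(m, 1, 1) for m in (subject_a, subject_b)]
group_gradient = mean_vector(gradients)
print(group_gradient, polar_angle(group_gradient))
```

Averaging the vectors rather than the angles keeps the per-subject gradient magnitudes in play and yields a set of vectors on which multivariate statistics can be computed. Mirroring the x-axis maps a direction φ onto π − φ, which is why a fixed colour-code is not directly comparable between the two hemispheres.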

Tonotopic Organisation

In the last decade, a prevalent view has been that human auditory cortex contains at least two separate tonotopic maps that extend more or less along the main axis of HG. This view was first substantiated in humans by Formisano et al. [2003], who measured frequency preferences along HG and reported a signature high-to-low-to-high profile, as similarly reported for primate cortex [Bendor and Wang, 2005]. The existence of two such abutting gradients was subsequently confirmed by a number of other studies [Moerel et al., 2012; Riecke et al., 2007; Seifritz et al., 2006; Upadhyay et al., 2007]. Interestingly, although a number of recent studies reported similar high-to-low-to-high tonotopic progressions, these were found to be oriented more or less perpendicularly across HG [Da Costa et al., 2011; Humphries et al., 2010; Striem-Amit et al., 2011]. Therefore, two conflicting views are currently held and the issue remains unresolved.

The key results of this study are shown in Figure 7 in relation to the underlying morphology and the apparent parcellation into fields. Both the individual and the group results revealed two high-frequency regions near the medial root of HG: an extensive elongated one on its rostral side in CiS and another smaller one caudally in HS and PT. In between, on HG itself, lower frequencies were represented, culminating in a low-frequency endpoint near its lateral end. This suggests that two distinct tonotopic progressions exist, set at an angle and abutting at their low-frequency endpoint. The gradient direction and significance maps further substantiate the existence of two distinct fields: one labelled rHG, on the crest of HG and extending along its rostral bank into CiS, featuring a low-to-high gradient vector pointing in an (antero)medial direction; and another labelled cHG, on the caudal bank of HG and extending into HS, with a gradient pointing caudally. Gradients were homogeneously oriented within these fields, but showed a sharp transition where they met. The two hemispheres showed highly comparable mirrored patterns.

Figure 7.

A three-dimensional rendering of key results on the group-average semi-inflated temporal lobe surface. Results were copied from the surface-based analysis with 5-mm smoothing, employing colour-codes identical to those shown in Figure 4a,c,d. Notable morphological features include the planum polare (PP), circular sulcus (CiS), Heschl's gyrus (HG), Heschl's sulcus (HS), planum temporale (PT), temporoparietal junction (TPJ), superior temporal gyrus (STG), and superior temporal sulcus (STS). The superimposed dashed lines delineate the approximate outline of two fields on rostral (rHG) and caudal (cHG) HG that could be clearly distinguished on the basis of the tonotopic organisation. Evidence for at least one additional field, labelled PT, was found further posteriorly. On the lateral side, adjacent to STS, the organisation was ambiguous (see DISCUSSION section).

The fact that gradient direction switched between rHG and cHG is not necessarily proof that these must be regarded as separate core fields with distinct functions; one might argue that they together comprise a single field that features a V-shaped gradient. However, as it is hard to find a reason why such a marked boundary would be visible unless it marks some functionally meaningful border, a more likely explanation is that two fields exist that can be distinguished on the basis of their tonotopic gradient direction. In contrast, the current results do not suggest any clear-cut division between the medial and the lateral parts of HG.

Consistent with the proposed parcellation, thresholded fMRI activation patterns in individual subjects often consist of a number of parallel stripes, including a pair extending along the anterior and posterior sides of HG. The fields rHG and cHG may thus correspond with territories that were previously labelled T1b and T2, respectively [Brechmann et al., 2002; Scheich et al., 1998]. Especially the cHG region that coincides with territory T2 was found to respond strongly to tones in this study.

Parcellations into elongated subdivisions on the rostral and caudal sides of HG and HS have also been proposed on the basis of cytoarchitectonic and histochemical criteria, variably labelled BA41/BA42 [Brodmann, 1909], TD/TC/TB [von Economo and Koskinas, 1925], MA/STA/PA [Rivier and Clarke, 1997], TI1/Te1/Te2 [Morosan et al., 2001], MA/AI/LP [Wallace et al., 2002] or Km/Kl [Fullerton and Pandya, 2007]. These schemes are well suited to allow the identification with the present functionally defined fields rHG and cHG that extend in parallel in a similar direction.

Another influential parcellation scheme distinguishes root, core, belt and parabelt areas [Galaburda and Pandya, 1983; Hackett et al., 2001; Morel and Kaas, 1992; Sweet et al., 2005]. Extrapolating from primates, the core region would be expected to contain at least two subareas, named AI and R, that feature separate tonotopic representations. A rostro-caudal subdivision in which field cHG forms the human homologue of area AI and rHG the homologue of R is plausible for a number of reasons. First, in various species tonotopic fields have indeed been identified in a rostral and caudal location relative to each other [Imig et al., 1977; Luethke et al., 1989; Morel et al., 1993]. Second, in primates, these fields abut at their low-frequency endpoint. Third, tonotopic gradients are similarly set at an angle such that the low-frequency endpoint is offset laterally compared to the high-frequency endpoints [Kaas and Hackett, 2000]. Fourth, the core region, and in particular primary area AI, that is the homologue of cHG, is most responsive to tone stimuli [Recanzone et al., 2000]. Fifth, on the basis of MRI markers, primary auditory cortex should indeed be located on the posteromedial part of HG [Sigalovsky et al., 2006]. Despite these arguments, alternative interpretations cannot be completely ruled out because some of the surrounding belt areas also feature tonotopic organisations.

Further posteriorly, evidence was visible for at least a third reversal in gradient direction. Near the lateral aspect of PT, neighbouring posterior STG, a secondary low-frequency endpoint appeared to exist. This is consistent with the most posterior low-frequency endpoint that was reported by Talavage et al. [2004], and that later also appeared in other studies [Humphries et al., 2010; Striem-Amit et al., 2011]. It was much less pronounced than the primary endpoint on lateral HG in the group map, but still clearly distinguishable in a majority of subjects. The gradient direction maps supplied supporting evidence for the existence of such a third field, although the pattern was admittedly less clear than for the other two fields. The gradient vectors' significance maps substantiated its existence only in the left hemisphere. It is possible that this posterior field could be less unambiguously delineated because PT is less responsive to tones than HG, and would therefore be less activated in the current experiment. Alternatively, its exact layout may be more variable across subjects. The present data appear to support both possibilities. Additional significantly sound-responsive areas occurred even further posteriorly towards the TPJ. However, no systematic tonotopic organisation was found, possibly due to the weak and inconsistent response in this region.

Another open question is the existence of additional tonotopic progressions on lateral STG and perhaps even STS. Animal studies have identified non-primary belt regions with a tonotopic organisation [Rauschecker et al., 1997]. High-frequency responses were reported far laterally by both Formisano et al. [2003] and Talavage et al. [2004]. The topic was recently elaborated on by Striem-Amit et al. [2011], who reported multiple gradients in STG and STS. In the current data, transitions in frequency tuning on STG were apparent. Towards the edge of the activated cluster that neighboured STS, preferred frequencies were found to be moderate, and thus higher than in the nearby pronouncedly low-frequency selective regions more superiorly on lateral STG. This gave rise to a tonotopic gradient that was significant in the right hemisphere according to the volumetric analysis. However, the significant vertices were found at the extreme edge of the cluster, that is, in regions where a relatively poor signal-to-noise ratio occurred. Lateral STG and STS are in fact known to be relatively poorly responsive to tones [Hall et al., 2002; Osnes et al., 2010; Rauschecker, 1998]. Also, the fact that subjects were engaged in a non-auditory task and were not required to pay attention to the presented sound stimuli may have led to weak activation in these higher-order areas [Grady et al., 1997; Hall et al., 2000; Petkov et al., 2004]. As remarked above, the employed frequency index tends to assign moderate frequencies to areas with weak responses. Given that the gradient was neither significant in the left hemisphere nor revealed with the surface-based analysis, it seems premature to argue for the existence of a lateral tonotopically organised belt region on the basis of the present findings.

Volumetric Versus Surface-Based Cortical Alignment

It has been suggested that inter-individual differences in the correspondence between cortical morphology and function introduce substantial variations that may obfuscate tonotopic progressions and render them difficult to detect. This study, for the first time, made a pairwise comparison between volumetric and surface-based registration techniques.

Most notably, the activation detected by the surface-based method was stronger than that according to the volumetric method. At the same time, the related statistical confidence levels were more or less the same. Surface-based registration methods may detect stronger neural activation because the extracted individual cortical surface well aligns with the grey matter sheet that hosts the neural processing. In contrast, due to irregularities in gyral morphology (e.g., single, forked or duplicated HG), volumetric techniques may be forced to align grey matter in some subjects with the underlying white matter or with the surrounding cerebrospinal fluid in some other subjects. As a result, vertices in the cortical surface are better guaranteed to coincide with grey matter sites of neural activation, whereas voxels in the brain volume have a higher chance to sample inactive regions from white matter or cerebrospinal fluid in at least some subjects. This may explain the elevated mean activation that was obtained using surface-based registration. At the same time, surface-based smoothing employs a two-dimensional kernel that spans only a sub-volume of the three-dimensional kernel that is used in volumetric analyses (assuming identical FWHM). As a result, volumetric analyses can be more efficient in averaging out stochastic noise. By enlarging the FWHM in a supplementary surface-based analysis, an improved sensitivity could be obtained while still retaining a stronger signal strength. Similar improvements may presumably be achieved by, for instance, averaging across the thickness of the cortical ribbon, or by employing a non-isotropic oblate spheroidal three-dimensional kernel that is aligned to the cortical sheet. These results corroborate previous findings that surface-based analyses provide advantages over volumetric analyses with regard to the obtainable sensitivity to neural activation [Desai et al., 2005; Tucholka et al., 2012].
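The difference in noise averaging between two- and three-dimensional kernels of equal FWHM can be made concrete with a rough calculation. The sketch below assumes spatially white noise sampled on a regular grid, a Gaussian kernel that is wider than the sample spacing, and a hypothetical spacing of 1 mm for both the surface vertices and the voxels; none of these values are taken from the study. Under these assumptions, smoothing reduces the noise variance by a factor of approximately (2√π·σ/Δ)^d, where σ = FWHM / (2√(2 ln 2)), Δ is the sample spacing and d is the kernel dimensionality.

```python
import math


def effective_samples(fwhm_mm, spacing_mm, dims):
    """Approximate variance-reduction factor of Gaussian smoothing.

    Equals the effective number of independent samples that a normalised
    Gaussian kernel averages over, for white noise on a regular grid
    (continuous approximation, valid when the kernel exceeds the spacing).
    """
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return (2.0 * math.sqrt(math.pi) * sigma / spacing_mm) ** dims


spacing = 1.0  # hypothetical sample spacing in mm

surface_5mm = effective_samples(5.0, spacing, dims=2)
surface_8mm = effective_samples(8.0, spacing, dims=2)
volume_5mm = effective_samples(5.0, spacing, dims=3)

print(surface_5mm, surface_8mm, volume_5mm)
print(surface_8mm / surface_5mm)  # gain from enlarging the 2-D kernel
```

Enlarging a two-dimensional kernel from 5 mm to 8 mm FWHM increases the effective number of averaged samples by (8/5)² = 2.56, irrespective of the assumed spacing. How this compares with a three-dimensional kernel depends on the actual voxel size, vertex density and spatial noise correlations, so the numbers should be read as an order-of-magnitude illustration only.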

However, an improved sensitivity to activation does not imply that the registration across subjects of meaningful functional subdivisions within the cortical surface is improved as well. In fact, the significance of local tonotopic gradients was worse for surface-based registrations than for volumetric ones, and this could be remedied only by increasing the smoothing kernel. The reliability of morphological landmarks as a guide to localise cortical fields is known to be limited because gyral patterns and the size and location of primary auditory cortex vary independently to a considerable degree [Rademacher et al., 2001]. The current results suggest that this not only holds for cytoarchitectonically defined auditory fields, but also for functionally defined fields. In the present sample, this was already evident in some of the individual results. For instance, although low-frequency representations were found on lateral HG on average, in various individual subjects the low-frequency endpoint was found elsewhere (e.g., in subject TIN-4121 in Fig. 3 this occurred further posteriorly in lateral HS instead). Of course, the present findings should not be construed as evidence that brain structure and function are unrelated. There is some correspondence between cortical morphology and function [Da Costa et al., 2011]. Yet, the present results indicate that this correlation does not exceed that between volumetric stereotaxis and function.

An open question is whether cytoarchitectonic parcellations relate to functional fields in a more consistent fashion. Grey matter tissue parameters that reflect microscopic features can non-invasively be measured with MRI. For instance, T1-relaxation maps are sensitive to myelination [Glasser and Van Essen, 2011]. As remarked above, such maps seem consistent with the present functional results at the group level [Sigalovsky et al., 2006]. Still, it would be interesting to know whether inter-individual variations in both types of measures are correlated as well. So far, tonotopic fields have hardly been compared directly to grey matter characteristics (see Dick et al. [2012] and Upadhyay et al. [2007]).

Another interesting future prospect is functional co-registration. Functional localisers may be used to map cortical responses to sound attributes, including but not restricted to tonotopic maps of frequency preferences [Woods et al., 2010]. Subsequently, one-on-one correspondences between subjects may be extracted on the basis of the functional organisation of auditory cortex in individuals. However, standardised protocols to obtain such results are lacking and the practicality and consistency of these approaches merits further investigation.

Tonotopic Reorganisation in Tinnitus

This study was motivated originally by a previous report that the tonotopic organisation in tinnitus patients was not found to differ from that in controls with normal hearing [Langers et al., 2012]. This was unexpected based on pathophysiological models of tinnitus that attribute an important role to tonotopic reorganisation [Bartels et al., 2007; Herraiz et al., 2009; Roberts et al., 2010]. Moreover, these findings did not agree with neurophysiological results from animals with induced tinnitus that suggest substantial shifts in tonotopic maps [Engineer et al., 2011; Stolzberg et al., 2011; Yang et al., 2011], or earlier findings in humans that were obtained using magnetoencephalography [Mühlnickel et al., 1998; Wienbruch et al., 2006].

In their work, Langers et al. considered summary statistics of the auditory cortex as a whole, leaving open the possibility that tonotopic reorganisation was present but restricted to subdivisions of auditory cortex only. This may, for instance, be the case if tinnitus involves only particular subdivisions of the auditory pathway, or if the tonotopic reorganisation is limited to a small subset of frequencies from the entire tonotopic range. In this report, complementary analyses were performed on a voxel-by-voxel or vertex-by-vertex basis. This provides sufficient resolution to distinguish between distinct cortical fields, and within such fields it should even allow populations of neurons that are tuned to different frequencies to be separated to some degree. Nevertheless, no significant differences between the two subject groups were found. Of course, one could still argue that the neurophysiological correlate of tinnitus can only be found at a better resolution still. However, this would be hard to reconcile with the available literature on animals as well as humans. Alternatively, focal abnormalities may have remained concealed because the registration across subjects was insufficiently precise to render them detectable in any single vertex. Given the inter-individual variability in tonotopic maps, this indeed seems to be a current limitation.

In most tinnitus models, hearing loss is a crucial factor that triggers functional changes at the central level. The tinnitus patients in this study were atypical in the sense that they did not show substantial hearing loss. Although it can be argued that tinnitus patients without hearing loss are otherwise quite representative of the tinnitus population as a whole [Sanchez et al., 2005], this fact alone may already invalidate the proposed tonotopic remapping mechanism for this subpopulation. It would therefore be of interest to compare more typical tinnitus patients to a control group of subjects without tinnitus but with matched levels of hearing loss. At the same time, this would pose new problems. For instance, how to acoustically stimulate subjects with hearing loss? One could use loudness-matched stimuli or present at a fixed sensation level, but outcomes based on variable stimulus intensities would further deviate from the standard definitions of characteristic or best frequency. Alternatively, one could, for instance, consider subgroups of subjects with sharply edged high-frequency hearing loss, or with unilateral hearing loss. In these subjects, reorganisation may have occurred due to the loss, but they could still be stimulated normally by presenting sound in the intact frequency range, or ear, respectively. These are interesting avenues to explore further. Hopefully, the methodology that was proposed in this study may assist to objectify tonotopic abnormalities in such groups, should they exist.