Roland Frey, Leibniz Institute for Zoo and Wildlife Research (IZW), PO Box 60 11 03, D-10252 Berlin, Germany. E: email@example.com
Similar to male humans, Homo sapiens, the males of a few polygynous ruminants – red deer Cervus elaphus, fallow deer Dama dama and Mongolian gazelle Procapra gutturosa – have a more or less enlarged, low-resting larynx and are capable of additional dynamic vocal tract elongation by larynx retraction during their rutting calls. The vocal correlates of a large larynx and an elongated vocal tract, a low fundamental frequency and low vocal tract resonance frequencies, deter rival males and attract receptive females. The males of the polygynous goitred gazelle, Gazella subgutturosa, provide another, independently evolved, example of an enlarged and low-resting larynx of high mobility. Relevant aspects of the rutting behaviour of territorial wild male goitred gazelles are described. Video and audio recordings served to study the acoustic effects of the enlarged larynx and vocal tract elongation on male rutting calls. Three call types were discriminated: roars, growls and grunts. In addition, the adult male vocal anatomy during the emission of rutting calls is described and functionally discussed using a 2D-model of larynx retraction. The combined morphological, behavioural and acoustic data are discussed in relation to the hypothesis of sexual selection for male-specific deep voices, resulting in convergent features of vocal anatomy in a few polygynous ruminants and in human males.
The parallel evolution of similar vocal anatomies in different taxa can provide insight into corresponding functional and behavioural correlations of bio-acoustic communication (Fitch & Reby, 2001). A remarkable example of such parallel evolution is the male-specific permanent vocal tract (vt) elongation, resulting from the descent and low-resting position of the larynx, which has evolved independently in three mammalian taxa: among primates in man, among cervids in red deer Cervus elaphus and fallow deer Dama dama, and among bovids in the Mongolian gazelle Procapra gutturosa (Fitch & Reby, 2001; McElligott et al. 2006; Frey et al. 2008a,b). In addition, male Mongolian gazelles, similar to male humans, evolved a considerably enlarged larynx compared to conspecific females and to other similar-sized bovid species (Frey & Riede, 2003; Frey et al. 2008a,b). Besides these taxa, a permanently descended larynx exists in both sexes of large felids (Weissengruber et al. 2002) and in a marsupial mammal, the koala Phascolarctos cinereus (Sonntag, 1921).
An increase in larynx size and a vocal tract elongation will affect associated frequency components. A larger larynx can produce a lower fundamental frequency (f0), as its vocal folds are longer and therefore vibrate at a lower frequency compared with the shorter vocal folds of a smaller larynx. In addition, vocal fold vibration depends on the tension of the folds and on subglottic pressure (Titze, 1994). A longer vt produces lower and more closely spaced resonances (formants) in comparison with shorter vocal tracts, as formant frequencies and formant dispersion are inversely related to vocal tract length (vtl) (Titze, 1994; Fitch & Reby, 2001; Fitch & Hauser, 2002). The independence of the acoustic effects of the larynx and of the vt in vocal production is suggested by the source-filter theory. The source is represented by the vocal folds inside the larynx and the filter comprises the vt air spaces above the larynx (Fant, 1960; Fitch & Hauser, 2002; Taylor & Reby, 2010).
In both male and female humans, the permanently descended larynx, a shortened soft palate and a highly flexible tongue represent major prerequisites for the evolution of speech (Negus, 1949; Lieberman, 1973, 1984; Fitch, 2000a; Davidson, 2003). Both sexes are equally capable of speaking. However, the male larynx is larger and has a lower resting position in the mid-neck region, thereby generating a clearly visible laryngeal prominence, the so-called Adam’s apple, mostly less pronounced or lacking in females. Ontogenetically, the male-specific additional laryngeal descent develops in the course of puberty, entailing a longer vt in men than in women (Fitch & Giedd, 1999). This appears to increase the danger of ‘swallowing the wrong way’, as the mortality rate from suffocation is higher among adolescent boys than among adolescent girls (Baker et al. 1992). Considering this obvious disadvantage of a low larynx position, there must exist some substantial benefits to allow the evolution of this feature. The probable benefit of the sexual dimorphism of the larynx and vt anatomy in humans is the evolution of the male-specific voice characteristics: a lower f0 (Titze, 1994; Evans et al. 2008) and decreased formants (Rendall et al. 2005). In industrial communities, both these acoustic characteristics affect the impression of dominance of men’s voices as perceived by other men and by women (Puts et al. 2006, 2007) and, in addition, they are attractive for women (Collins, 2000; Feinberg et al. 2005, 2006; Puts, 2005; Saxton et al. 2006). In an African hunter-gatherer community, men with voices of a lower fundamental frequency have greater reproductive success than men with a higher f0. Apparently, the association is mediated by female choice (Apicella et al. 2007; Apicella & Feinberg, 2009).
Similar to humans, male fallow deer and Mongolian gazelles have a larger larynx with a lower resting position than conspecific females (Fitch & Reby, 2001; Frey & Riede, 2003; Frey et al. 2008b). For fallow deer this sexual dimorphism results in corresponding male-specific voice characteristics: a lower f0 and decreased formants (Reby et al. 1998; Torriani et al. 2006; Vannoni & McElligott, 2007). In red deer, no pronounced sexual dimorphism of larynx dimensions has been reported so far (Frey & Riede, 2003; Riede & Titze, 2008), in contrast to the position of the larynx, which entails different vocal tract lengths in males and females (Fitch & Reby, 2001). However, a pronounced sexual dimorphism of f0 has been demonstrated for the Corsican subspecies of red deer Cervus elaphus corsicanus (Kidjo et al. 2008), unfortunately without information on larynx or vocal fold size. In the North American wapiti, despite a considerable sexual dimorphism of body mass, the f0 has been shown to be almost equivalent in males and females (Feighny et al. 2006) and vocal folds of females are only 15% shorter than in males (Riede & Titze, 2008).
In contrast to humans and most other mammals, which are only capable of slight retractions of the larynx during vocalization (Titze, 1994), the males of the three ruminant species mentioned are capable of additional pronounced intermittent vt elongations by means of short-term larynx retractions (Fitch & Reby, 2001; Reby & McComb, 2003a; McElligott et al. 2006; Frey et al. 2008a). These retractions are specifically related to rutting behaviour and to the production of rutting calls. Rutting calls produced with elongated vt are perceived as more threatening by rival males (Reby et al. 2005) and as more attractive by potential mates (Charlton et al. 2007, 2008).
In contrast to the response to formant frequencies, female red deer during most of the rutting period do not show differential responses to playbacks of male calls with higher or lower f0 variants (McComb, 1991; Charlton et al. 2008). However, females at the peak of their sexual receptivity may prefer male calls with higher f0 (Reby et al. 2010). Therefore, the overall male quality might consist of several aspects (not only body size), which are integrated by oestrous females when choosing a mating partner. Possibly, the independence of source and filter might allow a trade-off between formant-related cues to body size and f0-related cues to aggressiveness, arousal, or other indices of male quality.
The goitred gazelle (Gazella subgutturosa) provides another example of parallel evolution of an enlarged larynx and both a permanent and a dynamic descent of the larynx during male rutting roars. Similar to the Mongolian gazelle, the larynx of male goitred gazelles is visible as an impressive laryngeal prominence that had already been accented in the first description of this species (Güldenstaedt, 1780). In the present study, video and audio recordings of free-ranging territorial males were made and analysed to reveal the effects of the enlarged, descended and highly mobile larynx on the acoustics of the rutting roars. In addition, a detailed investigation of the adult male vocal anatomy was conducted. Its reconstruction served to create a 2D-model of the potential mechanism effecting larynx retraction. The combined data of the morphological, behavioural and acoustic analyses are discussed in a comparative perspective and in the context of the theory of sexual selection to suggest a consistent explanation for the several cases of parallel evolution of a sexually dimorphic vocal anatomy both in humans and in ruminants.
Materials and methods
Subjects, site and dates of work
Research was conducted in the Ecocenter ‘Djeiran’ (Uzbekistan, Bukhara region, Kagan district, 39°41′N, 64°35′E) comprising a fenced 5000-ha semidesert area. This territory is inhabited by a population of 600–1200 goitred gazelles (Pereladova et al. 1998) which, in 2009, comprised about 900 individuals. Video and audio recordings of rutting behaviour were made from 17 October to 11 November 2009.
Video and audio recordings
We recorded rutting calls of four wild, adult, unmarked male goitred gazelles (Males 1, 2, 3 and 4) and made video clips of the rutting behaviour of two males (Males 1 and 5) on their respective territories (Video S1). Accordingly, our results are preliminary and based on a small number of individuals. The free-ranging territorial males emitted rutting calls towards females and young males when they were traversing their respective territories, and towards neighbouring dominant male intruders. Usually, a recording session started in darkness at 6:00 h, about 1 h before sunrise, when a target caller started its rutting displays, and lasted 1–3 h until the target caller ceased or decreased its vocal activity spontaneously or after noticing an observer. Apparently, the males were present on their respective territories continuously day and night throughout the study period, as we repeatedly met some individually recognizable males on the same territories. All recordings were made from temporary hides located among bushes along the borders of the territories of dominant males. The distance of the animals to the microphone varied from 30 to 150 m, and to the camcorder from 50 to 150 m.
There were only short and exceptional showers of slight rain. As the acoustic and video recordings were made from 6:00 h to about 9:00–10:00 h, ambient temperatures were still low, mostly between 5 and 10 °C. On some days, a slight but steady wind interfered with the acoustic recordings, but on most mornings there was little or no wind. On some days, the clouds of expiratory air emitted during a roar were clearly visible. In the studied male territories, most vegetation was low and consisted of bushes, dwarf bushes and grasses, which did not interfere with the acoustic and video recordings.
For the acoustic recordings (48 kHz, 16 bit), we used a Marantz PMD-660 CF recorder with a Sennheiser K6 ME66 (Sennheiser Electronic, Wedemark, Germany) cardioid Electret condenser shotgun microphone (frequency response 40–20 000 Hz). For the video recordings, we used a Sony HDR-HD1E camcorder with conversion lens ×2.0 Sony VCL-HG2037Y (Sony Corp., Tokyo, Japan) and an attached Sennheiser K6 ME66 microphone.
Additionally, in October 2008, we recorded the sounds produced by one excised fresh larynx of an adult male specimen while one of the authors blew air through a plastic pipe fitted to the trachea, thereby eliciting phonation. For the recordings, we used a Panasonic NV-GS250 camcorder (Panasonic Corporation, Kadoma, Japan). Audio tracks of the excised larynx experiment were digitized at 48 kHz for further analysis.
Audio and video analyses
Only calls of good quality, with clearly visible spectral structure and not superimposed with wind or background noise were used for analysis. In the rutting calls (Fig. 1), we measured three acoustic parameters: call duration, pulse rate and the first nine formant frequencies (Fig. 2). Call duration and pulse rate were measured (48 kHz, Hamming window, FFT 1024, frame 50%, overlap 93.75%) using avisoft saslab pro (Avisoft Bioacoustics, Berlin, Germany). Using the standard marker cursor in the main window of avisoft, we measured the duration of 70 roars of three males, 29 growls of three males and 25 grunts of two males. The mean pulse rate was calculated as the inverse value of the mean pulse period, measured in 36 roars of three males and in 26 growls of three males with standard marker cursor in the main window of avisoft. The pulse rate was expected to correspond to f0, i.e. we assumed it was produced by vocal fold opening and closing. Additionally, we measured the durations of 24 bouts of rutting calls recorded from three males. The pulse rate in the audio tracks of the excised larynx sounds, each elicited and maintained by a single prolonged expiration of a human experimenter, was analysed in 18 recordings.
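As a minimal sketch of the pulse-rate calculation described above, the following Python snippet inverts the mean of a set of pulse periods. The period values are hypothetical placeholders rather than measurements from this study, and the function name `mean_pulse_rate_hz` is illustrative only.

```python
from statistics import mean


def mean_pulse_rate_hz(pulse_periods_s):
    """Return the mean pulse rate (Hz) as the inverse of the mean pulse period (s)."""
    if not pulse_periods_s:
        raise ValueError("at least one pulse period is required")
    return 1.0 / mean(pulse_periods_s)


# Hypothetical pulse periods (s), as they might be read off one call
# with a marker cursor; these are NOT values from the study.
example_periods_s = [0.044, 0.046, 0.045, 0.047, 0.043]

rate_hz = mean_pulse_rate_hz(example_periods_s)
print(f"mean pulse rate: {rate_hz:.1f} Hz")
```

Note that inverting the mean period is not the same as averaging the inverted periods; the sketch follows the former, as stated in the text.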
Formant frequencies were measured using linear prediction coding (LPC) with praat v. 4.3.21 (P. Boersma & D. Weenink, University of Amsterdam, Netherlands, http://www.praat.org). Vtl measurements in the dissected specimen served to establish the settings for LPC. Linear prediction parameters for creation of the formant tracks were: Burg analysis, time step 0.04 s; 9–10 formants and maximum formant frequency 3400–3700 Hz. Point values of formant tracks were extracted, exported to excel (Microsoft Corp., Redmond, WA, USA) and the value of each formant of a given call then calculated as the average value of the values of all the extracted points of the track. Formant frequencies were measured from call parts with clearly visible formants produced with an already fully retracted larynx, i.e. where formant tracks reached their minimum values and were nearly horizontal (Fig. 2). The position of formants was verified by superposition on the spectrogram (Fig. 2). We measured formants in 37 rutting roars of three males and in 11 growls of three males.
Applying the model of a straight uniform tube closed at one end, we calculated formant dispersion (ΔF) for each goitred gazelle male using linear regression according to Reby & McComb (2003a). Based upon the formant frequencies of the rutting calls, the maximal vtl during rutting calls was calculated by the equation: vtl = c/(2ΔF), where c is the speed of sound in air, approximated as 350 m s−1.
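The calculation can be sketched in Python as follows. In a uniform tube closed at one end, the i-th resonance is F_i = ΔF × (2i − 1)/2, so ΔF can be estimated as the slope of a regression of F_i on (2i − 1)/2. The sketch assumes that this regression is forced through the origin, and it uses the pooled mean roar formants from Table 1 purely for illustration. Because the reported ΔF was derived per male, the pooled means give a value close to, but not identical with, the reported one. The function names are illustrative.

```python
def formant_dispersion_hz(formants_hz):
    """Estimate formant dispersion (Hz) as the least-squares slope, through
    the origin, of formant frequency F_i on (2i - 1)/2 (assumed model)."""
    xs = [(2 * i - 1) / 2 for i in range(1, len(formants_hz) + 1)]
    numerator = sum(x * f for x, f in zip(xs, formants_hz))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


def vtl_mm(delta_f_hz, speed_of_sound_m_s=350.0):
    """Vocal tract length (mm) from formant dispersion: vtl = c / (2 * dF)."""
    return speed_of_sound_m_s / (2.0 * delta_f_hz) * 1000.0


# Pooled mean roar formants F1-F9 (Hz) from Table 1, used for illustration only.
roar_means_hz = [253, 490, 982, 1401, 1839, 2198, 2525, 2835, 3222]

delta_f = formant_dispersion_hz(roar_means_hz)
print(f"dF from pooled roar means: {delta_f:.1f} Hz")
print(f"vtl from pooled roar means: {vtl_mm(delta_f):.0f} mm")

# The formula applied to the reported dispersions of roars and growls.
print(f"vtl for dF = 382 Hz: {vtl_mm(382):.0f} mm")
print(f"vtl for dF = 389 Hz: {vtl_mm(389):.0f} mm")
```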
Video analysis techniques mainly followed those of Fitch & Reby (2001), whereas application of the anatomy-based T-line for vt measuring (see below) is an innovation of the current study. Video recordings were fed to a personal computer (PC) and analysed with dvgate plus v. 2.2.1 (Sony Corp.). To estimate relative vt elongation resulting from larynx retraction during the rutting roars of Male 1, we analysed 50 video single-frame pairs of body profiles, each pair from the same video sequence (Fig. 3). The first frame of a pair shows the resting position of the larynx before onset of retraction and the second shows the larynx at maximal retraction. We tried to select those pairs of profiles depicting the animal in perfect lateral position to the camera. Superimposed horn contours were taken as indication of ideal profiles. Where such profiles were unavailable, we selected images closest to ideal profiles. Estimation of (oral) vtl comprised six steps: (1) insertion of the T-line for determining the approximate position of the choanae, (2) determining the length of the rostral vt section from the choanae to the lips, (3) determining the length of the caudal vt section from the choanae to the vocal folds, (4) relative measuring of rostral and caudal vt sections, (5) calculating absolute measures from the known length of the rostral vt section in the resting position, ascertained by dissections, (6) addition of the absolute values of rostral and caudal section for obtaining the entire vtl. Steps 1–4 were executed on a PC using graphic software (adobe photoshop mac OS 9.5; Adobe Systems Inc., San Jose, CA, USA). Average values were calculated in excel.
In the retracted position, the vertical bar of the T-line points approximately in the direction of the small prominence visible in some single frames, providing an additional control. The decisive proportions of the T-line were established from dissections and from the skull of an adult male. The horizontal bar of the T-line is the distance from the centre of the eye to the most ventral part of the ear opening; the vertical bar of the T-line extends perpendicularly from the middle of the horizontal bar and its length is 75% of the horizontal bar. When using these proportions, the ventral tip of the vertical bar approximates the beginning of the flexible soft palate at the choanae. The rostral vt section was measured from the most rostral border of the closed lips in the resting position and from the mouth angle in the maximally retracted position up to the tip of the vertical bar of the T-line. The caudal section was measured from the tip of the vertical bar of the T-line to slightly behind the laryngeal prominence, where, as ascertained by dissections, the vocal folds are situated.
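The geometry of the T-line and the scaling from relative to absolute lengths can be sketched in Python as shown below. This is an illustration under stated assumptions, not the authors' workflow, which was carried out manually in graphic software. The assumptions are: landmarks are given as pixel coordinates with y increasing downward, ventral is towards the bottom of the image, both frames of a pair share the same image scale, and the rostral section in the resting frame corresponds to the 160 mm obtained by dissection. All function names and pixel values are hypothetical.

```python
import math


def t_line_tip(eye_xy, ear_xy, ratio=0.75):
    """Tip of the vertical T-line bar (approximate position of the choanae).

    The horizontal bar runs from the eye centre to the ventral ear opening;
    the vertical bar starts at its midpoint, is perpendicular to it and has
    75% of its length. Image coordinates with y growing downward are assumed.
    """
    (x1, y1), (x2, y2) = eye_xy, ear_xy
    dx, dy = x2 - x1, y2 - y1
    bar = math.hypot(dx, dy)
    if bar == 0:
        raise ValueError("eye and ear landmarks must differ")
    nx, ny = -dy / bar, dx / bar      # unit normal to the horizontal bar
    if ny < 0:                        # assumption: ventral = larger y
        nx, ny = -nx, -ny
    mid_x, mid_y = (x1 + x2) / 2, (y1 + y2) / 2
    return (mid_x + nx * ratio * bar, mid_y + ny * ratio * bar)


def vtl_pair_mm(rostral_rest_px, caudal_rest_px,
                rostral_retr_px, caudal_retr_px, rostral_rest_mm=160.0):
    """Resting and retracted vtl (mm) for one frame pair.

    The scale (mm per pixel) is taken from the resting rostral section and,
    as an assumption of this sketch, applied to both frames of the pair.
    """
    mm_per_px = rostral_rest_mm / rostral_rest_px
    resting = (rostral_rest_px + caudal_rest_px) * mm_per_px
    retracted = (rostral_retr_px + caudal_retr_px) * mm_per_px
    return resting, retracted


# Hypothetical landmarks and section lengths in pixels.
tip = t_line_tip(eye_xy=(100.0, 100.0), ear_xy=(200.0, 100.0))
resting_mm, retracted_mm = vtl_pair_mm(200.0, 170.0, 190.0, 350.0)
print(tip, round(resting_mm), round(retracted_mm))
```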
We used a general linear mixed-effect model (GLMM) with the male subject as a random factor and call type as a fixed factor to compare the values of acoustic parameters between different call types, as the Kolmogorov–Smirnov test showed that the distributions of all measured parameter values did not depart from normality (P > 0.20). Significance levels were set at 0.05, and two-tailed probability values are quoted. Statistical analyses were done with statistica, v. 6.0 (StatSoft, Tulsa, OK, USA); all means are given as mean ± SD.
Preliminary macroscopic anatomical dissections of the head and neck region, including the strap muscles, were executed in the Ecocenter Djeiran on two adult males that were exceptionally hunted in May and October 2008. For comparison, the head and neck region of two adult females and six juvenile to subadult specimens were dissected in May and October 2008 and in November 2009, respectively. The adult females and the young individuals of different ages had died either in the enclosures of the Ecocenter Djeiran, where some individuals are kept for scientific investigations, or inside the large fenced area. All specimens were deep frozen as soon as possible after death and stored until the anatomical dissections. Measurements of vtl were taken in the dissected specimens both with relaxed and maximally extended thyrohyoid ligament. Anatomical terms are in accordance with the 5th edition of the Nomina Anatomica Veterinaria (NAV, 2005).
Skeletal parts of several individuals that had died from natural causes in the large fenced area during winter 2007, particularly of one almost complete skeleton of an adult male, were collected, soaked in water, cleaned and dried. Skeletal parts of the dissected head and neck specimens were retained. Results of the anatomical dissections and of the skeletal preparations were photographically documented using a Nikon D70S digital camera (Nikon Corp., Tokyo, Japan). Skulls and major limb bones of the collection of skeletal parts at the Ecocenter Djeiran were photographed. Images were fed to a PC and graphically processed (adobe photoshop 5.5 and CS4; Adobe Systems Inc.).
Reconstructions of the major parts involved in the mobility of the larynx were done in a PC for the resting position and for the maximally retracted position of the larynx. Reconstructions were based on representative video single frames, on photographs of original skeletal parts, and on dissection photos. The components of the vocal tract were graphically reconstructed in a PC (adobe photoshop mac OS 9.5; Adobe Systems Inc.) according to the dissection results. Observable features such as mouth opening, eye, lower contour of the head, ear, hyoid prominence, laryngeal prominence in combination with the neck length were used as landmarks for reconstructing the relative positions of tongue, pharynx, hyoid apparatus, larynx, and the strap muscles in the overlay figures.
Abbreviations used in the figures:
Medial crest of arytenoid cartilage
Caudal horn of thyroid cartilage
Rostral horn of thyroid cartilage
I: Lig. thyroh. = Insertum: Ligamentum thyrohyoideum (insertion of thyrohyoid ligament)
I: M. sternmand. = Insertum: Musculus sternomandibularis (insertion of mandibular portion of sternocephalic muscle)
Lamina cartilaginis cricoideae
Mandibular portion of sternocephalic muscle
Nasal vocal tract
Oral vocal tract
Spina cartilaginis cricoideae
Results
Year-round, male goitred gazelles have a conspicuous laryngeal prominence in the ventral mid-neck region. However, it is only during the rut, mainly in October and November, that larynx size increases while they occupy and defend individual rutting territories and become distinctly vocal (Kingswood & Blank, 1996; Blank, 1998). Providing food resources of varying degrees, these territories attract foraging females and non-territorial males. Mostly, territorial rutting males vocalize during moderate to fast locomotion, even to the point of full galloping. They perform vocal and visual displays (Figs 3 and 4) while approaching females or rival males. The acoustic display comprises raising of the head, the emission of roars, growls and grunts with widely open mouth and concomitant pronounced retractions of the larynx towards the thoracic entrance. Vocalization may start during laryngeal descent or after maximal retraction of the larynx has been completed. Up to the end of the emission of a rutting call, the laryngeal prominence is kept retracted at the caudalmost point of the ventral neck contour before it rapidly ascends again towards its mid-neck resting position. In the video single frames, the dark-coloured rostral portion of the tongue becomes visible. During vocalization the tongue is kept low inside the mouth cavity. When a male approaches females, head-up rutting calls are regularly preceded by adoption of the female-directed threatening posture. This visual display involves raising or lowering of the head to a horizontal position, maximal extension of the ventral neck region and backward inclination of the auricles (Fig. 4). Territorial marking behaviour comprises the marking of grass stems and individual twigs with secretion from the preorbital glands and urinating and defecating at shallow pits that were made by sideward scraping movements of the forelegs. Urination was effected with the lowered belly above the pit while the hindlegs were sloping backwards. Immediately following urination, defecation occurred in a typical crouching posture with the back flexed and the hindlegs pulled forward and positioned close to the forelegs. This urination-defecation sequence was observed by Marmasinskaya (1996) and closely resembled that described by Walther et al. (1983) for African antilopine species.
During the study period, females fled from chasing males, probably because they were not yet receptive, and thus did not allow the rutting males to start copulatory efforts. Rutting displays were mainly performed during locomotion, when moving towards or chasing a recipient. A rutting male chased his females, trying to keep them from leaving his territory, or to return single foraging females which had left the territory. However, single females within the territory were also chased frequently for no obvious reason. Flehmen behaviour of the territorial male was regularly observed. Flehmen is a behaviour of the adult males of many mammalian species to detect the first signs of oestrus in females by analysing female urine or vaginal secretions in their vomeronasal organs (Estes, 1972; Ladewig & Hart, 1980, 1982; Ladewig et al. 1980; Wysocki et al. 1980; Crump et al. 1984; Døving & Trotier, 1998).
Rutting displays toward non-territorial males, either intruding or simply passing by, consisted of a fast approach, repeated vocalizations and chasing them forcefully off the territory. Encounters between two resident males at the border of their territories often elicited mutual male-directed threatening postures. Major components of this specific visual display are a peculiar slow, stiff walk, a dorsally convex curvature of the neck for achieving a maximally erect head position, broadside body display, and head-down displacement feeding. Fights were observed occasionally. Immediately before a fight, the opponents lowered their heads and stepped back one or two steps so that both were 5–6 m apart. The attacker jumped towards the defender, which attempted to parry the impact by clashing of the horns. The hind legs of the attacker were off the ground at impact. Then the opponents engaged in fierce mutual pushing behaviour with all four legs on the ground, the forelegs widely spread and the hindlegs providing the main thrust for pushing. In this phase, the horns were locked. Front-to-front orientation was retained but the opponents could rotate around the centre formed by the locked horns. Temporarily, the horns were unlocked and clashed anew without jumping. Toward the end of the fight, pushing was replaced by single clashes of decreasing intensity and the opponents increased the distance between them by stepping back with their heads still lowered. According to the descriptions of Walther et al. (1983) the fighting behaviour of the goitred gazelle can best be classified as clash-fighting with interspersed push-fighting. One video-documented fight of two neighbouring resident males lasted about 1 min and did not result in any obvious shift of territory borders. Most rutting displays were produced at close range (within 30 m) of recipients. However, when an intruding male attempts to approach the females inside a territory and to display visually and acoustically towards them, the territory owner may cross rapidly from the opposite side of his territory to fiercely chase him off.
Around the onset of vocalization, when the laryngeal prominence starts to descend, a smaller prominence of the ventral neck contour emerges at a level between the eye and the ear. The smaller prominence follows the movements of the laryngeal prominence on a smaller scale. When the laryngeal prominence is stationary for a moment at its caudalmost position and when the ventral neck skin is maximally stretched, the smaller rostral prominence mostly disappears. It re-emerges when the laryngeal prominence begins its return ascent towards the mid-neck resting position. In the resting position of the head, this smaller prominence is not visible, whereas the laryngeal prominence is.
When swallowing is performed, either during deglutition or without food, the laryngeal prominence moves shortly upwards along the ventral neck contour from its low mid-neck resting position towards the throat region and then rapidly returns to its resting position by a corresponding downward movement. No other ventral neck prominence is observed during the swallowing process.
Considering the impressively enlarged larynx of male goitred gazelles, the amplitude of their rutting calls is unexpectedly low. The farthest listening distance of the rutting calls for human observers was 150–200 m; at larger distances the opening of the mouth and the retraction of the larynx could be observed, but the sound was not audible.
Male goitred gazelles produced three types of rutting calls: roars, growls and grunts, in the order of decreasing loudness. Calls of any of these call types could be produced either singly or in bouts (Fig. 1). The mean duration of one bout was 1.28 ± 0.77 s (n = 24). Bouts consisted of two to four calls (mean 2.67 ± 0.77), and mostly started with a roar (15 bouts, 62%), but could also start with a grunt (seven bouts, 29%) or with a growl (two bouts, 8%). Bouts that started with a roar obligatorily included growls and/or grunts, whereas four pure grunt bouts and one pure growl bout were observed.
The roar consisted of a forced harsh vocal exhalation, often with widely spaced pulses in the second part of a call, audible by human ear and visible in the spectrograms. Rutting male goitred gazelles often had not yet completed larynx retraction before the onset of a call, so that the first part of a vocalization was emitted while the larynx was still descending. Acoustically, this was evidenced by a descending run of formants in the first call part (Fig. 1). Formant frequencies, measured in roar sections corresponding to the maximally retracted larynx, are given in Table 1. The duration of roars was 0.51 ± 0.16 s (n = 70), pulse rate was 22.0 ± 2.58 Hz (n = 36), ranging between 17.2 and 27.8 Hz. This pulse rate coincided closely with the pulse rate of the relaxed vocal folds observed in the excised larynx experiment in both mean value (19.5 ± 2.07 Hz, n = 18) and range of variation (17.1–24.2 Hz). Formant dispersion of roars, calculated by linear regression, was 382 Hz, providing an estimated vtl of 458 mm. Distances between neighbouring formants of roars were uneven, the smallest ones at F1–F2 and F7–F8 and the largest one at F2–F3. Ranges of values (min–max frequencies) were non-overlapping for all neighbouring formants (Table 1).
Table 1. Values (mean ± SD; min–max, in Hz) for the first nine formants (F1–F9) measured in roars and growls of rutting male goitred gazelles, including a comparison by general linear mixed-effect model (GLMM; degrees of freedom of the F statistic in parentheses).
Formant | Roar (n = 37) | Growl (n = 11) | GLMM
F1 | 253 ± 27; 203–304 | 215 ± 35; 155–276 | F(1,32) = 11.95; P = 0.002
F2 | 490 ± 62; 364–691 | 571 ± 78; 475–731 | F(1,33) = 12.40; P = 0.001
F3 | 982 ± 44; 867–1062 | 1004 ± 110; 855–1193 | F(1,36) = 1.62; P = 0.21
F4 | 1401 ± 66; 1240–1543 | 1458 ± 118; 1225–1571 | F(1,36) = 7.81; P = 0.008
F5 | 1839 ± 70; 1695–1989 | 1815 ± 142; 1607–1999 | F(1,37) = 0.31; P = 0.58
F6 | 2198 ± 56; 2101–2307 | 2176 ± 133; 2038–2378 | F(1,37) = 0.26; P = 0.61
F7 | 2525 ± 64; 2401–2649 | 2554 ± 99; 2454–2670 | F(1,35) = 3.17; P = 0.08
F8 | 2835 ± 70; 2687–3062 | 2911 ± 63; 2849–2992 | F(1,36) = 4.60; P = 0.039
F9 | 3222 ± 71; 3095–3437 | 3312 ± 87; 3246–3410 | F(1,29) = 6.72; P = 0.014
The growls were relatively prolonged (0.49 ± 0.33 s, n = 29), low-intensity calls with a clearly visible pulsation of 68.3 ± 9.43 Hz (n = 26). In spectrograms, their run of formants was always horizontal, suggesting that growls were produced with fully retracted larynx. This coincides with the observation that growls were rarely produced at the beginning of bouts (Fig. 1). Distances between neighbouring formants of growls were also uneven, the smallest ones at F1–F2, F4–F5 and F7–F8 and the largest one at F3–F4. As in roars, ranges of values (min–max frequencies) were non-overlapping for all neighbouring formants (Table 1). Formant dispersion of growls, calculated by linear regression, was 389 Hz, providing an estimated vtl of 450 mm. The values of the 1st formant of growls were lower, whereas those of the 2nd, 4th, 8th and 9th were higher than those of the corresponding formants of roars (Table 1). Notwithstanding this, the formant values of roars and growls were close to each other, despite the clear structural differences of these two call types.
In addition to roars and growls, rutting males produced grunts, i.e. short exhalations (0.10 ± 0.03 s, n = 25), without visible pulsation. These calls were often emitted in full gallop during a chase or at the end of call bouts that had started with a roar (Fig. 1).
The neck length, measured from the angle of the mandibula to the rostral end of the sternum, was 350 mm. The distance from the angle of the mandibula to the laryngeal prominence was 110 mm.
Vocal tract length
Measurements of vtl in a dissected specimen. The distance between the lips and the choanae in an adult dissected rutting male of probably 6 years of age was about 160 mm, confirmed by comparative measurements of the skulls of adult males, both directly and in photographs. This measure was used to calculate the absolute vtl in the video single-frame pairs (see Materials and methods and below). In a direct measurement of the absolute vtl in the same individual during dissection the entire resting vtl was 320 mm. In indirect measurements in dissection photographs of this individual, i.e. when the vt was still in situ and intact, the entire resting vtl was about 290 mm.
Vtl estimates in video single-frame pairs. Estimates of the vtl in video single-frame pairs yielded a resting vtl of 295 ± 25.3 mm and a vtl at maximal elongation of 434 ± 33.6 mm, i.e. the maximally elongated vt measured 147 ± 11.77% of its resting length.
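The relation between the two mean vtl values and the reported percentage can be checked with a short sketch. It works on the mean values only; the reported standard deviation presumably derives from the individual frame pairs, which are not available here. The function name is hypothetical.

```python
# Sketch: express the maximally elongated vtl relative to the resting vtl.
# Uses only the mean values reported in the text; per-pair data are not available.


def extended_as_percent_of_resting(resting_mm, extended_mm):
    """Return the extended length as a percentage of the resting length."""
    if resting_mm <= 0:
        raise ValueError("resting length must be positive")
    return extended_mm / resting_mm * 100.0


resting_vtl_mm = 295.0   # mean resting vtl from video single-frame pairs
extended_vtl_mm = 434.0  # mean vtl at maximal elongation

percent_of_resting = extended_as_percent_of_resting(resting_vtl_mm, extended_vtl_mm)
gain_mm = extended_vtl_mm - resting_vtl_mm          # absolute elongation in mm
gain_percent = percent_of_resting - 100.0           # elongation beyond resting length

print(f"{percent_of_resting:.0f}% of resting length, "
      f"i.e. +{gain_mm:.0f} mm or +{gain_percent:.0f}%")
```

A value of 147% of the resting length corresponds to an elongation of 47% beyond the resting length.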
Soft palate. In the relaxed state, the rostrocaudal length of the well muscularized soft palate of the rutting male was 75 mm. Manually extended, this length increased to about 100 mm. The relaxed soft palate extended caudally up to the level of the occipital condyles. The intra-pharyngeal ostium turned out to be very elastic and could be extended to such a degree that it became slit-like.
Hyoid apparatus. The hyoid apparatus of the adult male goitred gazelle is composed of five serially arranged paired lateral elements (tympanohyoid, stylohyoid, epihyoid, ceratohyoid and thyrohyoid) and one unpaired ventral element, the basihyoid. The basihyoid connects to the ceratohyoids rostrally and to the thyrohyoids caudally (Fig. 5). Three paired elements of the hyoid apparatus were completely ossified: the stylo-, epi- and ceratohyoids. The ossified elements are flexibly interconnected by small pieces of cartilage and connective tissue. The dorsal end of the flexible cartilaginous tympanohyoid connected to the tympanic bulla, its ventral end to the dorsorostral end of the stylohyoid. The rostrocaudal mobility range of the tympanohyoid was limited by a smooth and flattened, ventrally diverging sector-like depression of about 50° in the lateral wall of the tympanic bulla. The most flexible points of the hyoid apparatus are situated between tympanohyoid and tympanic bulla and between cerato- and basihyoid. The pivotal basihyoid stands out by its pronounced dorsoventral height and by being entirely cartilaginous. Rostrally, lingual muscles insert on it; caudolaterally, sterno- and omohyoid muscles insert on it. Laterally, the insertion of the stylohyoid muscle lies in between. Caudomedially, it serves for insertion of the hyoepiglottic muscle. The thyrohyoid appeared to be relatively long, as it stood on a dorsally directed elevation of the basihyoid. Including this foot, the thyrohyoid length was 70% of the stylohyoid length. Almost the entire caudal half of the thyrohyoid consisted of cartilage (Fig. 5).
Thyrohyoid connection. The caudal end of the thyrohyoid connects to the rostral horn of the thyroid cartilage via the highly resilient thyrohyoid ligament, i.e. there is no thyrohyoid articulation. In the specimen dissected during the rut, the relaxed length of the ligament was about 36 mm. Comparing the resting state of the larynx with its assumed maximally retracted position at the thoracic entrance in the dissection photographs, the length of the maximally extended thyrohyoid ligament in the live animal should range between 200 and 220 mm, suggesting a sevenfold extension. According to our 2D-model of larynx retraction (see below), the length of the thyrohyoid ligament at maximal retraction of the larynx is 210–220 mm, also implying an approximate sevenfold extension.
Larynx. The overall shape of the larynx is dominated by the considerable dorsoventral height of the thyroid cartilage at the level of the laryngeal prominence, caused by formation of a thyroid bulla. The dorsoventral height of the cricoid cartilage is considerably less, thus causing an obtuse angle of about 145° between the ventral contour of the thyroid cartilage and the longitudinal axis of the trachea or, in other words, an oblique entry of the trachea into the larynx (Fig. 6A).
The most spectacular feature of the larynx is a long cartilaginous spine extending caudally from the cricoid arch along the medioventral surface of the trachea. It ends in a rounded anchor-like tip at the level of the 13th tracheal cartilage. No less spectacular is a medioventral cartilaginous plate bearing formed by the 14th to 17th tracheal cartilages. It consists of a small triangular plate that is obliquely elevated in caudal direction and continues into a kind of cartilaginous rod connecting the 14th to 17th tracheal cartilages ventrally (Fig. 6B). That smooth plate is in direct contact with the broadened tip of the spine. The spine is strongly connected to the medioventral surface of the trachea by resilient connective tissue. In addition, there are triangular slips of connective tissue attaching the cricoid spine to the lateral wall of the first five to six tracheal cartilages (Fig. 8A).
Compared to the overall proportions of the arytenoid cartilage, the vocal process is short, measuring 9–10 mm from the ventral end of the corniculate curvature to the tip. On the medial surface of the arytenoid cartilage there is a distinct longitudinal crest about 12 mm in length (Fig. 7A).
The overall length of the larynx in the two dissected adult males, from the tip of the epiglottis to the caudal tip of the spine, was 170 and 132 mm, respectively. The length of the spine, measured from the caudal edge of the cricoid ring, was 58 and 47 mm, respectively.
The rostrocaudal length of the epiglottis, from tip to base, was 36 mm, and its maximal transverse width was 30 mm. The hyoepiglottic muscle originated from the basi- and ceratohyoid and inserted into the ventral half of the lingual surface of the epiglottis. The region between the epiglottis and the arytenoid cartilages, i.e. the laryngeal vestibulum, appeared to be extremely resilient.
The dorsoventral orientation of the vocal fold is perpendicular to the longitudinal axis of the trachea and about 60° to the ventral edge of the thyroid cartilage (Fig. 7). Dorsally, the vocal fold attaches to the vocal process of the arytenoid cartilage, and ventrally to the dorsal surface of the thyroid cartilage, at and mostly caudal to the laryngeal prominence. Vocal fold length, measured in one adult non-rutting male (May) and in one adult rutting male (October) was 32 mm in both. With the exception of the rostral 2–3 mm, the vocal fold does not protrude into the laryngeal lumen as a fold but as a broad and flat mucous membrane elevation fused to the lateral wall of the larynx. The rostral edge of the vocal fold is straight, whereas its caudal edge is convex, the apex pointing towards the centre of the trachea. Rostrocaudal diameters of the vocal fold at the level of the vocal process, the centre of the trachea and the laryngeal prominence, were 16, 18 and 7 mm, respectively. The medial surface of the vocal fold is delicately plicated dorsoventrally.
Within the thyroid bulla, the laryngeal vestibulum, the vocal fold region and the rostral part of the subglottic cavity were supported by a thick, oval connective tissue vocal pad, the short axis of which was oriented dorsoventrally and the long axis rostrocaudally. Length measures were 25 and 35 mm, respectively. The transverse diameter of the vocal pad was about 12 mm. In the vocal fold region, an oval area inside this vocal pad of somewhat tougher consistency could be identified. Its long axis was oriented dorsoventrally and its short axis rostrocaudally. Length measures were 25 and 18 mm, respectively. Ventrally, this central area connected to the thyroid cartilage. Dorsally, it extensively attached to the arytenoid cartilage by virtually enwrapping its short vocal process (Fig. 7B).
The medial surface of the vocal pad adjacent to the mucous membrane was almost flat. In contrast, its lateral surface bulged out considerably, so that the portions of the thyroarytenoid muscle (see below) were forced to take a laterally convex course to surround the vocal pad (Fig. 8). A laryngeal ventricle was lacking.
Origins and insertions of the intrinsic laryngeal muscles are listed in Table 3. The thyroarytenoid muscle is remarkable as it consisted of three separable portions (Fig. 8B). The fibres of the weak rostral portion converged strongly towards their insertion, whereas the fibres of the powerful middle and caudal portions took a parallel course. Between origin and insertion, the fibres curved in a pronounced manner around the laterally bulging vocal pad and thus followed an arched course in the relaxed state.
Table 2. Origin, insertion and basic function of those vocal tract muscles with major relevance for laryngeal mobility in adult male goitred gazelle (Gazella subgutturosa).
Angle of stylohyoid
Pulls angle of stylohyoid dorsocaudally
Symphysis et synchondrosis of mandible
Lateroventral surface of stylohyoid
Along lateral edge of lingua up to tip of tongue
Retracts the tongue
Angle of stylohyoid
Pulls basihyoid dorsally
Lingual surface of epiglottis
Changes shape of epiglottis and pulls it ventrally
Lateral surface of thyroid cartilage
Protracts the larynx
Lateral surface of cricoid cartilage
Protracts the larynx
Deep fascia of the neck, transverse processes C4–C5
Fixes basihyoid and pulls it caudally
Either pulls hyoid caudally or larynx rostrally
Pulls basihyoid caudally
Pulls larynx caudally
Caudoventral edge of auricular base
Contralateral muscle in the throat region
Pulls auricle ventrally, compresses throat region
The trachea. The distance from the caudal edge of the cricoid ring to the sternal manubrium, corresponding to the length of the cervical portion of the trachea in the resting position of the larynx, was about 200 mm. The external diameter of the trachea, from the 5th tracheal cartilage onwards caudally, was uniformly 25 mm. However, the diameter of the first four tracheal cartilages was only 75–80% of that measure, causing a conspicuous constriction of the trachea at the laryngeal/tracheal junction (Fig. 6). In addition, the first four tracheal cartilages appeared to be caudally inclined at an angle of approximately 20° relative to the successive ones. This narrow portion was covered dorsally by the thin, flexible, dorsally concave cartilaginous end of the cricoid lamina. From the 5th tracheal cartilage onwards, the flattened, thin and flexible ends of the tracheal cartilages contacted and overlapped to form a mediodorsal ridge that reduced in height caudally. The ridge is most prominent in the cranial half of the cervical portion of the trachea (5th to 20th tracheal cartilage). The first tracheal cartilage is roughly triangular in shape, with the point directed ventrally.
Musculature. This section is not intended to provide a complete anatomical account of the goitred gazelle’s vocal tract musculature but rather to briefly describe and review the relevant functions of the muscles involved in larynx retraction (see also Table 2). The following information is based on our own dissection of goitred gazelle and on results from domestic ruminants (Nickel et al. 1987).
Table 3. Origin and insertion of the intrinsic laryngeal muscles of the adult male goitred gazelle (Gazella subgutturosa).
Cricoid arch and laterally from broadened rostral end of cricoid spine
Caudal edge of thyroid lamina and ventrolaterally to caudal horn of thyroid cartilage
M. cricoarytenoideus dorsalis
Cricoid lamina, mostly in exception from its thin and flexible, dorsally concave caudal part, dorsal to cricothyroid connection
Dorsocaudal surface of muscular process of arytenoid cartilage
M. cricoarytenoideus lateralis
Dorsorostral half of cricoid arch
Triangular ventrolateral surface of muscular process of arytenoid cartilage
M. arytenoideus transversus
Rostrodorsal edge of muscular process of arytenoid cartilage
Dorsal surface of thyroid cartilage, rostral to thyroid prominence
With strongly converging fibres to the apex of the arytenoid cartilage, rostral to M. arytenoideus transversus
Dorsal surface of thyroid cartilage, rostral part of thyroid prominence
Ventrorostral surface of muscular process of arytenoid cartilage
Dorsal surface of thyroid cartilage, caudal part of thyroid prominence
Ventrocaudal surface of muscular process of arytenoid cartilage
The occipitohyoid muscle pulls the angle of the stylohyoid dorsocaudally and, thus, assists in backward tilting of the hyoid apparatus. The geniohyoid muscle protracts the basihyoid. The styloglossus muscle retracts the tongue. The stylohyoid muscle pulls the basihyoid dorsally. The caudal constrictor muscles of the pharynx (M. thyropharyngeus; M. cricopharyngeus) assist in protraction of the larynx. The hyoepiglottic muscle changes the shape of the epiglottis and pulls it ventrally, thereby increasing the size of the laryngeal entrance. The omohyoid muscle, in ruminants connected to the cervical fascia, fixes the basihyoid and pulls it caudally. The thyrohyoid muscle, depending on the contraction status of other muscles, either pulls the hyoid apparatus caudally or the larynx rostrally. The sternohyoid muscle pulls the basihyoid caudally. The sternothyroid muscle pulls the larynx caudally.
Reconstruction of major components involved in vocal tract elongation, 2D-model of larynx retraction
Based on the results given in the preceding sections, this section presents a graphic 2D-model (Figs 9A–C and 10A–C) and the inferences derived from it. Owing to head raising, contraction of the occipitohyoid muscle, and caudal pull on the thyrohyoid, the hyoid apparatus is somewhat tilted caudally and, synchronously, the angle between ceratohyoid and thyrohyoid increases during larynx retraction. This will push the basihyoid against the skin of the throat region and produce the small, visible prominence. The prominence may disappear during maximal extension of the ventral neck region owing to the counteracting maximal tension of the ventral neck skin. Interconnecting cartilaginous and connective tissue elements of the hyoid apparatus provide the necessary resilience for these conformational changes.
The height of the basihyoid and the high flexibility between the ceratohyoid and the basihyoid suggest an angle lever function. The angle lever consists of one short lever (the elongated basihyoid) and two long levers (the thyrohyoids). The transverse rotational axis runs through the cerato/basihyoid connection. The double long lever rotates caudally during larynx retraction, ultimately caused by contraction of the sternothyroid muscle. Compared to species with a typical, lower basihyoid, re-rotation of the thyrohyoids is improved and stronger support is lent to contraction of the thyrohyoid muscles when returning the larynx to its resting position. Both the caudal cartilaginous parts of the thyrohyoids and the cartilaginous nature of the basihyoid may dampen the effects of abrupt strong contractions of the sternothyroid, thyrohyoid and sternohyoid muscles on the hyoid apparatus.
Compared to the other ruminant species with retractable larynx, the mechanical stress on the hyoid apparatus of male goitred gazelles might be higher because rutting calls can even be emitted at full galloping speed. This may require an improved pull-and-rebound mechanism for the larynx synchronous to elevated locomotory work of the body musculature.
Following the caudoventral movement of the basihyoid at maximal retraction of the larynx, the tongue elongates correspondingly and its torus decreases in height. This flattening of the tongue is supported by raising of the head and lowering of the jaw.
For oral emission of the rutting calls, muscles around the intrapharyngeal ostium contract and the soft palate touches the dorsal pharyngeal wall, sealing off the nasal vt so that the phonatory air stream is exclusively guided through the oral vt. At maximal retraction, the larynx has descended caudally up to the thoracic entrance. This will briefly disrupt the contact between soft palate and epiglottis.
Presumably, pronounced retraction of the larynx will reduce the angle between the ventral contour of the thyroid cartilage and the trachea by 5–10°, leading to a zone of major bending stress at the rostral and caudal cricoid connections. Rostrally, bending stress is reduced by a rotational movement in the cricothyroid articulation. Dorsal flexion of the trachea will tend to rotate the cricoid arch away from the thyroid cartilage, thus extending the cricothyroid ligament. This movement in the sagittal plane suggests a relaxation of the vocal folds at vocalization that is in accordance with the acoustical results (see Discussion). Caudodorsally, a certain relief from bending stress is provided by the oblique entry of the trachea into the larynx and by the thin and flexible caudal part of the cricoid lamina, as it can easily yield dorsally. Possibly, the dorsal flexion of the trachea is also facilitated by the flexible, overlapping dorsal ends of the tracheal cartilages. At the same time, the cricoid spine, which is firmly fixed to the trachea by connective tissue, will prevent excessive bending of the trachea caudoventrally. When the larynx reaches a point close to the sternum and the trachea becomes dorsally flexed, this will require a certain rostrocaudal sliding, which is supported and guided by a plate bearing along four tracheal cartilages for the tip of the cricoid spine. In addition to dorsal bending at maximal retraction of the larynx, the trachea will be pushed caudally into the thorax and between the lungs by about 150 mm. This movement is accompanied by a powerful exhalation, at low ambient temperatures visible as a cloud of condensing water vapour emitted simultaneously with the call.
The other ruminant species with retractable larynx have a straight or less oblique entry of the trachea into the larynx. Possibly, they achieve the necessary bending at maximal retraction of the larynx by a different mechanism, e.g. by an increased flexibility at the rostral end of the trachea. This could be achieved by slender, tightly spaced and obliquely oriented tracheal cartilages, as observed in the Mongolian gazelle (Frey & Riede, 2003).
Relative positions of hyoid and laryngeal prominence at maximal extension of the ventral neck region suggest a roughly sevenfold extension of the thyrohyoid ligament at maximal retraction of the larynx. In addition, similar pronounced extensions are postulated for the pharynx, the thyrohyoid membrane, and for longitudinally coursing blood and lymph vessels rostral to the larynx. Nerves of this region would have to adapt to this extreme resilience by permanent elongation and an undulating course.
Extreme mobility of the larynx is effected by concerted action of the vocal tract muscles. The main retractor is the sternothyroid muscle. To achieve maximal retraction of the larynx, this muscle has to contract considerably, entailing a length reduction of roughly 60%. At the same time, the caudal constrictors of the pharynx, the thyrohyoid muscle and the hyoepiglottic muscle experience maximal extension, 60, 65 and 85%, respectively. In other words: when returning the larynx to its resting position, these muscles have to reduce their length accordingly. Length changes of the omohyoid muscle are less dramatic, roughly 35%. This muscle is extended during the retraction phase. Remarkably, the sternohyoid muscle, the initial portion of which is fused to the sternothyroid muscle, is extended during the contraction phase of the sternothyroid muscle and contracts during the extension phase of this muscle. Length changes of the sternohyoid muscle are roughly 20%. Like the occipitohyoid muscle, the stylohyoid muscle contracts in the retraction phase of the larynx. It appears to control dorsoventral excursions of the basihyoid by length changes of roughly 25%. This action might be assisted by the parotidoauricularis muscle, which also contracts during the retraction phase of the larynx, visibly indicated by backward inclination of the auricles. The sternomandibularis muscle is extended in the retraction phase of the larynx by about 18%. It is supposed to protect and stabilize the vocal tract laterally, particularly during fast locomotion (for superficial, middle and deep aspects of the reconstruction see Figs 9A–C and 10A–C).
Compared to domestic bovids of similar body size, such as sheep and goats (Nickel et al. 1987), the larynx of the male goitred gazelle has been evolutionarily enlarged. It occupies a low-resting position in the mid-neck region and can be additionally retracted by about 150 mm, i.e. by 47% of the vt resting length, during male rutting calls. The resting length of the vt was 295 mm and the maximally elongated vtl was 434 mm. The latter value coincides well with a maximal vtl of 450–458 mm, calculated on the basis of formant dispersion in the rutting calls. A decisive feature for the highly mobile male larynx is the ligamentous thyrohyoid connection and an extremely resilient thyrohyoid ligament, capable of reversible length changes of seven to eight times during maximal laryngeal retraction and subsequent laryngeal ascent. In addition, the pharynx and its accompanying structures must be of comparable resilience to adjust their momentary length to the variable positions of the larynx. The small, intermittently visible, and the large permanent prominence in the ventral neck region were identified as being produced by the most ventral part of the hyoid apparatus (basihyoid) and the larynx, respectively. Our 2D model suggests a caudal tilting of the hyoid apparatus synchronous to larynx retraction and a rostral tilting during the return of the larynx to its resting position. The model further suggests a caudal pushing of the trachea towards the mediastinum by about 150 mm at maximal retraction of the larynx. Eventually, this movement is facilitated by a strong exhalation necessary for emitting a roar. Compression forces on the trachea might be absorbed by decreasing distances between the tracheal cartilages and by the resilience of the lungs. Vocal tract elongation in the goitred gazelle is less pronounced than in red deer (100% of resting vt length; Fitch & Reby, 2001) and fallow deer (52%; McElligott et al. 2006), but more pronounced than in the Mongolian gazelle (30%; Frey et al. 2008b). Notwithstanding this, the larynx of the male goitred gazelle is retracted down to the thoracic aperture, as in red deer. Interspecific differences of vt elongation might result from differing resting vt lengths.
A low-resting position of the larynx was previously believed to be unique to humans. However, with the inclusion of the goitred gazelle, this feature has now been found in the males of four polygynous ruminant species (Fitch & Reby, 2001; McElligott et al. 2006; Frey et al. 2008a). A low-resting position of the larynx has also been documented in both sexes of large felid species (Weissengruber et al. 2002) and most probably also in both sexes of the koala (Sonntag, 1921; Unwin, 2004), but the current study will not deal with these species. As larynx size and vocal tract length are sexually dimorphic both in the four ruminant species and in humans, a lowered resting position of the larynx in human males and in the males of these four species may have evolved independently by similar sexual selection pressures, namely male–male competition and female choice. The evolution of this sexual dimorphism of the human vt may, in fact, have preceded the evolution of speech, perhaps as some sort of preadaptation to it (Fitch, 2000a, 2002; Fitch & Reby, 2001). This would contradict the hypothesis of speech development being the selection pressure for the evolution of the descended larynx in humans (Lieberman, 1973, 1984; Davidson, 2003).
Supposedly, the evolutionary descent of the larynx started from a preadaptive plateau based on naturally occurring slight interindividual variations in larynx position and on interindividually differing slight retractions of the undescended larynx during vocalization. These slight momentary descents retracted the epiglottis and the entrance of the larynx from their ‘intra-narial’ respiratory position to guarantee a complete emission of the phonatory air stream through the oral vocal tract (Fitch, 2000b). These slightly elongated vocal tracts resulted in acoustical exaggeration of body size and may have been more effective for deterring rival males and attracting receptive females. This first round of evolutionary transformation among males favoured rapid evolution of a low-resting position of the larynx in the mid-neck region. In this position, the lock-up between soft palate and epiglottis for respiration via the nose can be retained by evolution of an elongated soft palate, as in red and fallow deer, or an enlarged epiglottis, as in Mongolian gazelle, or a combination of a moderately elongated soft palate, a resilient laryngeal vestibulum and a larynx position, in which about two thirds of the larynx are located rostral to the laryngeal prominence, as in the male goitred gazelle.
Naturally occurring variation of the resilience of the thyrohyoid ligament and of the pharyngeal wall in combination with the presence of strap muscles connecting the hyoid apparatus, the larynx and the sternum to each other, seems to have allowed the evolution of pronounced intermittent short-time retraction of the larynx. This extended the evolutionary transformation from long-term permanent anatomical vocal tract elongation by descent of the larynx to short-term reversible physiological vocal tract elongation by muscular action.
Ultimately, lower resting position and pronounced temporary larynx retraction had direct positive effects on the reproductive success of those males equipped with these features (Reby & McComb, 2003a; Briefer et al. 2010). We may conclude, therefore, that strong sexual selection drove permanent laryngeal descent and momentary vt elongation by laryngeal retraction, up to a point where further vt elongation was counterselected by natural selection. As a consequence, all males were forced to evolve a similar degree of anatomical vt transformation and the acoustic features changed accordingly.
As in other male ruminants with a retractable larynx (Fitch & Reby, 2001), the outstanding acoustic effect of larynx retraction in the rutting calls of male goitred gazelles consists in a lowering of the formants (Fig. 1). Therefore, we propose that lowered formants, as in the aforementioned species, may be used for acoustic exaggeration of the caller's body size (Fitch & Reby, 2001; Reby & McComb, 2003a; McElligott et al. 2006). This could be advantageous, as considerable parts of the rut displays occur in darkness or at twilight, when it is difficult to estimate the actual body size of a caller visually. Apart from retraction of the larynx and concomitant vt elongation, the enlarged larynx with thicker vocal folds comprising large connective tissue vocal pads may generate a noisy structure of the roars and, thus, allow a very clear accentuation of the formants. However, this is coupled to a low amplitude of the emitted vocalizations. A poor propagation rate of male rutting calls, produced by a strongly enlarged larynx, has also been documented for Mongolian gazelles, which, similarly to male goitred gazelles, approach a recipient to perform their rutting displays at close range (Frey et al. 2008a). Possibly, high amplitudes would require greater subglottal pressures, which are likely to increase f0. Consequently, the low f0 and pulsed nature of the calls of Mongolian and goitred gazelles may be hard to produce at higher amplitudes. From an evolutionary perspective, sexual selection towards the acoustic expression of male body size as inferred for the rutting roars of goitred gazelles resulted in maximum accentuation of lowered formants, similar to red and fallow deer stags (Fitch & Reby, 2001; McElligott et al. 2006; Briefer et al. 2010).
Uneven distances between formants and a closer spacing of the first two formants, as ascertained in this study for the rutting roars of adult male goitred gazelles, have also been reported for adult male red deer (Reby & McComb, 2003a,b), fallow deer (McElligott et al. 2006) and saigas (Saiga tatarica; Frey et al. 2007). We can reasonably assume, therefore, that formants may shift due to non-uniformities of the vt, as was suggested for the alarm calls of Diana monkeys (Cercopithecus diana) by applying a computational model involving acoustic and anatomical data (Riede et al. 2005).
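Why uneven formant spacing points to non-uniformities of the vt can be illustrated by a minimal sketch of the idealized reference case. It assumes a uniform tube closed at the glottis and open at the lips, for which the i-th resonance is (2i − 1)c / (4 × vtl), so that all neighbouring resonances are equally spaced. The speed of sound of 350 m/s and the function name are assumptions made for illustration.

```python
# Sketch: resonances of an idealized uniform tube, closed at one end and open at the other.
# Assumption: speed of sound of 350 m/s (not given in this section).
# In such a tube, neighbouring formants are equally spaced; deviations from this
# pattern in real calls are consistent with a non-uniform vocal tract.


def uniform_tube_formants_hz(vtl_mm, n_formants, c=350.0):
    """Return the first n_formants resonance frequencies (Hz) for a tube of vtl_mm."""
    if vtl_mm <= 0 or n_formants < 1:
        raise ValueError("vtl must be positive and n_formants at least 1")
    length_m = vtl_mm / 1000.0
    return [(2 * i - 1) * c / (4.0 * length_m) for i in range(1, n_formants + 1)]


# vtl of 450 mm, as estimated from the formant dispersion of growls
formants = uniform_tube_formants_hz(450.0, 9)
spacings = [upper - lower for lower, upper in zip(formants, formants[1:])]

for number, frequency in enumerate(formants, start=1):
    print(f"F{number}: {frequency:.0f} Hz")
print("spacings (Hz):", [round(s) for s in spacings])
```

The predicted values can be compared with the measured formants in Table 1; the measured spacings are uneven, whereas those of the idealized tube are constant.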
While formants may provide cues to body size in goitred gazelles, both to potential mates and to rivals, the laryngeal f0 may provide cues to the caller’s hormonal status. Compared to formant frequencies, the fundamental frequency of mammalian calls is not as strictly related to body size, because larynx size may increase independently from the rest of the body (Fitch & Hauser, 2002). Moreover, a wide range of vocal tuning via more or less tensed vocal folds may occur (e.g. Riede & Titze, 2008). Interestingly, female humans prefer men’s voices low in f0, although this does not correlate directly to male physical condition (e.g. Collins, 2000), and low f0 enhances human male reproductive success (Apicella et al. 2007; Apicella & Feinberg, 2009). Female preferences for a low fundamental frequency, i.e. for males with larger larynges, may relate to a higher androgen status of a caller. The size of the vocal folds within the larynx is under strict control of androgens in human males (King et al. 2001; Puts, 2005) and fundamental frequency negatively correlates with androgen levels in men (Dabbs & Mallinger, 1999; Evans et al. 2008). In fallow deer bucks, the f0 of rutting groans increases with ageing, probably as the result of decreasing androgen status, and shows negative correlation with male quality, determined via dominance rank and number of matings (Vannoni & McElligott, 2008; Briefer et al. 2010). In male Mongolian gazelles, the size of the larynx increases prior to the rut compared to the non-rutting period (Frey et al. 2008b). In two adult male goitred gazelles, kept in enclosures, repeated measurements of the larynx in the non-rutting and rutting periods revealed a 50% size increase during the rutting period (our unpublished data). Laryngeal tissues are highly receptive to steroid hormones, especially to androgens in males (Tuohimaa et al. 1981; Aufdemorte et al. 1983; Newman et al. 2000). 
Therefore, the seasonal size increase of the larynx in gazelles (Blank, 1998; our unpublished data) could result from a mass increase of the vocal musculature and an elevated tumescence of the ventral neck region under raised levels of male sexual hormones.
The sexually dimorphic descent of the larynx imposes costs on male humans, namely the danger of choking and suffocation as a consequence of the enlarged gap between soft palate and epiglottis in the resting position of the larynx (Baker et al. 1992; Davidson, 2003). In contrast, results in red and fallow deer (Fitch & Reby, 2001 and own dissections), in the Mongolian gazelle (Frey & Gebler, 2003; Frey et al. 2008a,b) and our 2D model of larynx retraction in the male goitred gazelle suggest that, in these ruminant species, the contact between the soft palate and the epiglottis is maintained while the larynx is suspended in the resting position. Therefore, a coherent hypothesis for the position of the epiglottis in all four ruminant species capable of larynx retraction emerges: the epiglottis is retained within the intrapharyngeal ostium while the larynx is in the resting position, but the contact between epiglottis and soft palate is lost while the larynx is pronouncedly retracted.
Rutting roars of male goitred gazelles do not show a clearly visible fundamental frequency but represent noisy, atonal calls. The vocal fold length of 32 mm, ascertained in male goitred gazelles, corresponds to that of 4-year-old male Rocky Mountain elk (Cervus elaphus nelsoni), which have a seven- to eightfold greater body mass (Kingswood & Blank, 1996; Bender et al. 2003; Riede & Titze, 2008). Considering this vocal fold length, a simple string formula predicts a fundamental frequency lower than 50 Hz (Riede & Titze, 2008). However, vocal fold mass, elasticity and tension also influence f0, and this influence is difficult to account for. We therefore assume that the weak, widely spaced pulsation of 22 Hz, observed in many roars of rutting male goitred gazelles, represents their fundamental frequency. Among subspecies, male red deer show a great variation in fundamental frequency, ranging from 40 Hz to more than 1000 Hz (Reby & McComb, 2003a,b; Feighny et al. 2006; Kidjo et al. 2008; Riede & Titze, 2008). The lower hearing thresholds of other ruminant species extend to frequencies close to or in the range of the pulse rate of goitred gazelle rutting roars (Heffner & Heffner, 1983, 1990, 2010; Flydal et al. 2001). The large, inflated tympanic bullae of goitred gazelles (Kingswood & Blank, 1996 and own observations) also point to good low-frequency hearing (Moore, 1981; Geisler, 1998; Huang et al. 2002).
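As a rough illustration of the string formula mentioned above, the ideal-string relation f0 = (1/2L) · sqrt(σ/ρ) can be evaluated for a vocal fold length of 32 mm. The following is a minimal sketch and not the calculation performed by Riede & Titze (2008): the tissue density of 1020 kg/m³ and the function names are assumptions introduced here, and the stress values are those implied by the formula, not measurements.

```python
import math

# Assumed soft-tissue density (kg/m^3); not a value reported in this study.
TISSUE_DENSITY_KG_M3 = 1020.0


def string_f0(length_m, stress_pa, density=TISSUE_DENSITY_KG_M3):
    """Fundamental frequency (Hz) of an ideal string of given length and stress."""
    return math.sqrt(stress_pa / density) / (2.0 * length_m)


def stress_for_f0(length_m, f0_hz, density=TISSUE_DENSITY_KG_M3):
    """Tissue stress (Pa) that the ideal-string formula requires for a given f0."""
    return density * (2.0 * length_m * f0_hz) ** 2


VOCAL_FOLD_LENGTH_M = 0.032  # 32 mm, as ascertained in male goitred gazelles

for f0 in (50.0, 22.0):
    stress_kpa = stress_for_f0(VOCAL_FOLD_LENGTH_M, f0) / 1000.0
    print(f"f0 = {f0:.0f} Hz requires a stress of about {stress_kpa:.1f} kPa")
```

Under these assumptions, the formula yields an f0 below 50 Hz only for tissue stress below roughly 10 kPa, and a pulse rate of 22 Hz corresponds to about 2 kPa. The sketch ignores vocal fold mass and layered tissue structure, which is why it can only bracket plausible values.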
The concordance of the pulse rate of the rutting roars with the pulse rate of the sound produced in the excised larynx experiment suggests that these pulses are evoked by vibration of the relaxed voluminous vocal folds. Moreover, the pulse rate of the sound produced by the excised larynx could be changed depending on the strength of air blown through it, implying its dependence on subglottal air pressure. This is reminiscent of the relationships existing in the pulsed phonation of humans and non-humans, where the fundamental frequency is in direct proportion to subglottal pressure (Riede & Zuberbühler, 2003). A low pulse frequency during vocalization is further suggested by the large connective tissue pads, laterally attached to the vocal folds. Attachments of the tough central portion of the connective tissue pad suggest that this represents the homologue of the vocal ligament that has considerably increased in thickness. Its extensive attachment to the vocal process further suggests the involvement of considerable forces in controlling the dorsoventral tension and the relative mediolateral position of the cushion-like, thick and heavy vocal folds.
The pulse rate of growls (68 Hz) is probably produced with tensed vocal folds. Compared to roars, the higher f0 of growls may result from motivational state and may reflect higher stages of arousal of rutting male goitred gazelles. Consistently, in male baboons with a high dominance status, calls are higher in f0 (Fischer et al. 2004). However, the f0 of rutting groans in fallow deer bucks becomes higher with ageing, when dominance rank and number of matings decrease (Vannoni & McElligott, 2008; Briefer et al. 2010). In red deer stags, reproductive success correlates with the minimum f0 of their rutting calls, but not with mean or maximum f0 (Reby & McComb, 2003a).
There is no close phylogenetic relationship between cervids and bovids, and the majority of both cervid and bovid species neither possess a descended larynx nor are capable of pronounced larynx retraction. Moreover, the morphological features involved in male vt transformation differ considerably between the four species concerned. Considering this situation, we may infer that among polygynous ruminants, strong sexual selection for vocal size exaggeration and for hormone-related cues to male quality, encoded as low formants and low fundamental frequency, elicited parallel or convergent evolution of vocal anatomy in different taxa. In other words, differing morphological adaptations have evolved to achieve basically similar acoustic functions. The most spectacular example among bovids in this respect is the saiga, in which the male rutting calls are emitted via the nose, which is more strongly enlarged than in females, and vt elongation is not effected by larynx retraction but by stereotyped dynamic extension of the trunk-like nasal vestibulum (Frey et al. 2007).
Compared to male Mongolian and male goitred gazelles, which exhibit a strongly enlarged larynx [male/female larynx size ratio in Mongolian gazelle: about 2/1, Frey & Riede, 2003; in goitred gazelle (preliminary): 1.65/1], the larynx of red deer stags is not enlarged (male/female larynx size ratio: 1.2/1, Frey & Riede, 2003; Riede & Titze, 2008). Indeed, the small male/female size dimorphism of the larynx in red deer almost equals that of other cervid species incapable of any impressive vocal behaviour, e.g. mule deer (Odocoileus hemionus; male/female larynx size ratio: 1.14/1; Riede et al. 2010). The strongly enlarged larynx of the males of Mongolian and goitred gazelles is utilized for the production of low-amplitude close-range calls. In contrast, the relatively small larynx of red deer stags can produce extremely loud, far-ranging calls of a much higher propagation rate in comparison with both gazelle species. In addition to acoustic cues, a large larynx may represent a visual cue to male quality in the two gazelle species. In male Mongolian gazelles, the enlarged larynx is additionally accented by contrastive fur coloration during the rut (Frey et al. 2008a,b). Reasonably assuming comparably high androgen levels in red deer stags and other polygynous ruminants during the rut, the prominent differences in larynx size may result from variation in the density of androgen receptors between species.
Rutting male goitred gazelles have a permanently descended, enlarged larynx located in the mid-neck region. For emission of their rutting roars, males retract the larynx down to the thoracic aperture, entailing a momentary vt elongation by almost 150%. Rutting male goitred gazelles produce three types of close-range vocalizations, roars, growls and grunts, all involved in courtship display. Caudal movement of the larynx during the initial phase of a roar, as in red deer, is reflected acoustically in descending formants (Reby & McComb, 2003a). Rutting roars are short harsh vocalizations and, in contrast to red deer roars, mostly emitted during all modes of locomotion. The fundamental frequency of roars corresponded to a low pulse rate of around 22 Hz and that of growls to a pulse rate of around 68 Hz, whereas grunts did not involve pulses. To accommodate the pronounced momentary changes of larynx position, the pharynx, including the soft palate, the laryngeal vestibulum and the ligamentous thyrohyoid connection are very elastic. As distinguished from the other ruminant species with a retractable larynx, the increased relative length of the basihyoid probably improves the leverage for the attaching lingual, hyoid and strap muscles, which are involved in laryngeal mobility. The larynx has a conspicuous thyroid bulla and, as the vocal folds are located within this thyroid bulla, their dorsoventral length is enlarged. Laterally, the vocal folds are supported by a voluminous, roughly hemispheric connective tissue pad bulging out the relaxed thyroarytenoid muscle laterally. As in other species possessing such a vocal pad, a mass increase and a decrease of f0 can be inferred. Unique to goitred gazelles, the long cartilaginous spine, extending caudally from the cricoid arch, and its plate bearing on the ventral side of the trachea appear to prevent excessive bending of the rostral end of the trachea at maximal retraction of the larynx.
A 2D-model of the interplay of the major components involved in the movements of the larynx in male goitred gazelles suggests a mechanism basically similar to that hypothesized for Mongolian gazelle (Frey et al. 2008a).
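The acoustic consequence of larynx retraction can be illustrated with the standard quarter-wavelength approximation, in which the vocal tract is treated as a uniform tube closed at the glottis and open at the mouth, so that F_i = (2i − 1) · c / (4L). The following is a minimal sketch under stated assumptions: the resting vt length of 0.20 m is a hypothetical placeholder and not a measurement from this study, the speed of sound of 350 m/s is an assumed value for warm, humid air, and the 'elongation by almost 150%' is read literally as an added length of 1.5 times the resting length.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air in the vocal tract


def tube_formants(vt_length_m, n_formants=4, c=SPEED_OF_SOUND_M_S):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [(2 * i - 1) * c / (4.0 * vt_length_m) for i in range(1, n_formants + 1)]


RESTING_VT_LENGTH_M = 0.20  # hypothetical placeholder, not a measured value
ELONGATION_FRACTION = 1.5   # literal reading of "by almost 150%" (assumption)

retracted_vt_length_m = RESTING_VT_LENGTH_M * (1.0 + ELONGATION_FRACTION)

resting = tube_formants(RESTING_VT_LENGTH_M)
retracted = tube_formants(retracted_vt_length_m)

for i, (f_rest, f_retr) in enumerate(zip(resting, retracted), start=1):
    print(f"F{i}: {f_rest:.1f} Hz at rest, {f_retr:.1f} Hz at maximal retraction")
```

Because every resonance scales with 1/L in this approximation, all formants fall to 40% of their resting values under this reading, whatever resting length is chosen. A real vocal tract is not a uniform tube, so the sketch indicates only the direction and approximate magnitude of the formant lowering.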
We would like to thank the staff of the Ecocenter ‘Djeiran’ (Bukhara, Republic of Uzbekistan) for their help and support. We are sincerely grateful to Kseniya Efremova for most helpful collaboration in the field and provision of literature. Furthermore, we are deeply grateful to two anonymous reviewers for their constructive criticism and helpful comments. During our work, we adhered to the ‘Guidelines for the treatment of animals in behavioural research and teaching’ (Anim Behav 2006, 71, 245–253) and to the laws of Germany, the Republic of Uzbekistan and the Russian Federation, the countries where the research was conducted. No animal suffered, and their social structure was not destroyed as a result of the behavioural observations and recordings. This study was supported by the Russian Foundation for Basic Research, grant 09-04-00416 (for I.V. and E.V.).