Testes size, testosterone production and reproductive behaviour in a natural mammalian mating system


Correspondence author. E-mail: brian.preston@u-bourgogne.fr


1. Testosterone (T) is a key mediator in the expression of numerous morphological and behavioural traits in mammals, but the factors underlying individual variation in circulating T levels are poorly understood.

2. The intimate structural integration of sperm and T production within the testes, alongside the dependency of sperm production on high levels of T, suggests that T requirements for spermatogenesis could be an important driver of individual differences in T.

3. To test this hypothesis, we examine how male capacity for sperm production (as indicated by their testes size) is associated with T levels in a feral population of Soay sheep, resident on St. Kilda, Scotland, during their rutting season.

4. We found a strong positive relationship between an individual’s testes size (as measured before their seasonal enlargement) and the levels of circulating T during their rut, suggesting that T requirements for spermatogenesis have a prominent influence on the production of this androgen.

5. In contrast, body condition and competitive ability did not independently predict T levels, findings that are inconsistent with conventional ‘condition-dependent’ and ‘challenge’ hypotheses of T production.

6. This influence of a male’s capacity for sperm production on T appeared to be substantial enough to be biologically relevant, as testes size also predicted male aggression and mate-seeking behaviour.

7. Our results suggest that a male’s inherent capacity for sperm and T production is tightly phenotypically integrated, with potential consequences for a wide range of other T-mediated reproductive traits.


Testosterone (T) is ubiquitous among male vertebrates and plays a central role in the expression of numerous sexually selected traits. For example, T controls seasonal changes in musculature (Lincoln 1998; Sheffield-Moore & Urban 2004; Huyghe, Husak et al. 2010), the size of some sexual ornaments (Zuk, Johnsen & Maclarty 1995), the seasonal strengthening of weaponry such as horns and antlers (Lincoln, Guinness & Short 1972; Lincoln 1998; Malo et al. 2009), and is strongly associated with reproductive aggression and sexual behaviour (Lincoln, Guinness & Short 1972; Malmnäs 1977; Wingfield et al. 1990). Given this widespread impact of T on traits associated with reproductive competition (Andersson 1994), individual variation in T production appears likely to be a key factor underpinning differential reproductive success in male vertebrates, and so a clear understanding of how and why males vary in their T levels is of importance.

In vertebrates, around 95% of the body’s T is produced by Leydig cells that are located within the interstitial compartments of the testes, intimately associated with the seminiferous tubules (Thibault, Levasseur & Hunter 1993). This structural integration of sperm and T production is consistent with the highly T-dependent nature of spermatogenesis, as intratesticular levels of T that are some 20–50 times greater than those found in the bloodstream are required for the production of viable sperm (McLachlan et al. 1995). Although Leydig cells are only a minor component of testes volume compared with spermatogenic tissue (as little as 1% in domestic rams, Ovis aries; Leal et al. 2004), a number of studies on domestic species (e.g. bulls, Bos taurus; Gábor et al. 1995), and a small number of studies on wild species (e.g. mallards, Anas platyrhynchos, and red deer, Cervus elaphus; Denk & Kempenaers 2006; Malo et al. 2009), have reported significant positive associations between measures of testes size and T titres. As the size of testes directly reflects the quantity of spermatogenic tissue they contain, and so the maximum rate at which they are able to produce sperm (Knight 1977; Raadsma & Edey 1984; Lincoln 1989; Møller 1989), these findings suggest that a male’s inherent capacity for sperm production could be an important driver of T levels in their peripheral blood. Such a relationship would seem likely to have significant consequences for the expression of T-dependent behavioural and morphological traits.

While the abovementioned studies point towards this phenotypic integration of an individual’s capacity for sperm and T production, they must be viewed with caution for several reasons. First, the overwhelming majority of studies available do not account for an alternative theoretical expectation that T production will be dependent on the condition of individuals, with only males that are capable of meeting the energetic requirements of producing ‘costly’ T-mediated sexual ornaments being able to do so (Andersson 1994; Duckworth, Mendonca & Hill 2001; Perez-Rodriguez et al. 2006). Secondly, studies on animals in captivity can be misleading, due in part to a lack of sexual interactions or agonistic ‘challenges’ from competitors that can profoundly increase the levels of T in circulation (Wingfield et al. 1990). In more natural circumstances, T levels may be more heavily influenced by the frequency with which males associate with females (Dloniak, French & Holekamp 2006), or engage in competitive interactions with rival males (Buck & Barnes 2003; Pelletier, Bauman & Festa-Bianchet 2003), and be further affected by whether males ‘win’ any given interaction (the ‘winner effect’; Oyegbile & Marler 2005; Oliveira, Silva & Canario 2009). Thus, attributes of males that are of importance in determining their ability to challenge for and fend off challenges for reproductive access to females, such as larger body size or weaponry, may be more important morphological drivers of male T secretion in natural circumstances. Finally, relationships that have been reported between testes size and T production are typically assumed to arise from the concurrent seasonal activation of both T and sperm production that occurs on the approach to the mating season, which is associated with an increase in testes size (e.g. Lincoln & Davidson 1977; Goeritz et al. 2003). All three responses are triggered by the secretion of gonadotropin releasing hormone (GnRH) from the hypothalamus, which stimulates the anterior pituitary to release follicle stimulating hormone (FSH), responsible for activating sperm production and causing the testes to enlarge, and luteinising hormone, which stimulates the production of T (Lincoln 1989; Thibault, Levasseur & Hunter 1993). As the increase in both testes size and T production is ultimately under the control of GnRH secretion, any association between testes size and T production may instead be due to differences in the production of GnRH by the hypothalamus (Thibault, Levasseur & Hunter 1993; Uglem, Mayer & Rosenqvist 2002; Denk & Kempenaers 2006), and so the ‘activation state’ of their testes, rather than being an intrinsic characteristic of the testes themselves.

The equivocal state of the evidence regarding the influence of male capacity for sperm production on T secretion is reflected in the conclusions of recent intraspecific and comparative investigations that report positive correlations between testes size and T titres, with some authors inferring that the link is a consequence of the phenotypic integration of spermatogenic and endocrine functions of the testes (Garamszegi, Eens et al. 2005; Malo, Roldan et al. 2009), while most question or reject causality (Emerson 1997; Uglem, Mayer & Rosenqvist 2002; Dixson & Anderson 2004; Denk & Kempenaers 2006).

In this report, we examine the morphological correlates of T secretion and behaviour in promiscuously mating feral Soay sheep, which have ranged freely on the isolated archipelago of St. Kilda, Scotland, for at least a millennium (Clutton-Brock & Pemberton 2004). Physical contests between rams for access to females during the mating season (rut) can be fierce, during which males often butt the flanks of rivals or engage in head-on clashes (Preston et al. 2001). Physical size is important in these contests, as males of large body and horn size are better able to temporarily monopolize reproductive access to receptive females (Preston et al. 2001, 2003). Soay ewes are highly promiscuous, however, copulating with up to 10 different consort males in a day (Stevenson, Marrow et al. 2004; Preston et al. 2005). As a consequence, when females produce twins, they are usually sired by different males (Pemberton, Coltman et al. 1999). Under these conditions of intense sperm competition (Parker 1970), high sperm production rates are thought to be advantageous, as the probability of ‘winning’ paternities is proportional to the number of sperm each competitor inseminates – the ‘lottery effect’ (Beatty 1960; Parker 1982; Dziuk 1996). In accordance with this theory, Soay rams have evolved testes that are approximately four times heavier than would be predicted for their body mass (Stevenson 1994). Furthermore, males possessing larger testes, and higher sperm production rates (Knight 1977; Raadsma & Edey 1984; Lincoln 1989), are able to gain a greater share of paternities towards the peak of the rut, when physical competition for access to females is reduced (Preston et al. 2003).
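The ‘lottery effect’ described above can be illustrated with a minimal sketch. It assumes, purely for illustration, a fair raffle in which every sperm has an equal chance of fertilising, so that a male’s expected paternity share is his sperm number divided by the total inseminated by all competitors. The function name and the example numbers are hypothetical and are not taken from the study.

```python
from typing import List


def paternity_shares(sperm_numbers: List[float]) -> List[float]:
    """Expected paternity share of each male under a fair raffle.

    Illustrative assumption: every sperm is equally likely to fertilise,
    so each male's share is s_i / sum(s).
    """
    total = sum(sperm_numbers)
    if total <= 0:
        raise ValueError("at least one male must inseminate sperm")
    return [s / total for s in sperm_numbers]


# Hypothetical example: male A inseminates three times as many sperm as male B.
shares = paternity_shares([3.0, 1.0])
print(shares)  # [0.75, 0.25]
```

Under this simplification, any increase in a male’s sperm number raises his expected share, which is the sense in which high sperm production rates are thought to be advantageous.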

We tested the hypothesis that an individual’s capacity for sperm and T production is phenotypically integrated and the potential influence of this integration by: (i) assessing the relative importance of testes size, body condition and competitive ability in determining rutting males’ T levels in this free-living population, (ii) examining temporally disparate measurements of testes size and T levels for their concordance, thereby removing the potential for hypothalamic production of GnRH (and thus the activation state of the testes) to drive any relationship between sperm and T production, and (iii) investigating whether any phenotype-dependent variation in T levels was biologically relevant, by examining whether the effect of attributes that could influence T levels, and specifically testes size, was substantial enough to affect levels of aggression and sexual behaviour.

Materials and methods

Soay Sheep and Study Area

A free-ranging population of feral Soay sheep has been resident on the St. Kilda archipelago (57°49′N, 08°34′W) for over a millennium. This investigation focuses on animals that range within the Village Bay area of Hirta, within which around 95% of lambs have been tagged soon after birth since 1985, and are thus of known age. The size of the Hirta population fluctuates between 600 and 2000 sheep, and has a female-biased sex ratio (males/females) that ranges from 0·27 to 0·7. A detailed description of the study area and population can be found elsewhere (Clutton-Brock & Pemberton 2004).

Morphometric Data

Morphometric variables used in the analysis were collected in a 2-week period in August, when a large proportion of the study area population was rounded up, with additional measurements from animals captured during the rutting period. Body mass was measured using a carry net and drop scales. Hind leg length was measured from the tuber calcis of the fibular tarsal bone to the distal end of the metatarsus and is taken here to be an indicator of skeletal size. Horn length was measured from the base, along the outer curvature of the spiral, to the tip. Scrotal circumference was measured at the widest point of the scrotum and directly reflects both testes mass and their maximum capacity for sperm production (Knight 1977; Raadsma & Edey 1984; Lincoln 1989). All measurements were taken to either the nearest mm or 0·1 kg.

Behavioural Observations

Behavioural focal watches were recorded during the rutting season from late October to early December (from 1995 to 2003), beginning well before the first females entered oestrus at the start of November. All our behavioural data come from individually identifiable adult males (≥2 years of age) that were watched repeatedly throughout the rut. Each watch lasted a median of 1 h, producing 15·2 ± 5·6 h (mean ± SD) of behavioural data per male. Psion data loggers with customized software (Sunadal Data Solutions) were used to record male behaviour continuously during focal watches. Behaviours of relevance to this study were the amount of time males spent in aggressive interactions with other males and the time spent in sexual behaviours. Aggressive behaviour in Soay sheep ranges from ‘tongue flicks’, which are associated with low-intensity kicks aimed at rivals, to serious fights, in which males engage in violent head-on clashes (Grubb 1974). Sexual behaviour includes the time males spend searching for oestrous females in addition to the time they spend courting and copulating with them. Mate-seeking behaviour was identified as periods of continuous ‘head-up’ travelling around the island. This is easily differentiated from the movement observed while males graze, which is instead characterized by continuous feeding that is occasionally punctuated with a few seconds of head-up movement. Each day the study area was searched for consort pairs, providing an indicator of the numbers of females in oestrus. The adult operational sex ratio is always male biased, meaning many males are always available to identify females in oestrus (range 4–236 males per oestrous female; Preston et al. 2001).

Testosterone Assay

We collected blood samples from rams of all ages (range: 0–9 years) captured during the rutting season in 1997–2003. Blood was collected by jugular venepuncture using lithium heparin-coated vacuettes, and the plasma was separated and stored at −20 °C within 24 h. Testosterone concentration in plasma samples was measured using a routine radioimmunoassay (Corker & Davidson 1978), as modified for an iodinated tracer (Sharpe & Bartlett 1985). The minimum limit of detection was 0·4 ng mL−1, and the intra- and inter-assay coefficients of variation were <10%.

Statistical Analysis

Behavioural and T data were analysed using linear mixed-effects models in Genstat for Windows eighth edition (Genstat 5 Committee 1993). These models control for the repeated measures on individuals by fitting male identity as a random effect. The minimal model was determined via backward deletion (McCullagh & Nelder 1983), with significance set at P ≤ 0·05. Data underwent appropriate transformation prior to analysis to meet assumptions of normality (see table legends for specific details). For each analysis, age, body size (estimated by hind leg length), body mass, horn length, scrotal circumference, number of females in oestrus and stage of the rut (from the day the first oestrous female was observed) were fitted as explanatory variables. A quadratic term was fitted for age and stage of rut to test for a potentially curvilinear relationship with T and/or rutting behaviour. Each male characteristic was fitted in interaction with stage of rut and the number of oestrous females in the study area, to test for temporal and sociosexual influences on their importance. ‘Year’ was fitted as a categorical random effect. The exclusion probabilities provided were obtained by fitting each previously deleted explanatory term last in the minimal model. Specific details and further explanation of analyses are contained within table or figure legends as appropriate.
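The backward deletion procedure described above can be sketched generically as follows. This is only an outline under stated assumptions, not the Genstat routine used in the analysis: `p_values` is a hypothetical callback standing in for refitting the mixed model and returning a P-value for each term, the example P-values are invented, and the sketch ignores the handling of random effects and of marginality between main effects and their interactions.

```python
from typing import Callable, Dict, List


def backward_deletion(
    terms: List[str],
    p_values: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.05,
) -> List[str]:
    """Drop the least significant term until every remaining term has P <= alpha."""
    current = list(terms)
    while current:
        pvals = p_values(current)  # hypothetical refit of the model on current terms
        worst = max(current, key=lambda term: pvals[term])
        if pvals[worst] <= alpha:
            break  # minimal model reached
        current.remove(worst)
    return current


# Invented P-values for illustration only; a real callback would refit the model.
FAKE_P = {
    "scrotal_circumference": 0.001,
    "day_of_rut": 0.02,
    "horn_length": 0.60,
    "body_mass": 0.40,
}


def fake_fit(current_terms: List[str]) -> Dict[str, float]:
    return {term: FAKE_P[term] for term in current_terms}


minimal = backward_deletion(list(FAKE_P), fake_fit)
print(minimal)  # ['scrotal_circumference', 'day_of_rut']
```

In a real analysis the P-values change each time a term is removed, which is why the model is refitted inside the loop rather than once at the start.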


Testosterone Production

T levels in rams were already high at the onset of the rut, having peaked before the first females were observed in oestrus in the study area, and then declined through the course of the rutting season (Fig. 1a; Table 1). After controlling for these seasonal changes in T levels, the number of oestrous females within the study area was also associated with small increases in the secretion of T (Table 1). After accounting for this temporal and sociosexual variation in T secretion in our model, we examined a suite of male characteristics in this analysis for their correspondence with naturally circulating plasma T levels. This model indicated that T production changes through the lifetime of rams, with T levels increasing in males from birth until they reach full sexual maturity at 4–6 years of age, and declining thereafter (quadratic term in Table 1). The model suggests that, all else being equal, rams in their first rut would have peak T levels that were 76% of those expected for fully-grown rams (estimate for a first-year male = 21·4 ng mL−1, for a 5-year-old = 28·5 ng mL−1). In addition to age-related changes in T production, male testes size during the rut was also found to have a strong positive association with plasma T levels (Table 1; Fig. 1b). At peak levels, the difference in T secretion between males with small (10th percentile) and large (90th percentile) testes was estimated by the model to be 23 ng mL−1, in excess of a two-fold difference (Fig. 1c). No other male traits tested in our model (male horn length, body size and body mass) appeared to have a significant association with T levels in our analyses (P > 0·34).

Figure 1.

Testes size and testosterone levels during the rutting period. The plots show (a) the temporal distribution of male T levels (log(x + 1) transformed) across the rutting period, (b) male T levels (log(x + 1) transformed) in relation to their scrotal circumference and (c) temporal variation in T levels across the rutting period (solid lines) as predicted for males possessing small (10th percentile of males in this sample), medium (50th percentile) or large (90th percentile) testes. All plots control for other terms remaining in the minimal model (see Table 1). The temporal plots include the influence of the average numbers of females in oestrus on each day (as determined by the model), which is represented by the shaded area.

Table 1. Linear mixed-model analysis of male testosterone levels in the rut. The aim of the analyses was to determine which male characteristics influenced their plasma testosterone levels (log(x + 1) transformed). Only 15% of the available data were repeated measurements, and so a single measure for each male was picked at random and included in the analysis; n = 125 males, constant = −0·7392. Year was controlled for as a random factor. All other characteristics tested in this model were excluded with a P > 0·34, when fitted last in the model. See the Materials and methods for further details of the analysis.

| Term | d.f. | Effect | SE | Wald stat (χ²) | P-value |
| --- | --- | --- | --- | --- | --- |
| Day of rut | 1 | −0·0298 | 0·0124 | 5·78 | 0·016 |
| Day of rut² | 1 | −0·00127 | 0·000415 | 9·92 | 0·002 |
| Number of oestrous females | 1 | 0·0238 | 0·0105 | 5·12 | 0·024 |
| Scrotal circumference | 1 | 0·0129 | 0·00314 | 16·87 | <0·001 |
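As a reading aid for the table, the sketch below shows how a Wald statistic and its P-value relate to an effect and its standard error. It assumes that, for a term with a single degree of freedom, the Wald statistic is approximately (effect/SE)² and is referred to a χ² distribution with 1 d.f.; tabulated values can differ slightly because the published effects and standard errors are rounded. The function names are illustrative.

```python
import math


def wald_chi2(effect: float, se: float) -> float:
    """Approximate single-d.f. Wald statistic from an effect and its SE."""
    return (effect / se) ** 2


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 d.f.

    For 1 d.f., P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# 'Day of rut' row: effect = -0.0298, SE = 0.0124
w = wald_chi2(-0.0298, 0.0124)
p = chi2_sf_1df(w)
print(round(w, 2), round(p, 3))  # 5.78 0.016
```

Applied to the scrotal circumference row (effect 0·0129, SE 0·00314), the same calculation gives a statistic close to the tabulated 16·87 and a P-value well below 0·001.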

The size of mammalian testes during the mating period is a product of their potential capacity for sperm production and their degree of seasonal activation. In accordance with this, Soay ram testes increased by an average of 18% on the approach to the rut, although they remained highly correlated with their size prior to the onset of the rutting season (r = 0·761, n = 40, P < 0·001). To estimate the influence of these two components of testes size in predicting male T levels, we examined a simplified model that incorporated their scrotal circumference 76 ± 4 days (mean ± SD) prior to onset of the rut (when T levels would be near their minima; Lincoln & Davidson 1977), alongside their subsequent growth. We found that both components of testes size showed strong and independent positive relationships with male T secretion during the rutting period (Linear mixed model; n = 40 males; pre-rut testes size: coefficient = 0·0344, SE = 0·00371, d.f. = 1, Wald statistic = 86·05, P < 0·001; growth: coefficient = 0·0370, SE = 0·00523, d.f. = 1, Wald statistic = 49·88, P < 0·001; controlling for year as a random effect; Fig. 2a,b). Importantly, scrotal measurements obtained prior to the rut remained highly significant predictors of T levels when fitted alone as a fixed term in this model (Wald statistic = 15·15, P < 0·001), while estimates of their subsequent growth did not (Wald statistic = 0·02, P > 0·8). Together, these analyses suggest that changes in testes size that are associated with the seasonal rise in FSH contribute to observed relationships between testes size and T levels, but are secondary to the influence of a male’s inherent capacity for sperm production.
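The strength of the correlation between pre-rut and rut testes size can be gauged with the standard t-statistic for a Pearson correlation, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The sketch below is illustrative; the text does not state which test produced the reported P-value, so treating it as this t-test is an assumption.

```python
import math


def correlation_t(r: float, n: int) -> float:
    """t-statistic for testing a Pearson correlation against zero (n - 2 d.f.)."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Values reported in the text: r = 0.761, n = 40
t = correlation_t(0.761, 40)
print(round(t, 2))  # 7.23
```

With 38 degrees of freedom, a t-value of this size lies far beyond the two-tailed critical value for P = 0·001 (roughly 3·6), consistent with the reported P < 0·001.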

Figure 2.

 Components of testes size and T levels. The association between the size of male testes and their T levels as measured by (a) their scrotal circumference 11 weeks prior to the rut and (b) the subsequent seasonal growth that occurred on the approach to the rutting period. Each plot controls for the influence of the other component of testes size, as estimated by a linear mixed model (see the main text).

Reproductive Behaviour

Next, we assessed whether this apparent influence of testes size on male T production was also associated with their behaviour during the rut. We first examined levels of aggression, which increased substantially in the 2 weeks prior to the rut and peaked shortly after the first females entered oestrus (Table 2a; Fig. S1a). While controlling for these temporal patterns of aggression, our behavioural model is consistent with the notion that male condition mediated the expression of aggression to some extent, as males in poorer condition tended to be involved in competitive interactions only when receptive females were available (Table 2a; body mass: week of rut interactions). Age-related patterns of testosterone secretion were not matched by corresponding patterns of agonistic behaviour, except for young males (2 years of age), which spent significantly less time engaged in aggressive interactions when compared with males of all other ages (Table 2a). After accounting for these sources of variation in competitive behaviour, male testes size (measured 11 weeks prior to the rut) was the strongest predictor of their aggression (Table 2a; Fig. 3a). This model estimates that males with large testes spent almost 2 h a day in aggressive exchanges during the first week of the rut, more than twice the amount of time that males with small testes devoted to aggressive interactions (Fig. 3b). In contrast to the influence of males’ testes size, their competitive ability (as estimated by their horn length and/or body size; Preston et al. 2003) did not appear to influence their levels of aggression (P > 0·28).

Table 2. Linear mixed model analyses of male reproductive behaviour. The aim of the analyses was to determine which male characteristics influenced their reproductive behaviour. We divided reproductive behaviour into (a) male aggression in each week of the rut; the response variable is the percentage of total watch time (log(x + 1) transformed) for which focal males were engaged in agonistic activity in each week, constant = −0·821, and (b) male mate-seeking behaviour in each week of the rut; the response variable is the percentage of time (sqrt(x + 1) transformed) for which males moved around the study area investigating females for signs of oestrus in each week, constant = 0·206. All other characteristics tested in these models were excluded with a P > 0·2 when fitted last in the model. To preclude any confounding influence of mate-guarding behaviour on our analyses, response variables were restricted to periods in which males were not guarding oestrous females and to males with at least 1 h of behavioural data available in each week: n = 1193 watches, 419 weeks, 57 males. Year was controlled for as a random effect. See Materials and methods for further details of analyses.

(a) Minimal model of ♂ aggression

| Term | d.f. | Effect | SE | Wald stat (χ²) | P-value |
| --- | --- | --- | --- | --- | --- |
| Week of rut | 1 | 0·696 | 0·222 | 9·88 | 0·002 |
| Week of rut² | 1 | −0·289 | 0·0618 | 21·88 | <0·001 |
| Body mass | 1 | −0·0153 | 0·0139 | 1·22 | 0·270 |
| Body mass: week of rut | 1 | −0·0163 | 0·00640 | 6·47 | 0·011 |
| Body mass: week of rut² | 1 | 0·00559 | 0·00182 | 9·47 | 0·002 |
| Scrotal circumference | 1 | 0·0132 | 0·00285 | 21·42 | <0·001 |

(b) Minimal model of ♂ mate-seeking behaviour

| Term | d.f. | Effect | SE | Wald stat (χ²) | P-value |
| --- | --- | --- | --- | --- | --- |
| Week of rut | 1 | −1·475 | 0·685 | 4·64 | 0·031 |
| Week of rut² | 1 | 0·399 | 0·207 | 3·70 | 0·055 |
| Number of oestrous females | 1 | 0·165 | 0·0171 | 92·32 | <0·001 |
| Scrotal circumference | 1 | 0·0108 | 0·00427 | 6·46 | 0·011 |
| Scrotal circumference: week of rut | 1 | 0·00790 | 0·00252 | 9·81 | 0·002 |
| Scrotal circumference: week of rut² | 1 | −0·00199 | 0·000755 | 6·95 | 0·008 |

a. Model inspection indicated the effect of age was due only to low aggression of 2-year-old males, and so aggression was remodelled fitting age as a categorical factor (2 or 3 and above); terms remaining in the minimal model did not change.

b. Mean effect.

Figure 3.

Testes size and male aggression during the rutting period. The plots illustrate (a) the percentage of time during focal watches (log(x + 1) transformed) that males spent engaged in aggressive exchanges as a function of their pre-rutting period scrotal circumference (while controlling for the other influential variables retained in the statistical model; see Table 2a). For illustrative purposes, data are grouped by individual within years (n = 92); each data point represents 12·3 h of behavioural data on average. (b) The temporal distribution of aggressive behaviour (solid lines) as predicted by this model for a 5-year-old male possessing small (10th percentile of males in this sample), medium (50th percentile) or large (90th percentile) testes. The shaded area in the plot represents the average daily number of females in oestrus during each week.

As females began to enter oestrus, rams engaged in fewer aggressive exchanges and spent increasing time travelling between groups of females testing for the onset of oestrus (Table 2b, Fig. S1b). Whilst male age had a relatively minor influence on their levels of aggression, we detected a positive linear influence of age on the amount of time males devoted to searching for mates (Table 2b). Consistent with both their role in testosterone production and their association with aggression, our results also showed that males with larger testes spent significantly greater periods of time searching for receptive females (coefficient = 0·0102, SE = 0·00355, Wald statistic = 8·28, d.f. = 1, P = 0·008; Fig. 4a). This positive association between testes size and mate-seeking behaviour became stronger in the mid to late period of the rut, as the levels of aggression declined further (Fig. 4b; Table 2b, scrotal circumference: week of rut interaction). As with their aggressive behaviour, this model estimates that males with large testes spent an additional hour a day searching for mates when compared with rivals possessing smaller testes (when estimated at the peak of the rut).

Figure 4.

Testes size and mate-seeking behaviour during the rut. (a) The percentage of time that males spent searching for mates during focal watches (sqrt(x + 1) transformed) as a function of their pre-rutting period scrotal circumference. This plot controls for other influential variables identified by the model in Table 2b. For illustrative purposes, data are grouped by individual within years (n = 92); each data point represents 12·3 h of behavioural data on average. (b) The temporal distribution of mate-seeking behaviour (solid lines) as predicted by this model for a 5-year-old male possessing small (10th percentile of males in this sample), medium (50th percentile) or large (90th percentile) testes. The shaded area in the plot denotes the average daily number of females that were in oestrus each week.


Although tempered by age-specific processes, our results show that variation in testes size is a strong predictor of male T titres during the mating season. This relationship appears, in part, to be driven by an enlargement of their testes, which is induced by a seasonal increase in FSH production, correlated with a similar increase in T production and caused by a shared response to GnRH secretion (Lincoln 1989; see also Malo et al. 2009). Thus, concerns that previously reported associations between an individual’s testes size and T production reflect their degree of seasonal activation appear to have a legitimate basis (Uglem, Mayer & Rosenqvist 2002; Denk & Kempenaers 2006), at least in intraspecific analyses. However, the size of males’ testes 11 weeks prior to the rutting period, when they are at the onset of a 5-fold seasonal increase in T and FSH production (Lincoln & Davidson 1977), appeared to have a much greater influence in shaping plasma T levels within the rut. Given the dramatic changes in GnRH secretion that occur in subsequent months, these temporally isolated measures of scrotal circumference seem unlikely to be better related to rut T levels under hypotheses that invoke their shared responsiveness to GnRH. Instead, this relationship is consistent with the suggestion that testes with greater amounts of spermatogenic tissue require, and are supplied with, higher levels of T; that is, these two functions of the testes appear to be phenotypically integrated.

In contrast to the apparent influence of the size of males’ testes on their testosterone production, there was no evidence that their competitive ability or body condition affected their T levels. These latter results run counter to the predictions of hypotheses in which T levels are principally shaped by challenges for reproductive access to females (Wingfield et al. 1990) or are a condition-dependent attribute of males (Andersson 1994). It is possible that an influence of a more competitive phenotype on T levels could be missed through the use of blood plasma assays, which provide only a ‘snapshot’ of male hormone levels at a single point in time. A more comprehensive picture could be revealed through the use of non-invasive hormone monitoring (e.g. faecal hormone measures), which instead provides a compound measure of male hormone profiles over a longer period (Whitten, Brockman & Stavisky 1998). However, it may also be that larger testes allow males to respond more rapidly and/or for longer intervals following ‘challenge-induced’ GnRH secretion, and thus play an important role in any T response to agonistic challenges.

Our analyses of the behaviour of free-living Soay rams suggest that the T-dependent nature of spermatogenesis has a substantial influence on their reproductive activities. Rams with larger testes in the pre-rutting period displayed heightened levels of aggression and mate-seeking behaviour within the rut. These results complement previous findings on this system, which revealed that males with larger testes also copulated at higher rates while in consort with oestrous females (Preston et al. 2003). As with our analyses of T levels, it is notable that neither body nor horn size – attributes that determine a male’s ability to monopolize females – affected aggressive or mate-seeking behaviour (Preston et al. 2001, 2003).

The testosterone-mediated link between sperm production, higher levels of mate-seeking behaviour and increased copulation rates would appear synergistic, as it would provide an endogenous mechanism by which males could regulate their copulation rate according to the sperm supply that they have available to them. In accordance with this suggestion, interspecific analyses on primates and rodents also show that the rate and frequency of ejaculation increase with testes size (Dixson & Anderson 2004; Stockley & Preston 2004), while intraspecific studies suggest such patterns are T-dependent (e.g. in rats; Malmnäs 1977). In Soay sheep, males with large testes increase their share of paternities when receptive females are in abundance (Preston et al. 2003), presumably because they are both able to locate females rapidly and inseminate more sperm once a receptive female has been found.

In a similar way, higher levels of aggression associated with larger testes might also be adaptive by increasing a male’s ability to out-compete rival males in physical contests (Garamszegi et al. 2005). Surprisingly, however, previous analyses of Soay sheep behaviour have failed to detect any net advantage of larger testes in monopolizing access to females (Preston et al. 2003). This is a striking departure from expectations given that aggressive interactions are likely to be costly in terms of time, energy and injury; 60% of males suffer fractures to their cervical vertebrae as a result of fights (Clutton-Brock, Dennis-Bryan & Armitage 1990). Any benefits of heightened levels of aggression shown here would be expected to be readily apparent in the face of these costs. An intriguing alternative possibility is that selection for increased sperm production has driven levels of aggression beyond their optima, to a point where it may constrain further increases in sperm production. Such T-mediated trade-offs between behavioural activities appear to exist in several avian species, where T-induced elevation of reproductive activity comes at the price of reduced parental care (Ketterson & Nolan 1999).

The consequences of spermatogenic activity driving T production may extend beyond the behaviour of males and influence T-dependent morphological traits. Direct associations between testes size and musculature have been noted in three species of small mammal (bushy-tailed wood rat, Neotoma cinerea, deer mouse, Peromyscus maniculatus, and red-backed vole, Clethrionomys gapperi; Schulte-Hostedde, Millar & Hickling 2003), and testes mass is also correlated with the size of ornamental combs in the domestic fowl, Gallus gallus (Pizzari, Jensen & Cornwallis 2004). Indeed, a host of other T-dependent traits implicated in mating competition could potentially be affected by a functional dependence of sperm production on T (from male colouration to their weaponry; Lincoln 1994; Waitt et al. 2003; Malo et al. 2009; Karubian et al. 2011).

Our results provide good support for the suggestion that females could, on average, assess the fertility of prospective mating partners through their ‘mating vigour’ (Trivers 1972). Males with the highest reproductive drives will also have the highest sperm production rates (Raadsma & Edey 1984) and may also produce higher quality sperm (McLachlan et al. 1995; Malo et al. 2009; but see Pizzari, Jensen & Cornwallis 2004). Females could use these behavioural cues to gain direct fertility benefits (although higher sperm depletion rates of behaviourally successful males may make such cues unreliable) (Trivers 1972; Preston et al. 2001) or indirect benefits by producing offspring that are likely to experience greater success under conditions of sperm competition (Sheldon 1994).

As a major evolutionary driver of sperm production rates is the occurrence of sperm competition (Parker 1970; Birkhead & Møller 1998), heightened levels of female promiscuity could produce multiplicative effects across male T-dependent traits. There is comparative evidence that is consistent with sperm competition driving T production in an evolutionary context. Soay rams have evolved testes that are around 58% heavier relative to their body mass than those of ancestral mouflon rams (Lincoln 1989) and peak T levels that are 65% higher (Lincoln, Lincoln & McNeilly 1990). On a broader phylogenetic scale, recent comparative analyses in frog, avian and primate taxa have shown that when species evolve with larger testes, they also increase their T levels (Emerson 1997; Dixson & Anderson 2004; Garamszegi et al. 2005). While the consequences of elevated T are yet to be explored in a phylogenetic framework, these studies suggest that the impact of sperm competition may be more direct and far-reaching than is currently considered.

In conclusion, our results suggest that a male’s inherent capacity for sperm production has a strong influence on testosterone levels in their peripheral blood. This source of variation in testosterone levels appears to have a substantial effect on reproductive behaviour, as males with larger testes spend more time both seeking receptive females and engaging in aggressive interactions with rival males. The integration of sperm and T production would allow males to regulate their copulation rates according to their supply of sperm and could potentially play a role in signalling the production of competitive ejaculates to females. However, it also has the potential to affect multiple reproductive traits that have evolved under pre-copulatory modes of sexual selection and shift them beyond their optimal expression.


We thank The National Trust for Scotland and Scottish Natural Heritage for permission to undertake work on St. Kilda. QinetiQ plc (and previously the Royal Artillery Range, Hebrides) provided essential logistical support. Many thanks to other members of the project, especially Josephine Pemberton, Tim Clutton-Brock, Steve Albon and Bryan Grenfell. We gratefully acknowledge the help of the many volunteers that assisted data collection efforts, particularly Ali Donald, Rusty Hooper, Owen Jones and Gill Telford. Many thanks also to Paula Stockley for seminal discussions and to Tobias Deschner and Charlie Nunn for valuable comments on the manuscript. This work was funded by the Royal Society, Association for the Study of Animal Behaviour, the Wellcome Trust and the Natural Environment Research Council.