Males and females differ in their reproductive roles and as a consequence are often under diverging selection pressures on shared phenotypic traits. Theory predicts that divergent selection can favor the invasion of sexually antagonistic alleles, which increase the fitness of one sex to the detriment of the other. Sexual antagonism can be subsequently resolved through the evolution of sex-specific gene expression, allowing the sexes to diverge phenotypically. Although sexual dimorphism is very common, recent evidence also shows that antagonistic genetic variation continues to segregate in populations of many organisms. Here we present empirical data on the interaction between sexual antagonism and genetic drift in populations that have independently evolved under standardized conditions. We demonstrate that small experimental populations of Drosophila melanogaster have diverged in male and female fitness, with some populations showing high male but low female fitness, while other populations show the reverse pattern. The between-population patterns are consistent with the differentiation in reproductive fitness being driven by genetic drift in sexually antagonistic alleles. We discuss the implications of our results with respect to the maintenance of antagonistic variation in subdivided populations and consider the wider implications of drift in fitness-related genes.

By definition, males and females of any species with separate sexes differ phenotypically. This sexual dimorphism varies in degree and reflects divergent selection on male and female morphological, physiological, and behavioral characters. The male phenotype will often be significantly shaped by sexual selection on traits that increase fertilization success whereas the female phenotype will reflect selection for increased fecundity (Andersson 1994).

Although phenotypic differences between males and females are ubiquitous and sometimes striking, a growing body of work shows that the evolution of sexual dimorphism is often incomplete. A number of studies published over the past years have shown that “sexual antagonism” (also called “intra-locus sexual conflict”; Rice and Chippindale 2001; Bonduriansky and Chenoweth 2009) persists in many populations (e.g., Chippindale et al. 2001; Foerster et al. 2007; Mainguy et al. 2009; Svensson et al. 2009; Delph et al. 2011). Sexual antagonism arises when alleles that benefit the fitness of one sex cause deleterious effects in the other sex. Sexually antagonistic alleles are viewed as a first evolutionary step toward the evolution of sexual dimorphism. Theory predicts that divergent selection on male and female phenotypes can favor the invasion of antagonistic mutations, as long as their benefit in one sex outweighs their cost in the other (Rice 1984). Once antagonistic variation segregates in the population, selection will favor the invasion of modifiers that limit the expression of antagonistic alleles to the favored sex, thus resolving antagonism and allowing the sexes to diverge phenotypically (Lande 1980; Rice 1984).

Population genetic models not only predict the invasion of sexually antagonistic mutations but also indicate that such alleles can be maintained in stable polymorphism (Kidwell et al. 1977; Gavrilets and Rice 2006; Fry 2010). The degree to which such polymorphism will persist depends on the dynamics of resolution. This could potentially be slow if it relies on the occurrence of rare events such as gene duplication (Connallon and Clark 2011) and could indeed be hampered by deleterious pleiotropic effects of sex-specific gene expression (Mank et al. 2008). Irrespective of the exact dynamics of its resolution, the fact that antagonistic variation has been found in a number of animal and plant populations indicates that detectable levels of such variation persist in many populations. As such, sexual antagonism is one of the major forces that potentially contribute to the maintenance of genetic variation for fitness (Chippindale et al. 2001; Bonduriansky and Chenoweth 2009; Patten et al. 2010; Connallon and Clark 2012).

When considering the role of antagonism in maintaining fitness variation it is important to realize that most existing theory of antagonism uses deterministic models that effectively assume populations of infinite size (e.g., Rice 1984; Gavrilets and Rice 2006). Similarly, the empirical studies demonstrating the existence of antagonistic variation are based on studies in large, outbred populations comprising many hundreds or thousands of individuals (the fruit fly population used in studies by Rice and co-workers (e.g., Chippindale et al. 2001), for example, is maintained at a population size of about 1800 breeding adults). This means that antagonism has so far only been investigated under conditions where the evolutionary dynamics are dominated by the selective forces generated by antagonistic fitness effects. In contrast, the role of genetic drift has hitherto been ignored in empirical studies. Random changes in allele frequency are expected to affect the evolution of any phenotypic trait in small or subdivided populations. However, drift is likely to play a particularly prominent role in the evolution of sexually antagonistic traits (Connallon and Clark 2012). Due to their opposing effects on the fitness of males and females, the net selection pressure on sexually antagonistic mutations is often weak, even when the sex-specific effects are large. Consequently, the force of selection acting on antagonistic mutations is easily overcome by genetic drift when effective population sizes are small (Connallon and Clark 2012). In this case, random changes in the frequency of antagonistic alleles could have a significant impact on the sex-specific fitness of populations and cause differentiation in fitness between populations that would be highly unlikely to occur under sexually concordant selection. 
By affecting the overall productivity of populations, genetic drift in antagonistic variation could have significant consequences for the long-term viability and survival of populations.

Here we describe experimental results on the effect of genetic drift on sex-specific fitness in four replicate populations of the fruit fly Drosophila melanogaster that had evolved independently at a small population size. We measured male and female larval and adult fitness of replicate genotypes sampled from each of these populations. We found that populations had significantly diverged in average fitness. In the larval stage, this divergence was independent of sex. In adults, fitness falls along a cline ranging from high male/low female to low male/high female fitness. We argue that the observed patterns of fitness variation between populations are consistent with differentiation in reproductive fitness of the sexes being driven by genetic drift in sexually antagonistic alleles. We discuss the implications of our results for antagonistic variation in the base stock from which the populations were derived and more generally for the evolution of sexually antagonistic alleles in subdivided populations.

Materials and Methods


We used four experimental populations in this study. These represent a subset of experimental populations originally established as part of a larger study (Reuter et al. 2008, 1:1 lines). The four populations were derived from the outbred and laboratory-adapted Dahomey wild-type stock and each was founded by 50 virgin males and 50 virgin females. Subsequently, populations were maintained independently under a standardized rearing regime as described in Reuter et al. (2008). Briefly, larvae were reared in culture bottles at constant densities (300 larvae per 65 mL of food), eclosing adults were collected as virgins and new adult populations of 50 males and 50 females established in cages supplied with food (yeast paste) and oviposition media. Adult populations were allowed to interact and mate over a period of 4 days. Eggs for the subsequent generation were collected over the last 24 h of the interaction period. All fly cultures were maintained at 21°C throughout the life cycle.

The populations were established from the large and genetically diverse Dahomey stock. Due to their small numerical size and the imposition of a rearing regime with discrete generations, the replicate populations are expected to undergo increased levels of drift. Based on standard population genetic models (Crow and Kimura 1970, p. 350), the effective size of populations with 50 males and 50 females is Ne = 100. However, this calculation ignores the fact that the number of matings is finite. Using a more refined model (Balloux and Lehmann 2003, Eq. 7) and empirical estimates of the frequency of double matings in the experimental conditions under which the lines evolved (Table 1 in Reuter et al. 2008), the effective size of the populations is predicted to be on the order of Ne ≈ 80.
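The figure of Ne = 100 can be reproduced with the classical sex-ratio formula, Ne = 4·Nm·Nf / (Nm + Nf). The sketch below assumes that this is the standard model meant here; the function name is illustrative only. It does not attempt the refined calculation that accounts for a finite number of matings.

```python
def effective_size(n_males: int, n_females: int) -> float:
    """Classical effective size with separate sexes: Ne = 4*Nm*Nf / (Nm + Nf).

    Assumption: this is the 'standard model' behind the Ne = 100 figure.
    """
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes must be represented")
    return 4.0 * n_males * n_females / (n_males + n_females)


# 50 breeding males and 50 breeding females, as in the rearing regime
print(effective_size(50, 50))
```

With an even sex ratio the formula simply returns the census number of breeding adults; any skew in the sex ratio lowers Ne below that number.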

Table 1. Axis loadings of a principal component analysis of larval and adult relative fitness in males and females. The data entries in rows 1–4 of the table specify the weighting of each of the four fitness components in each of the four PC axes. The values in rows 5 and 6 provide the percentage of variance in the data captured by each of the axes and the P values of one-way ANOVAs testing the difference between populations in scores on each of the four axes, respectively.

                         PC1       PC2      PC3      PC4
Male larval fitness      +0.19     −0.05    +0.47    +0.86
Male adult fitness       −0.92     +0.14    −0.18    +0.31
Female larval fitness    +0.16     −0.60    −0.72    +0.32
Female adult fitness     +0.29     +0.79    −0.48    +0.25
Variance captured        35%       27%      23%      15%
P value                  <0.0001   0.001    0.91     0.18

When analyzed here, the populations had undergone approximately 80 generations of experimental evolution under the conditions described above. We then measured male and female larval and adult fitness of 81 haploid genomes (hemiclones) sampled randomly from the four populations (18–22 from each population).


Hemiclones were sampled and multiplied using “cytogenetic cloning” (see Rice 1996; Abbott and Morrow 2011 for descriptions of approach and Fig. S1 for a schematic representation of the crossing scheme used here). Haploid genomes were extracted from the populations by mating randomly sampled males to females of the “Clone Generator” (CG) stock and backcrossing single male offspring once again to females of the CG stock. The genotype of CG females (compound X, Y chromosome, homozygous viable translocation of chromosomes II and III, see Rice 1996) ensures paternal transmission of the X chromosome to sons and co-segregation of the paternal second and third chromosomes, making it possible to produce many males carrying an identical set of randomly sampled X, II, and III (the fourth chromosome, which carries only about 0.5% of the coding genes in D. melanogaster, is ignored here for pragmatic reasons). The haploid complement of X, II, and III chromosomes is hereafter referred to as a target genome (TG) and, due to the absence of male recombination in Drosophila, can be maintained and multiplied by crossing their male carriers to CG females.


To assay their effect on male and female fitness, TGs were expressed in an outbred genetic background in males and females, complemented with genomes randomly sampled from their population of origin. To express TGs in a female background, males carrying a TG in a CG background were crossed with multiple virgin females of their corresponding population of origin (Fig. S1C). Half of the females produced in this cross inherited an identical TG from the father (the other half received the target X together with the eye-color marked translocation of chromosomes II and III), complemented with different maternal genomes. To express TGs in males, males carrying a TG in a CG background were mated to multiple females of a stock carrying a compound X chromosome [C(1)DX], a Y chromosome and autosomes of the TG's population of origin. The compound X of the females ensured paternal transmission of the X chromosome. Accordingly, half of the emerging males produced in this cross inherited an identical TG from the father (with the others receiving the target X and the II-III translocation), complemented with different Y chromosomes and autosomes contributed by the mothers.


We measured larval fitness of TGs as survival under conditions comparable to the rearing regime under which the flies had evolved. For this purpose, the crosses to express TGs in males and females (described above) were set up in small cages supplied with grape juice plates for egg laying. Eggs laid on the grape juice plates were incubated until first instar larvae hatched. In parallel, cultures of a competitor strain were maintained in the same way. This strain was marked with a recessive eye color mutant, sparkling poliert, in an outbred genetic background sampled from the four experimental evolution lines. Cultures of competing larvae were then set up for each TG by transferring 150 larvae from the TG expression cross and 150 eye-color marked competitor larvae to a 190 mL food bottle containing 65 mL of media. Flies eclosing from the cultures were counted under cold anesthesia. Larval fitness was calculated as the proportion of wild-type flies of the sex under investigation, minus the expected proportion of 1/8 (this was the expectation because half of the larvae transferred were descendants of the expression cross, of which half were of the desired sex, again half of which were of the desired genotype).
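The larval fitness measure can be sketched as follows. The sketch assumes that the proportion is taken over all flies eclosing from a bottle, which the text does not state explicitly; the function and argument names are illustrative.

```python
# Expected share of focal flies: 1/2 (from expression cross)
# x 1/2 (desired sex) x 1/2 (desired genotype) = 1/8
EXPECTED_PROPORTION = 0.5 * 0.5 * 0.5


def larval_fitness(n_wildtype_focal_sex: int, n_eclosed_total: int) -> float:
    """Observed proportion of wild-type flies of the focal sex minus 1/8.

    Assumption: the denominator is the total number of flies eclosing
    from the competition bottle.
    """
    if n_eclosed_total <= 0:
        raise ValueError("no flies eclosed")
    return n_wildtype_focal_sex / n_eclosed_total - EXPECTED_PROPORTION


# Hypothetical counts: 40 wild-type focal-sex flies among 280 eclosed
print(larval_fitness(40, 280))
```

A value of zero means the target genome survived exactly as expected under equal competitive ability; positive values indicate above-expectation survival.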


Male adult fitness was measured as fertilization success under competitive conditions that were similar to the populations’ rearing regime. For each fitness assay, we set up a cage containing a target of 15 males sharing a particular TG, 35 eye-color marked competitor males and 50 eye-color marked females. In a small number of cases (∼10% of genomes) fewer than 15 TG males were available for a TG. In these cases we added a further complement of competitor males to attain a total of 50 males. All flies used were virgin and, in effect, between 1 and 3.5 days of age (they had matured between 1 and 5 days at 18°C). In line with the conditions of the rearing regime, flies were allowed to interact in the cages for 4 days and cages were supplied daily with fresh grape juice plates (as oviposition media) and ad libitum yeast paste. After the end of day 4, males were discarded and females were placed individually into yeasted vials to lay eggs for a further 3 days. Females were then discarded and their progeny left to develop. Upon eclosion, progeny were scored for eye color and counted. The mating success of target males (probability of a female mating with a wild type rather than eye-color mutant male) was estimated from the scores obtained from the 50 females of an assay, using a Bayesian procedure described in the Appendix of Reuter et al. (2008). This estimation takes into account the fact that different numbers of matings with males of the same phenotype produce batches of offspring all of the same eye color. This mating probability was divided by the expected probability under random mating (number of TG males in the cage/50) to obtain a measure of male adult fitness.
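The final normalization step can be written out as a short sketch. It takes the estimated mating probability as given (the Bayesian estimation itself is not reproduced here) and divides it by the share of TG males in the cage. Names are illustrative.

```python
def male_adult_fitness(mating_probability: float,
                       n_tg_males: int,
                       total_males: int = 50) -> float:
    """Mating probability relative to the random-mating expectation.

    mating_probability: estimated probability that a female mates with a
    wild-type (TG) male, taken here as an input.
    """
    if not 0 < n_tg_males <= total_males:
        raise ValueError("number of TG males must be between 1 and total_males")
    expected = n_tg_males / total_males  # e.g. 15/50 = 0.3
    return mating_probability / expected


# Hypothetical estimate: TG males obtain 36% of matings with 15 of 50 males
print(male_adult_fitness(0.36, 15))
```

Dividing by the actual number of TG males keeps assays with fewer than 15 target males on the same scale as the full assays.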


Female adult fitness was measured as egg laying rate under competitive conditions similar to those of the rearing regime under which the populations evolved. For each fitness assay, a cage with a grape juice egg laying plate and yeast paste was set up, containing a target of 15 females sharing a particular TG, 35 eye-color marked competitor females and 50 eye-color marked males. Again, in rare cases (<5% of genomes) fewer than 15 TG females were available and numbers were boosted with a further complement of competitor females to attain a total of 50 females. All flies used were virgin and between, in effect, 1 and 3.5 days of age (1–5 days at 18°C). Flies were allowed to interact for 3 days. At the end of the third day, females were isolated in individual vials and allowed to lay eggs for one day (equivalent to the last 24 h of the rearing cycle). After this period, females were discarded and their progeny left to develop. Upon eclosion, offspring were counted. The average fertility of wild-type females in the cage, divided by the average fertility of the eye-color marked competitors in the cage, was used as the female adult fitness measure.
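As a minimal sketch, the female measure is a ratio of two means computed within one cage. The offspring counts below are invented for illustration and the function name is hypothetical.

```python
from statistics import mean


def female_adult_fitness(target_offspring, competitor_offspring) -> float:
    """Mean offspring count of wild-type (TG) females divided by the mean
    offspring count of eye-color marked competitor females in the same cage."""
    competitor_mean = mean(competitor_offspring)
    if competitor_mean == 0:
        raise ValueError("competitor females produced no offspring")
    return mean(target_offspring) / competitor_mean


# Hypothetical per-female offspring counts from one cage
print(female_adult_fitness([10, 20], [10, 10, 10]))
```

Because numerator and denominator come from the same cage, cage-level effects such as food quality affect both and largely cancel in the ratio.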


We calculated relative fitness of the genomes for each sex and life stage (larva, adult) separately by dividing individual fitness values by the average fitness across all populations. Total fitness values of individual genomes were calculated for each sex separately by multiplying the relative, sex-specific values of larval and adult fitness.
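These two steps can be sketched as below, applied separately to each sex. The sketch assumes that the raw fitness values of a given sex and life stage have a positive mean; function names and example values are illustrative.

```python
from statistics import mean


def relative_fitness(values):
    """Divide each raw value by the average across all genomes, so that
    the resulting relative fitness values average to one."""
    grand_mean = mean(values)
    if grand_mean <= 0:
        raise ValueError("mean fitness must be positive")
    return [v / grand_mean for v in values]


def total_fitness(rel_larval, rel_adult):
    """Total fitness of each genome: product of its relative larval and
    relative adult fitness (same sex, same genome order)."""
    if len(rel_larval) != len(rel_adult):
        raise ValueError("inputs must describe the same genomes")
    return [larval * adult for larval, adult in zip(rel_larval, rel_adult)]


# Hypothetical raw values for three genomes in one sex
larval = relative_fitness([1.0, 2.0, 3.0])
adult = relative_fitness([3.0, 2.0, 1.0])
print(total_fitness(larval, adult))
```

Standardizing each component to a mean of one is also the reason why a test for an overall sex effect on relative fitness is uninformative.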


To ensure sufficient quality of our dataset we removed TGs for which we deemed fitness data unreliable. Thus, we removed from the analysis one TG for which fewer than 140 flies eclosed in one of the larval fitness assays, one TG for which fewer than 10 adult TG males or females entered the fitness assay, and two further TGs that presented outlier values for one of the fitness measures (defined here as differing by more than 2.5 standard deviations from the population average).
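The outlier criterion can be sketched as follows. The sketch assumes that the reference is the mean and sample standard deviation of the genomes from the same experimental population; the text leaves this open. The function name and data are illustrative.

```python
from statistics import mean, stdev


def flag_outliers(values, threshold: float = 2.5):
    """Return one boolean per value: True if it lies more than `threshold`
    sample standard deviations away from the mean of `values`."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        return [False] * len(values)
    return [abs(v - centre) > threshold * spread for v in values]


# Hypothetical fitness values: 19 typical genomes and one extreme value
fitness = [1.0] * 19 + [2.0]
print(flag_outliers(fitness))
```

Note that with a sample standard deviation, a single value can only exceed 2.5 standard deviations when roughly ten or more genomes are available, which holds for the sample sizes per population used here.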

We analyzed the data of our experiments using standard parametric statistics in R (R Development Core Team 2006). In analyses of variance, population was modeled as a fixed effect. Although differences between populations would potentially be more appropriately represented as a random variable, the low number of populations analyzed (four) did not allow for a reliable estimation of between-population variances and covariances (Crawley 2002, p. 670). Principal component analyses (PCAs) were performed based on covariances.

For all analyses, we verified that the distribution of the data matched the assumption of the tests used. Where required, we transformed the data and indicate so when reporting the results. We also confirmed that despite being based on the same individuals, our measures of larval and adult fitness were independent. It is conceivable that genomes with high larval fitness would experience greater larval competition and accordingly show lowered adult fitness. Due to the large excess of larval growth media used here, such effects are unlikely and we can formally rule them out because adult fitness in both sexes was uncorrelated with the total number of flies eclosing from the larval growth cultures (Pearson's product moment correlation; males: r = 0.16, t75 = 1.36, P = 0.18; females: r = 0.08, t75 = 0.67, P = 0.51).
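The test statistics reported for these correlations follow from the standard relation t = r·√(n − 2) / √(1 − r²), with n − 2 = 75 degrees of freedom for 77 genomes. The sketch below uses only the Python standard library; function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r: float, n: int) -> float:
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Male value quoted in the text: r = 0.16 with n = 77 genomes
print(round(t_from_r(0.16, 77), 2))
```

Using the rounded r = 0.16 gives a t value of about 1.40, slightly above the reported 1.36; a difference of this size is what one would expect from rounding r to two decimals.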


We obtained estimates for the four fitness components (male and female larval and adult fitness) for a total of 77 and an average of 19.3 TGs per population (i.e., 17, 19, 20, and 21 replicate genomes for the four populations).

We first assessed differences between populations in sex-specific relative fitness by analyzing larval and adult data separately. We performed ANOVAs of larval and adult fitness with population, sex, and their interaction as independent factors. For larval fitness, we found that across sexes, populations had significantly diverged in fitness (ANOVA on log-transformed data; population term: F3,146 = 4.8, P = 0.003), but that this divergence did not differ between the sexes (population-by-sex interaction term: F3,146 = 1.5, P = 0.22; Fig. 1A). For adult fitness, we observed significant between-population divergence in fitness across both sexes (population term: F3,146 = 5.0, P = 0.002). However, the degree and direction of divergence between populations also differed strongly between the sexes (population-by-sex interaction term: F3,146 = 12.4, P < 0.0001; Fig. 1B). As male and female relative fitness values in the larval and adult stages are all standardized to an average of unity, the sex effect is not significant in either analysis (log-transformed larval fitness: F1,146 = 0.03, P = 0.85; adult fitness: F1,146 = 0, P = 1).

Figure 1.

Mean male and female fitness in the four experimental populations. The figure shows larval (A), adult (B), and total fitness (C) in males and females of each population. Lines connect male and female fitness values from the same population, vertical bars indicate standard errors. Mean values for populations 1–4 are represented as circles, squares, diamonds, and triangles, respectively.

We also applied the same ANOVA model to total fitness (Fig. 1C). This analysis showed that across both life stages, populations had diverged in a sex-specific manner and differed strongly in average sex-specific total fitness (ANOVA on log-transformed data; population-by-sex interaction term: F3,146 = 8.8, P<0.0001). In contrast, neither populations nor the sexes differed in average total fitness (population term: F3,146 = 1.2, P = 0.31; sex term: F1,146 = 0.001, P = 0.97).

To better illustrate how populations diverged in fitness, we performed a PCA of the measures of male and female total fitness. The two axes generated by this analysis provide an intuitive interpretation of fitness variation (Fig. 2). The major axis, capturing 61% of the variation, expresses the position of TGs along an antagonistic continuum between high male/low female fitness and high female/low male fitness. The minor axis, capturing the remaining 39% of variation, expresses the overall, sexually concordant, quality of genomes (Fig. 2). Separate ANOVAs on the principal component scores of the genomes on the two axes showed that populations differed significantly in their score on the first, sexually antagonistic, axis (F3,73 = 7.3, P = 0.0002). In contrast, populations did not differ significantly in their scores for the second, sexually concordant, axis (F3,73 = 2.1, P = 0.11).
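For two variables, a covariance-based PCA reduces to the eigen-decomposition of a 2 × 2 matrix, which has a closed form. The sketch below is a standard-library illustration of that calculation, not the R code used for the analysis; the data are invented and the function name is hypothetical.

```python
import math
from statistics import mean


def pca_2d(male, female):
    """PCA on the covariance matrix of two variables.

    Returns ((pc1, pc2), (share1, share2)): unit-length axis loadings as
    (male, female) pairs, and the proportion of variance on each axis.
    """
    n = len(male)
    mean_m, mean_f = mean(male), mean(female)
    var_m = sum((x - mean_m) ** 2 for x in male) / (n - 1)
    var_f = sum((y - mean_f) ** 2 for y in female) / (n - 1)
    cov = sum((x - mean_m) * (y - mean_f) for x, y in zip(male, female)) / (n - 1)
    # Eigenvalues of the symmetric matrix [[var_m, cov], [cov, var_f]]
    half_trace = (var_m + var_f) / 2
    root = math.sqrt(((var_m - var_f) / 2) ** 2 + cov ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    # Eigenvector belonging to the larger eigenvalue
    if cov != 0:
        pc1 = (cov, lam1 - var_m)
    else:
        pc1 = (1.0, 0.0) if var_m >= var_f else (0.0, 1.0)
    norm = math.hypot(pc1[0], pc1[1])
    pc1 = (pc1[0] / norm, pc1[1] / norm)
    pc2 = (-pc1[1], pc1[0])  # orthogonal to pc1
    total = lam1 + lam2
    return (pc1, pc2), (lam1 / total, lam2 / total)


# Hypothetical genomes with a perfect male/female fitness trade-off
axes, shares = pca_2d([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])
print(axes[0], shares)
```

An axis whose male and female loadings have opposite signs corresponds to the antagonistic continuum, while loadings of the same sign describe sexually concordant variation in genome quality.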

Figure 2.

Male and female total fitness of individual target genomes. Datapoints from populations 1 to 4 are represented as circles, squares, diamonds, and triangles, respectively. The directions of the two arrows indicate the orientation of the two principal component axes PC1 and PC2, and the relative lengths of the arrows are proportional to the proportion of variance captured by each axis.

We also captured sexually antagonistic effects in a PCA performed on the four individual fitness measures, male and female larval and adult fitness. Here, we detected population differences in the scores along the first axis, capturing a net negative effect of male fitness and a net positive effect of female adult fitness (F3,73 = 14.1, P < 0.0001; see Table 1, PC1 for axis loadings). We furthermore observed differences between populations along the second axis that revealed effects that were not visible in the analysis of total fitness, namely a negative correlation between larval and adult fitness (F3,73 = 6.0, P = 0.001; Table 1, PC2).


The results we present here demonstrate that small and independently evolving populations can diverge significantly in their sex-specific fitness. Importantly, our data suggest that this divergence can occur in a sex-specific manner (PC1 in Fig. 2) rather than by populations decreasing in overall fitness due to the fixation of deleterious mutations. More precisely, populations fall along a sexually antagonistic fitness continuum. Thus, populations that have a higher fitness in one sex tend to show a lower fitness in the other, with minimal population differences in the average fitness across both sexes.

The way in which populations have diverged on a continuum between high male/low female and low male/high female fitness is consistent with the interpretation that population differentiation mainly occurs through changes in the frequency of sexually antagonistic alleles. Thus, some populations would have increased in the frequency of female-beneficial/male-detrimental alleles, while others became enriched for male-beneficial/female-detrimental alleles. Allele frequency changes have most likely occurred through random genetic drift because the populations have evolved under tightly controlled and standardized conditions. Drift could have occurred as founder events at the initial establishment of the small experimental populations from the large and genetically diverse Dahomey stock. But stochastic changes may also have taken place subsequently, during the many generations that the populations were maintained at a small effective population size.

Although it is clear that genetic drift will affect the evolution of phenotypic traits in small populations, fitness is by definition under strong selection. From that perspective, the large differences in sex-specific performance we observe between populations (Fig. 1) may seem surprising. The rapid divergence in sex-specific fitness is, however, in line with theory predicting that sexually antagonistic variation should be highly sensitive to genetic drift (Connallon and Clark 2012). One reason for this is that opposing fitness effects in males and females can result in weak net selection across the sexes, meaning that mutations can be almost neutral despite having strong effects on the fitness of each sex. Quasineutral variation of this type can persist for long periods of time in populations with large effective sizes, but will erode rapidly when subjected to more intense genetic drift. Fixation of antagonistic alleles will then reveal their fitness effects and lead to potentially large differences in male and female fitness between populations, such as those observed here. This situation contrasts with polymorphism under sexually concordant selection. Here, classical theory predicts that genetic variation will only persist for appreciable amounts of time (rather than being eliminated rapidly by selection) if the product of effective population size and selection coefficient is smaller than unity (Nes < 1). This means that the level of fitness variation that can be maintained under mutation-selection balance in large populations is small and the fitness effects of deleterious mutations that escape efficient counterselection even in small populations are weak (1–1.25% fitness reduction in the populations studied here). It therefore seems unlikely that standing variation in sexually concordant deleterious alleles could generate the divergence in fitness between populations observed here.

Similar to selection on deleterious alleles with sexually concordant effects, selection is relatively efficient at removing deleterious mutations with sex-limited effects. Although the intensity of selection on these mutations is halved compared to sexually concordant alleles (because they are only expressed in one sex), standing variation of such alleles in the large Dahomey population should be low. They would only evolve neutrally in our replicate populations if their negative effect on fitness is smaller than 2–2.5%. The relatively effective selection against these mutations suggests that the fixation of deleterious mutations with sex-limited effects is unlikely to have created the fitness patterns we describe. In theory, the fixation of deleterious recessive mutations with male-limited effects in some populations, and with female-limited effects in others could give rise to fitness patterns comparable to those we observed. However, the up to 1.5-fold divergence in fitness between our populations (Fig. 1) would require the differential fixation of a significant number of deleterious recessives. It would then seem unlikely that their cumulative effect in a given population would be highly biased toward one sex (many female-limited mutations in some populations, many male-limited in others), rather than affecting both sexes to similar degrees. In addition, it is not clear how frequent deleterious mutations with sex-limited effects are. Results from D. melanogaster suggest that deleterious mutations tend to have sexually concordant fitness effects (Mallet and Chippindale 2011; Sharp and Agrawal 2012).
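The thresholds of 1–1.25% and 2–2.5% quoted above follow from the near-neutrality condition Ne·s < 1 evaluated for effective sizes between 80 and 100. The sketch below assumes, as in the text, that a sex-limited mutation experiences half the selection intensity of a sexually concordant one; the function name is illustrative.

```python
def neutrality_threshold(ne: float, sex_limited: bool = False) -> float:
    """Selection coefficient below which an allele is effectively neutral.

    Concordant alleles: Ne * s < 1, so s < 1 / Ne.
    Sex-limited alleles: selection is halved (expressed in one sex only),
    so Ne * s / 2 < 1 and s < 2 / Ne.
    """
    if ne <= 0:
        raise ValueError("effective size must be positive")
    factor = 2.0 if sex_limited else 1.0
    return factor / ne


for ne in (100, 80):
    concordant = neutrality_threshold(ne)
    limited = neutrality_threshold(ne, sex_limited=True)
    print(f"Ne = {ne}: concordant s < {concordant:.4f}, sex-limited s < {limited:.4f}")
```

These are order-of-magnitude boundaries rather than sharp cut-offs: the efficiency of selection declines gradually as Ne·s approaches one.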

We can use our results to draw some inferences about the genetics of fitness. In particular, we can compare the patterns of fitness divergence between lines and across different life stages. Similar to Chippindale et al. (2001), we found that divergence between populations in larval fitness was sexually concordant (Fig. 1A), indicating that populations accumulated alleles that were either generally beneficial or generally deleterious to larval performance, independently of sex. This is in line with the view that juveniles do not have differentiated sex roles and accordingly mutations will impact the fitness of males and females to a similar extent by either increasing or decreasing larval performance. In adults, in contrast, we observed significant sexually antagonistic effects, reflecting the difference between male and female reproductive roles (a significant sex-by-population interaction; Fig. 1B). Although population fitness diverged in a sex-specific manner, small differences in average fitness were also apparent (a significant population effect). Interestingly, these disappeared when comparing total fitness between populations (Fig. 1C). This indicates that differences in overall adult fitness and differences in overall larval fitness cancelled each other out. This was also supported by the PCA of individual fitness components, where one axis comprised negative loadings for larval fitness components, but positive loadings for adult fitness (Table 1, axis 2). The association between increased larval and decreased adult fitness suggests that sexual antagonism over male and female adult phenotypes is overlaid by adaptive conflict over optimal larval and adult phenotypes. Our data suggest that some genotypes increase larval fitness at the expense of adult performance, while others have the opposite effect.

The populations used here were created as part of a larger experiment and phenotypic data on male reproductive morphology (testis and accessory gland size) and male and female wing size had been obtained after about 30 generations of evolution (Reuter et al. 2008). Comparing the fitness data obtained here to these phenotypic measures obtained 50 generations earlier showed a significant correlation between average testis size and average male adult fitness (Pearson's product moment correlation, r = 0.98, t2 = 6.35, P = 0.024). No significant correlation was found between average fitness and average accessory gland size (r = 0.46, t2 = 0.74, P = 0.54) or with average male wing size (r = −0.37, t2 = −0.57, P = 0.63), nor between average female fitness and average female wing size (r = 0.10, t2 = 0.14, P = 0.90). It is clear that these correlations, each calculated from four datapoints and measures obtained 50 generations apart, cannot provide strong support for the presence or absence of associations between phenotype and fitness. If real, however, a correlation between male fitness and testis size would imply an interesting association between antagonistic effects and the expression of a sex-limited trait. This would suggest that antagonism can arise through the action of developmental mechanisms that are shared between the sexes but involved in the growth of sex-limited structures such as male testes.

In our study, we have observed divergence in sex-specific fitness between laboratory populations of small size. It is likely that similar processes occur in natural populations. Genetic drift in sexually antagonistic genetic variation could lead to divergence in sex-specific fitness between local breeding groups. This kind of fitness divergence could have an impact on the evolutionary dynamics of sexually antagonistic loci, for example, if local populations with low female fitness suffered an increased risk of extinction. Such effects would add a selective pressure against female-detrimental alleles across the metapopulation and shift the conditions for the invasion and maintenance of sexually antagonistic alleles. In the future, it would be interesting to develop models that put sexual antagonism into a more ecological context. These could generate predictions for the dynamics of antagonistic alleles in structured populations that take into account the effect of drift on local allele frequencies, as well as the link between sex-specific fitness and the survival and productivity of breeding groups.


We are indebted to S. Kejriwal for the many hours of laboratory work that he contributed to this project. J. Collet, S. Fuentes, C. Mullon, A. Pomiankowski, and three anonymous reviewers provided helpful comments on the manuscript. We acknowledge funding from the Biotechnology and Biological Sciences Research Council (Ph.D. studentship to JH) and the Natural Environment Research Council (fellowship NE/D009189/1 to MR and research grant NE/G019452/1 to MR and KF).