GENE DUPLICATION IN THE EVOLUTION OF SEXUAL DIMORPHISM
Males and females share most of the same genes, so selection in one sex will typically produce a correlated response in the other sex. Yet, the sexes have evolved to differ in a multitude of behavioral, morphological, and physiological traits. How did this sexual dimorphism evolve despite the presence of a common underlying genome? We investigated the potential role of gene duplication in the evolution of sexual dimorphism. Because duplication events provide extra genetic material, the sexes each might use this redundancy to facilitate sex-specific gene expression, permitting the evolution of dimorphism. We investigated this hypothesis at the genome-wide level in Drosophila melanogaster, using the presence of sex-biased expression as a proxy for the sex-specific specialization of gene function. We expected that if sexually antagonistic selection is a potent force acting upon individual genes, duplication will result in paralog families whose members differ in sex-biased expression. We found that members of the same duplicate family can differ in their expression patterns between males and females. In particular, duplicate pairs containing a male-biased gene are found more frequently than expected, in agreement with previous studies. Furthermore, when the singleton ortholog is unbiased, duplication appears to allow one of the paralog copies to acquire male-biased expression. Conversely, female-biased expression is not common among duplicates; fewer duplicate genes are expressed in the female-soma and ovaries than in the male-soma and testes. Expression divergence is more common in older than in younger duplicate pairs, but expression divergence does not correlate with protein sequence divergence. Finally, genomic proximity may have an effect on whether paralogs differ in sex-biased expression. We conclude that the data are consistent with a role of gene duplication in fostering male-biased, but not female-biased, gene expression, thereby aiding the evolution of sexual dimorphism.
The transcription of many genes differs between males and females, and this sexual dimorphism is widespread across the genomes of many taxa (Ellegren and Parsch 2007). However, the evolution of sexual dimorphism presents a practical difficulty: because males and females share most genes, selection in one sex can cause a correlated response in the opposite sex (Lande 1980). When phenotypic optima differ between the sexes, this correlated response can cause sexual antagonism over the evolutionary fate of those shared genes or traits, depressing overall fitness (Bonduriansky and Chenoweth 2009; van Doorn 2009).
Resolving this genomic conflict between the sexes can occur through mechanisms that rely directly or indirectly upon the major sex chromosomes. X- and Z-linkage can foster sexually antagonistic genetic variation (Gibson et al. 2002 for an X-linked example) and help decrease the intersexual genetic correlation (Chenoweth et al. 2008 for an X-linked example). Unequal X or Z number and hemizygous expression in the heterogametic sex facilitate the spread of sexually dimorphic variation (Rice 1984; but see Fry 2010). More generally, the X and Z can possess sexually dimorphic genes in excess, with the relative enrichment of male- or female-biased genes depending upon the particular taxon (Gurbich and Bachtrog 2008). Even degenerate sex chromosomes such as the Y possess genes for fertility, the essential basis of dimorphism (see reviews Carvalho 2002; Graves 2006).
In addition to directly coding for sexual dimorphism, all sex chromosomes can possess factors that influence sexual fate and activate sex-specific genetic networks, triggering the expression of autosome-based dimorphism. For example, sex-determination pathways can initiate sex-specific splicing that underlies dimorphic phenotypes (Lopez 1998; McIntyre et al. 2006). Alternatively, sex-specific modifications to cis- (or trans-) binding sites of autosomal genes can result in dimorphism (Williams and Carroll 2009). Genomic imprinting may foster dimorphism if allelic expression depends upon the offspring's sex (Day and Bonduriansky 2004; Hager et al. 2008; Gregg et al. 2010). Finally, dimorphic trait expression can evolve if condition-dependent expression varies in a sex-specific manner due to the environment or genes (Bonduriansky and Rowe 2005; Bonduriansky 2007; Wyman et al. 2010). Breeding experiments, quantitative trait loci (QTL) studies, and gene expression data confirm that autosomes can indeed encode a great deal of dimorphism (Reinhold 1998; Parisi et al. 2003; Fitzpatrick 2004; Fry 2010). So although important, direct sex-linkage is not required for the evolution of dimorphism (Mank 2009).
In addition to direct and indirect control by the sex chromosomes, the genome might rely upon gene duplication events that permit the partitioning of male and female expression patterns (Ellegren and Parsch 2007; Connallon and Clark 2011; Gallach and Betran 2011). Gene duplication provides a major source of evolutionary novelty (Ohno 1970), with the spontaneous rate of duplication being high enough to represent a powerful contribution to evolutionary change (Lynch and Conery 2000; Lynch et al. 2008; Watanabe et al. 2009; Lipinski et al. 2011). Duplications produce additional copies of genes whose functions are expected to be initially identical. Although one copy fulfills the ancestral workload, redundancy can release the other copy from its selective constraints. Mutation and selection can result in functional changes introducing a new function or specialization of old functions (Force et al. 1999); alternatively, preexisting allelic variation can spread following duplication (Proulx and Phillips 2006). Thus, by providing extra genetic material, duplication might be the first step in the sex-specific specialization of genomes (Ellegren and Parsch 2007). Overall, we expect that if sexually antagonistic selection is a potent evolutionary force acting on individual genes, duplication will eventually produce gene family members that have discordant rather than concordant expression patterns.
In support of this idea, duplicates are common among sex-biased genes. Sex-biased gene expression itself suggests a past history of sex-specific or sexually antagonistic selection (Zhang et al. 2004; Connallon and Knowles 2005; Proeschel et al. 2006; Mank and Ellegren 2009; Innocenti and Morrow 2010; Wyman et al. 2010). When paralogs (i.e., within-species duplications) differ in expression, one or both of the sexes may have co-opted a gene copy for their own purposes in response to such selection. Alternatively, when paralogs have the same sex-biased expression type, selection may allow one copy to obtain even greater sex-bias than had previously been present in the original gene. In the fly and worm genomes, male-biased genes have more paralogs compared to unbiased genes (Cutter and Ward 2005; Gnad and Parsch 2006). In Drosophila, primary spermatocytes express basal transcription factors that are paralogs of conserved transcription factors that are expressed throughout the entire body (Li et al. 2009). In addition, male-biased functions seem common among duplicates acquired through reverse transcription (Bai et al. 2007, 2008), with sex-specific selection shaping the expression and location of retrotransposed genes (Betran et al. 2002; Meisel et al. 2009; Vibranovski et al. 2009b; Zhang et al. 2010). Although suggestive, these examples address neither the generality of this pattern nor the tendency of duplicates to adopt sex-specific patterns of expression. We do not know how often duplicate copies diverge in sex-biased expression patterns. Finally, it is unclear what factors may prevent duplicates from developing sex-specific patterns of expression.
In this study, we used microarray data to assess two aspects of sex-specific patterns of duplicate expression. First, we tested the hypothesis that selection can co-opt duplicates in a sex-specific manner. We analyzed the number of paralog pairs with discordant versus concordant expression patterns. Discordant expression patterns would suggest a role for duplications in the evolution of sexual dimorphism. To corroborate the within-species patterns, we also compared the between-species expression patterns of the singleton ortholog in D. ananassae and the duplicates in D. melanogaster. We also looked at sexual dimorphism in the use of duplicates in the gonads versus the soma. Second, we tested for factors that might be associated with expression divergence. For example, because nucleotide changes can accompany changes in expression patterns (Wagner 2000; Castillo-Davis et al. 2004; Gu et al. 2004; Li et al. 2005), we analyzed whether duplicate genes with discordant expression states have elevated substitution rates relative to duplicate genes with concordant expression patterns. We also looked at genomic relocation to explore potential constraints on the evolution of dimorphism following duplication events. Coregulation due to spatial clustering (Boutanaev et al. 2002; Cusack and Wolfe 2007; Mezey et al. 2008; Kaessmann et al. 2009; Gallach et al. 2010) may inhibit expression divergence between paralogs and thus prevent sex-specific specialization.
Materials and Methods
HOMOLOGY AND SEQUENCE INFORMATION
We used data from the recently sequenced 12 Drosophila genomes to identify genes that are paralogous in Drosophila melanogaster (Drosophila 12 Genomes Consortium 2007). We used only gene families containing exactly two members in D. melanogaster and excluded gene families containing more than two genes in our analyses. We used both D. melanogaster lineage-specific paralogs and D. melanogaster paralogs present in multiple Drosophila species. To find the lineage-specific duplicates, we used phylogenies available from the Hahn laboratory website (http://sites.bio.indiana.edu/∼hahnlab/Databases.html) and selected duplicates for which duplication appeared to have occurred after D. melanogaster diverged from the D. sechellia and D. simulans clade (Hahn et al. 2007). If the lineage-specific paralog families had more than two members, we included only the two duplicates at the tip of the gene tree. For all families, we removed genes that were potentially misidentified as paralogous; for example, a duplicate pair of genes could have different gene names but they might still share a common secondary Flybase ID. To prevent this ambiguity from biasing our results, such pairs were removed. If the gene had alternative splice forms, we used only the longest one.
For each duplicate pair, we aligned the sequences against each other using T-Coffee (Notredame et al. 2000) and calculated the number of substitutions per silent (S) and replacement (R) site between the two sequences using PAML (Yang 1997). For each pair of duplicates, we also calculated divergence in the region 1 kb upstream of the start codon. To do this, we extracted the 5′ untranslated regions (UTRs) for each gene and then used the intergenic DNA to obtain additional upstream DNA. Using sequence information from FlyBase, the 5′ UTRs were designated as the region from the maximum location of the gene to the start codon for genes located on the positive strand (or from the minimum location to the start codon for genes on the negative strand). The 5′ UTR and intergenic DNA were masked using RepeatMasker (Smit et al. 2004), and the repeat and low-complexity regions were removed. We aligned these upstream regions between the duplicate copies and calculated divergence using the Tajima–Nei correction method in the distmat function of the EMBOSS toolkit (Rice et al. 2000). To account for the neutral mutation rate, we divided the per-site divergence estimates by the value of S calculated from the corresponding coding region.
SEX-BIASED GENE EXPRESSION
Each pair of duplicate genes was put in one of six categories based on their joint expression pattern. Expression status could be the same between duplicate copies: both copies unbiased (UU), female biased (FF), or male biased (MM). Alternatively, expression could be dissimilar between copies: one copy unbiased and one female biased (UF), one copy unbiased and one male biased (UM), or finally, one copy male biased and one female biased (MF). To assign duplicate pairs into one of these groups, we used expression data from two studies (Ayroles et al. 2009; Wyman et al. 2010), which were chosen for their large sample sizes of male and female whole-body hybridizations. The detection of sex-bias depends on the statistical approach and power of a given experimental design, which may partly explain differences among studies in the number of sex-biased genes identified. Furthermore, because evolutionarily recent gene duplicates are expected to be more similar in expression, minor differences in expression are more likely to be detected when sample size is large. Next, we used the fact that these two studies did not always agree in their classifications of unbiased, male-biased, and female-biased expression for each gene as a way to understand expression variation. We compared how often expression patterns were similar between the Ayroles et al. (2009) and Wyman et al. (2010) datasets for duplicate versus singleton genes. If duplicates can be used for sex-specific expression and if duplicates in general show greater ability to vary in expression, these two studies should disagree more often in their categorizations of sex-bias in the duplicate genes than in the singletons. This variation is potentially interesting because selection can act upon it, leading to the evolution of new expression patterns between paralogs.
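The six-category scheme above can be illustrated with a minimal sketch. The single-letter labels follow the abbreviations in the text (U, F, M); the function names and the convention for ordering the two letters are assumptions made for illustration, not part of the original analysis.

```python
# Order labels so that pair codes match the abbreviations used in the text:
# UU, MM, FF (concordant) and UM, UF, MF (discordant).
_LABEL_ORDER = {"U": 0, "M": 1, "F": 2}


def pair_category(bias_a, bias_b):
    """Return the joint expression category for two paralogs.

    Each argument is "U" (unbiased), "F" (female biased), or "M" (male biased).
    The result does not depend on the order of the two copies.
    """
    ordered = sorted((bias_a, bias_b), key=_LABEL_ORDER.__getitem__)
    return "".join(ordered)


def is_concordant(category):
    """True when both copies share the same expression status."""
    return category[0] == category[1]
```

For example, a pair with one female-biased and one male-biased copy is assigned to MF regardless of which copy is listed first.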
We recognize an important point regarding the limitations of microarray expression data for duplicates. If related duplicates have very similar sequences, the microarray probes may bind to the DNA from all recent duplicates, yielding an averaged expression value and inaccurate extent of sex-bias for that duplicate. The inability of microarrays to distinguish minor sequence differences will underestimate the frequency with which duplication results in sex-specific expression. This makes our analyses conservative with respect to the role of duplications in the evolution of sexual dimorphism. However, we believe that cross-hybridization is a minor concern for our analysis: when using an array designed for one species with cDNA from another species, sequence divergence as low as ∼1% will regularly show significant expression differences (Gilad et al. 2005). Expression divergence is only expected to increase with sequence divergence. Because the average sequence identity between paralogs (calculated as the percent of sites that were identical) in our study was ∼48%, there should be even greater binding specificity to the appropriate spot for related paralogs within species than for orthologs from two different species. Finally, although categorizing duplicates on the basis of their sex-bias is a coarse measure of their functional divergence, this will again lead to an underestimate rather than overestimate of the amount of sex-specific specialization.
Finally, we looked more closely at the potential transitions in sex-biased expression between singletons from an outgroup species, D. ananassae, and paralogs from D. melanogaster to see whether between-species patterns corroborated the within-species patterns. We found duplicates specific to the melanogaster-subgroup (clade containing D. yakuba, D. erecta, D. melanogaster, D. simulans, and D. sechellia). We chose young gene families to limit the effect that factors such as expression drift and turnover in the pattern of sex-specific selection might have on older paralogs. All expression data came from species-specific microarray data previously published for D. melanogaster and D. ananassae (Zhang et al. 2007); this analysis is independent of the analyses based upon the Ayroles et al. (2009) and Wyman et al. (2010) data.
LACK OF DIFFERENTIATION BETWEEN DUPLICATES
Many duplicate pairs had concordant expression patterns, and hence, no evidence for sex-specific expression divergence. We investigated genomic location of duplicates as a potential reason for the lack of expression divergence. Tandem or segmental duplications are more likely to result in duplicates that are under the control of the same regulatory elements, constraining the evolution of new expression patterns. By contrast, if a duplicate copy moves from the original location to new location, expression may be free to diverge. To test for this possibility, we analyzed the number of chromosomal arm relocations in gene pairs with the same versus different expression patterns. The locations of all paralogs were obtained from FlyBase by parsing FASTA file headers.
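As a rough sketch of the relocation check, the chromosomal arm can be pulled from a FASTA header and compared between the two members of a pair. The header layout assumed here (a `loc=ARM:coordinates` field) and the example headers in the usage note are illustrative assumptions; the source does not describe the exact parsing used.

```python
import re

# Assumed header layout: "... loc=2L:7529..9484; ..." (illustrative only).
_LOC_PATTERN = re.compile(r"loc=([^:;\s]+):")


def chromosome_arm(fasta_header):
    """Extract the chromosomal arm (e.g. "2L", "X") from a FASTA header line."""
    match = _LOC_PATTERN.search(fasta_header)
    if match is None:
        raise ValueError("no loc= field found in header")
    return match.group(1)


def on_different_arms(header_a, header_b):
    """True when the two paralogs are annotated on different chromosomal arms."""
    return chromosome_arm(header_a) != chromosome_arm(header_b)
```

With two hypothetical headers such as `>geneA type=gene; loc=2L:100..900;` and `>geneB type=gene; loc=3R:50..700;`, `on_different_arms` returns `True`.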
SEX-BIASED EXPRESSION AMONG SINGLETONS AND DUPLICATES
The Ayroles et al. (2009) dataset categorized a greater proportion of singletons as sex-biased (88%) compared to the Wyman et al. (2010) dataset (57%) (Table 1). Interestingly, the proportion of sex-biased genes was similar for the singletons and duplicates within each dataset: sex-biased genes comprised 87% of the duplicate genes in the Ayroles et al. data and 57% of the duplicates in the Wyman et al. data (Table 1). The greater number of sex-biased genes in the Ayroles et al. data is due to the detection of more female-biased genes with relatively weak sex-bias (Table 1 and Figs. S1 and S2). This detection resulted in a greater number of female-biased than male-biased duplicates (Table 1) in the Ayroles et al. (2009) study.
Table 1. Gene frequencies among singletons and duplicates. Both datasets show that the relative proportions of unbiased genes and sex-biased genes are the same among the singletons and among the duplicates. However, the relative proportions of the male-biased and female-biased genes are different. Both datasets show that duplicates have a higher proportion of male-biased genes and a lower proportion of female-biased genes compared to the singletons.
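| Expression class | Ayroles et al. singletons | Ayroles et al. duplicates | Wyman et al. singletons | Wyman et al. duplicates |
| --- | --- | --- | --- | --- |
| Unbiased | 814 (12%) | 121 (13%) | 2850 (43%) | 387 (43%) |
| Female-biased | 4303 (66%) | 459 (51%) | 2421 (36%) | 197 (22%) |
| Male-biased | 1456 (22%) | 318 (35%) | 1368 (21%) | 310 (35%) |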
Although the proportion of sex-biased genes was similar between the duplicate and singleton pools, the relative proportion of male-biased and female-biased genes was not (Table 1). Both datasets had disproportionately more male-biased genes among the duplicates than among the singletons (binomial tests: Ayroles et al.: χ² = 61.41, df = 1, P < 0.001; Wyman et al.: χ² = 74.35, df = 1, P < 0.001) and disproportionately fewer female-biased genes among the duplicates than among the singletons (binomial tests: Ayroles et al.: χ² = 21.1, df = 1, P < 0.001; Wyman et al.: χ² = 47.1, df = 1, P < 0.001). By contrast, the relative proportion of unbiased genes did not differ significantly between the singletons and duplicates for either of the datasets.
SEX-SPECIFIC EXPRESSION DIVERGENCE BETWEEN DUPLICATES
Among all of the paralog pair types (Table 2), we find that nearly half have discordant expression patterns: 44.3% in the Ayroles et al. (2009) and 49.8% in the Wyman et al. study. This suggests that there is potential for duplication to facilitate the evolution of dimorphism.
Table 2. The observed and expected frequency of duplicate pair types. We randomized gene pairs to construct a null distribution to test against; see text for details. Percentages indicate the fractional abundance of that pair type within each dataset; parentheses are absolute numbers. Ayroles et al.: χ² = 50.9, df = 5, P < 0.0001; Wyman et al.: χ² = 51.7, df = 5, P < 0.0001.
| Pair type | Ayroles et al. observed | Ayroles et al. expected | Wyman et al. observed | Wyman et al. expected |
| --- | --- | --- | --- | --- |
| Female-biased, Female-biased | 34% (151) | 26% | 9% (42) | 5% |
| Male-biased, Male-biased | 18% (83) | 12.5% | 17% (78) | 12% |
| Unbiased, Unbiased | 4% (16) | 2% | 23% (104) | 30% |
| Female-biased, Unbiased | 10% (47) | 14% | 15% (69) | 19% |
| Male-biased, Unbiased | 9% (42) | 9.5% | 25% (110) | 19% |
| Female-biased, Male-biased | 24% (110) | 36% | 10% (44) | 15% |
Because sex-biased gene expression can evolve rapidly within species (Meiklejohn et al. 2003; Ranz et al. 2003; Gibson et al. 2004; Baker et al. 2007), it is hard to discern an accurate null expectation for whether discordant pairs are overrepresented relative to concordant pairs. Following a duplication event, the daughter gene may have the same or different expression bias as the parent gene copy. In addition, one or both copies may undergo several subsequent transitions in sex-biased expression. It is currently unclear what these transition probabilities are and modeling such conditional probabilities is beyond the scope of this study. Rather, we sought to construct a relatively assumption-free null distribution by using a randomization approach similar to that used by Mikhaylova et al. (2008). We randomly constructed new gene pairs from the actual pool of duplicates and tabulated the resulting expression pair types (e.g., MM, UM, MF, etc.). We used the averaged proportions of the expression pair types from 10,000 randomizations as our null distribution. We find that FF, MM, and UU expression pairs are overrepresented in the Ayroles et al. (2009) dataset whereas only FF and MM pairs were overrepresented in the Wyman et al. (2010) dataset. Both datasets showed a deficit of UF and MF gene pairs (Table 2).
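The randomization described above can be sketched as follows. This is a minimal illustration under stated assumptions: genes are re-paired by shuffling the pool without replacement, and the function and parameter names are hypothetical. The source does not specify these implementation details.

```python
import random
from collections import Counter

# Label order chosen so pair codes read UU, MM, FF, UM, UF, MF.
_LABEL_ORDER = {"U": 0, "M": 1, "F": 2}


def null_pair_proportions(gene_biases, n_randomizations=10_000, seed=0):
    """Average proportion of each pair type over random re-pairings.

    gene_biases: one label ("U", "F", or "M") per duplicate gene in the pool.
    Each randomization shuffles the pool and pairs adjacent genes.
    """
    genes = list(gene_biases)
    if len(genes) < 2 or len(genes) % 2 != 0:
        raise ValueError("need a non-empty pool with an even number of genes")
    rng = random.Random(seed)
    n_pairs = len(genes) // 2
    totals = Counter()
    for _ in range(n_randomizations):
        rng.shuffle(genes)
        for i in range(0, len(genes), 2):
            pair = sorted(genes[i:i + 2], key=_LABEL_ORDER.__getitem__)
            totals["".join(pair)] += 1
    return {cat: n / (n_pairs * n_randomizations) for cat, n in totals.items()}
```

The returned proportions play the role of the expected frequencies against which the observed pair-type counts are compared.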
VARIATION OF SEX-BIAS IN SINGLETONS AND DUPLICATES
We observed that singleton genes have higher consistency in expression pattern between two different microarray datasets than duplicate genes do. The Ayroles et al. (2009) and Wyman et al. (2010) studies assigned identical expression patterns to 59% of singletons (3628 out of 6119 genes), but to only 56% of genes that have paralogous copies (491 out of 884 genes). This is a significant difference (binomial test: χ² = 4.33, df = 1, P = 0.038) and suggests greater expression variability across studies for duplicates than singletons.
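The comparison of agreement rates can be checked from the counts given in the text. The source does not state the exact procedure behind the reported statistic; as one possibility, a chi-square test on the 2 × 2 table of counts with Yates' continuity correction gives a value of approximately 4.33 for these counts. The function name below is an assumption for illustration.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square statistic (df = 1) for the 2 x 2 table [[a, b], [c, d]],
    with Yates' continuity correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # The correction shrinks |ad - bc| by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


# Singletons: 3628 of 6119 agree; duplicates: 491 of 884 agree.
chi2 = yates_chi_square(3628, 6119 - 3628, 491, 884 - 491)
```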
DUPLICATES IN THE MELANOGASTER SUBGROUP
We identified 15 duplicate families possessing exactly two paralogs that are specific to the melanogaster subgroup. We compared the statistically quantified expression status of the D. ananassae singleton orthologs to the expression status of the related D. melanogaster paralogs (Table 3). We find that the between-species pattern supports the within-species pattern: unbiased- and male-biased singletons potentially give rise to male-biased genes in a disproportionate manner. By contrast, female-biased genes appear to be less common in the pool of singletons and paralogs, at least for this set of genes.
Table 3. Sex-bias in an outgroup species and in the melanogaster subgroup. Ten of the 15 gene families had unbiased expression in the Drosophila ananassae singleton ortholog. Of these 10 orthologous groups, five families had male-biased expression in one of the D. melanogaster paralogs.
| D. ananassae singleton | D. melanogaster paralogs | Number of families |
| --- | --- | --- |
| unbiased | unbiased, unbiased | 5 |
| male-biased | unbiased, male-biased | 1 |
| female-biased | unbiased, unbiased | 1 |
We compared the sex-bias of recent D. melanogaster-specific duplicates with that of older nonlineage-specific duplicates. This comparison can reveal whether more recent duplicate members share more similar expression patterns than older duplicates. We used the D. melanogaster lineage-specific duplicates found by another study (Hahn et al. 2007); if a duplicate family specific to D. melanogaster had multiple members, we used only the two most recent genes and assigned an expression status to the lineage-specific duplicates. Because of differences in platforms and genome annotation at the time of the studies, the Wyman et al. (2010) dataset had expression information available for only 23 lineage-specific duplicate pairs whereas the Ayroles et al. (2009) dataset had expression for 50 duplicate pairs (Table 4). Both datasets indicate that pairs derived from recent lineage-specific duplications more often have concordant rather than discordant patterns of expression. In the Ayroles et al. data, 86% of pairs were concordant among the lineage-specific duplicates, whereas 52% were concordant among the nonlineage-specific duplicates (binomial test: χ² = 19.6, df = 1, P < 0.0001). In the Wyman et al. data, 61% of pairs were concordant among the lineage-specific duplicates and 50% (210 of 424; Table 4) were concordant among the nonlineage-specific duplicates, but this difference was not significant (binomial test: χ² = 0.71, df = 1, P = 0.39). For S < 0.25, saturation effects are minimal and S is a reasonable proxy for the age of a duplication event (Lynch and Conery 2000, 2001). For this class of duplicates, we confirmed that the lineage-specific duplicates used in our study are younger (smaller S values) than the nonlineage-specific duplications (F(1,31) = 7.37, P = 0.011).
Table 4. Drosophila melanogaster duplications. We assigned an expression status to all of the D. melanogaster specific and nonlineage specific duplication events. The frequency of gene pairs with the same expression status is more common in the lineage-specific genes than in the set of duplicate genes that are not specific to D. melanogaster.
| Pair type | Ayroles et al. lineage-specific | Ayroles et al. nonlineage-specific | Wyman et al. lineage-specific | Wyman et al. nonlineage-specific |
| --- | --- | --- | --- | --- |
| Female-biased, Female-biased | 22 (44%) | 129 (32%) | 7 (30%) | 35 (8%) |
| Male-biased, Male-biased | 17 (34%) | 66 (17%) | 2 (9%) | 76 (18%) |
| Unbiased, Unbiased | 4 (8%) | 12 (3%) | 5 (22%) | 99 (23%) |
| Female-biased, Unbiased | 1 (2%) | 46 (12%) | 3 (13%) | 66 (16%) |
| Male-biased, Unbiased | 1 (2%) | 41 (10%) | 6 (26%) | 104 (25%) |
| Female-biased, Male-biased | 5 (10%) | 105 (26%) | 0 | 44 (10%) |
SEQUENCE DIVERGENCE AND EXPRESSION
Because changes in gene expression might reflect changes in the underlying sequence, we tested the hypothesis that dissimilar expression patterns between duplicate genes are associated with higher substitution rates. The substitutions per replacement site (R), substitutions per synonymous site (S), and divergence (R/S) between the coding regions of the duplicate pairs were analyzed for the six categories. We used S as a proxy for age of the duplication event (Lynch and Conery 2000) (but we note that gene conversion may drive down S, and hence the apparent age, in some gene pairs [Casola et al. 2010]). We noticed that the 516 duplicate pairs in this study fell into roughly three different age groups (Fig. S3). Sixty-six pairs were relatively recent duplications (S < 1); 204 pairs were old duplications (1 < S < 3). The remaining 246 pairs represented extremely ancient duplication events (S > 3). To avoid the problem of saturation in the oldest duplicates, we assessed R for pairs with S ≤ 3. We quantified variation in R as a function of S and duplicate pair type (e.g., MM, UM, MF, etc.) in a two-way analysis of variance. S correlated positively with R, as expected, for both datasets (Wyman et al.: F(1,191) = 70.72, P < 0.0001; Ayroles et al.: F(1,191) = 70.69, P < 0.0001). However, neither duplicate pair type nor its interaction with S significantly explained variation in R in either dataset.
We also investigated whether some of the differentiation in expression bias between pairs could be attributable to cis-regulatory differences by looking at sequence divergence in the 1 kb region upstream of the start codon for each duplicate gene. There was no difference in the divergence rates of the upstream region (corrected by the coding region S) among paralog pair expression types for either the Ayroles et al. (2009) or the Wyman et al. (2010) datasets (both P > 0.10). There was also no difference in the divergence rates between gene pairs with concordant versus discordant expression.
We calculated the degree of sex-biased expression (SB = Male / (Female + Male)) for all duplicate genes and looked at the relative contribution of S and SB of one duplicate on SB of the related duplicate. Duplicate genes in a family were randomly assigned to be either the independent or dependent variable in a multiple regression analysis (results did not differ when designations were switched). Members of a paralog family correlate positively for SB (Wyman et al.: F(1,229) = 84.26, P < 0.0001; Ayroles et al.: F(1,205) = 70.26, P < 0.0001). This is consistent with the overall observation that related duplicates can share categorical expression status (Table 2). There was a significant negative interaction between S and SB of the duplicate genes assigned as an independent variable (Wyman et al.: F(1,229) = 19.61, P < 0.0001; Ayroles et al.: F(1,205) = 15.59, P = 0.0001). As S increases, the correlation of SB between duplicates decreases. This is consistent with the observation that duplicate pairs with low S are more likely to share the same expression status (Table 4).
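The sex-bias index defined above, SB = Male / (Female + Male), ranges from 0 (expression only in females) to 1 (expression only in males), with 0.5 indicating equal expression. A minimal sketch, assuming non-negative expression values on a linear scale (the function name is hypothetical):

```python
def sex_bias(male_expr, female_expr):
    """SB = Male / (Female + Male) for one gene.

    Assumes non-negative, linear-scale expression values.
    """
    if male_expr < 0 or female_expr < 0:
        raise ValueError("expression values must be non-negative")
    total = male_expr + female_expr
    if total == 0:
        raise ValueError("SB is undefined when both values are zero")
    return male_expr / total
```

For instance, a gene expressed three times as strongly in males as in females has SB = 0.75.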
We found evidence that genomic relocation affects the probability that gene members of a duplicate family retain the same expression pattern. In the Ayroles et al. (2009) dataset, members of a duplicate pair located on different chromosomal arms are more likely to have a different expression pattern than the same expression pattern (binomial test: 76 gene pairs with different expression out of 122 pairs; P= 0.008). In addition, members on the same chromosomal arm are more likely to have the same expression pattern than a different expression pattern (binomial test: 204 gene pairs with same expression out of 327; P < 0.001). These patterns were not statistically significant in the Wyman et al. (2010) dataset, but were nominally consistent with results from the Ayroles et al. (2009) dataset.
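The binomial tests above ask whether the split between same and different expression patterns departs from an even 50:50 expectation. A minimal sketch of an exact two-sided binomial test follows; the null probability of 0.5 and the two-sided definition (summing all outcomes no more probable than the observed one) are assumptions, since the source does not state them.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial P value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count under the null probability p.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    probs = [comb(trials, k) * p ** k * (1 - p) ** (trials - k)
             for k in range(trials + 1)]
    observed = probs[successes]
    # Small tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-7))
    return min(1.0, total)


# 76 of 122 pairs on different arms had different expression patterns.
p_different_arms = binomial_two_sided_p(76, 122)
```

Under these assumptions the result for 76 of 122 is of the same order as the reported P = 0.008.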
Oftentimes, autosomal paralogs show increased male-biased expression whereas the related X-linked paralogs show decreased male-biased expression (Betran et al. 2002; Wu and Xu 2003). It is possible that the effect of genomic location on expression divergence is driven by the presence of X-autosome duplicate pairs. To test this possibility, we removed such gene pairs from the analysis. The Ayroles et al. (2009) dataset showed that related duplicates located on the same chromosomal arm were still more likely to have the same, rather than a different, expression pattern (binomial test: 167 gene pairs with same expression out of 275; P = 0.0004). We also looked at the role of retrotransposition in expression patterns; because regulatory regions are not usually copied during retrotransposition, daughter duplicates might be more likely to acquire new expression patterns. After we removed 29 gene families that had retroposed duplicates, genomic location had no effect on the probability of having the same expression pattern.
TISSUE DIFFERENCES IN DUPLICATE USE
We looked at sexual dimorphism in the use of duplicates in sex-specific tissues. We used expression comparisons of gonads and gonadectomized males and females (Parisi et al. 2004). Genes were considered gonad- or “soma-” (whole body minus gonads) specific if there was at least a two-fold expression difference between the two tissues; two-fold cutoffs mimic patterns seen at higher threshold cutoffs (Wyman et al. 2010). Using this designation, we found that among all duplicates, 138 were male-soma-specific, 164 were testis-specific, 79 were female-soma-specific, and 10 were ovary-specific; of these, 8, 18, 3, and 1 duplicates, respectively, were retrogenes (as identified by Vibranovski et al. 2009a). Among the 1:1 singleton orthologs (shared among the 12 sequenced Drosophila species), we observe the same rank order of gene numbers as among the duplicates. In D. melanogaster, there are 631 male-soma-specific 1:1 orthologs, 760 testis-specific orthologs, 287 female-soma-specific orthologs, and 164 ovary-specific orthologs. The proportion of ovary-specific genes is smaller among the duplicates than among the 1:1 orthologs (χ² = 24.28, P < 0.0001). By contrast, the proportion of testis-specific genes is the same between the duplicates and orthologs (χ² = 0.14, P = 0.91). Finally, among the duplicate genes, there were pairs with concurrent tissue-specificity: 28 male-soma-specific families, 35 testis-specific duplicate families, 15 female-soma-specific families, and one ovary-specific family.
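The two-fold rule for tissue specificity can be written as a small classifier. This is a sketch that assumes positive expression values on a linear scale for the gonad and the gonadectomized body of one sex; the function name, return labels, and the adjustable `fold` parameter are illustrative assumptions.

```python
def tissue_specificity(gonad_expr, soma_expr, fold=2.0):
    """Classify a gene as "gonad", "soma", or "neither" within one sex.

    A gene is tissue specific when expression in one tissue is at least
    `fold` times its expression in the other tissue.
    """
    if gonad_expr <= 0 or soma_expr <= 0:
        raise ValueError("expression values must be positive")
    if gonad_expr >= fold * soma_expr:
        return "gonad"
    if soma_expr >= fold * gonad_expr:
        return "soma"
    return "neither"
```

Applied to male samples, "gonad" corresponds to testis-specific and "soma" to male-soma-specific genes; applied to female samples, the labels correspond to ovary-specific and female-soma-specific genes.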
Gene duplication is a key step in the evolution of functional novelty. Here, we have asked whether the sexes can use additional gene copies to facilitate the evolution of sexual dimorphism. This study represents a first general attempt at describing sex-specific expression by explicitly examining how paralogs are partitioned between the sexes across the genome. We find that some patterns of sex-biased expression of duplicate gene pairs in D. melanogaster are consistent with this process. Previous studies have shown that duplications are associated with the evolution of male-biased gene expression (Cutter and Ward 2005; Gnad and Parsch 2006); we find additional support for this pattern through the excess of MM and UM paralog pairs. By contrast, female-biased genes are not as common among duplicates, with MF and UF paralog pairs being underrepresented. Furthermore, we find that comparisons between species support a pattern in which mainly unbiased or male-biased genes give rise to more male-biased genes. Female-biased genes are nearly absent among the singleton orthologs examined in this study (Table 3). We have corroborated prior reports that the testes appear to requisition duplicates readily (Parsch et al. 2005; Belote and Zhong 2009; Gallach et al. 2010). We have extended these previous studies by showing that the male body in general can co-opt duplicates; even the male-soma uses more duplicates than the female-soma or the ovaries. By looking at how related paralogs are partitioned between the sexes, we have confirmed that duplicates are more commonly used for male-biased expression and less commonly used for female-biased expression. These observations are consistent with the notion that gene duplication can provide the raw materials to relieve intralocus sexual conflict.
SEXUAL DIMORPHISM IN PARALOG EXPRESSION
Although recent evidence suggests that not all sex-biased genes currently experience intralocus antagonism (Innocenti and Morrow 2010), it is hard to ascertain what proportion of sex-biased genes has experienced sexual conflict in the past. At least some of the antagonism could have been relieved by duplication events that permit sex-specific specialization. Our data show that ∼50% of all duplicate pairs have discordant expression patterns (Table 2), which implies that intralocus sexual conflict could have had a large impact on duplicate expression divergence. We demonstrate that paralog coregulation patterns can be examined between the sexes, much like they have been analyzed for the testis versus other tissues (Mikhaylova et al. 2008). We find that sex-specific duplicate use may have occurred through an increase in the number of male-biased genes and a decrease in the number of female-biased genes. The pool of duplicates has proportionally more male-biased genes and fewer female-biased genes than the pool of singletons (Table 1). We note that the Ayroles et al. (2009) dataset has a greater number of female-biased duplicates than previous studies (Gnad and Parsch 2006). This is in part due to the detection of more genes with weak female-bias in the Ayroles et al. (2009) study compared to the Wyman et al. (2009) study (Figs. S1 and S2).
Overall, these results are consistent with previous studies that show male-biased and male-specific genes are associated with duplications (Cutter and Ward 2005; Gnad and Parsch 2006; Bai et al. 2007; Belote and Zhong 2009; Li et al. 2009). Interestingly, UF and MF pairs were underrepresented, suggesting that duplicate use for female functions is uncommon. Female-biased duplicates may be eliminated by selection more often (Zhang et al. 2007) or may have fewer opportunities to proliferate via mutation (Cardoso-Moreira and Long 2010). However, the presence of UM pairs may yet represent female specialization through the lack of differentiation. Many sexually dimorphic traits manifest their naturally selected (monomorphic) state in females whereas only males in high condition express the sexually selected (dimorphic) state (e.g., Rowe and Houle 1996; Bonduriansky and Rowe 2005; Wyman et al. 2010). Many studies have concluded that males and male-related functions experience sex-specific selection more often and more strongly than females and female-related functions, both among individual genes (Meiklejohn et al. 2003; Hambuch and Parsch 2005; Proeschel et al. 2006; Ellegren and Parsch 2007; Zhang et al. 2007, 2010) and among phenotypic traits (Darwin 1871; Andersson 1994; Hoekstra et al. 2001; Kingsolver et al. 2001). Sexual selection likely drives both patterns (Singh and Kulathinal 2005).
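The pair labels used in this section (MM, UM, MF, UF) and the notion of discordant expression can be made concrete with a small sketch. It assumes each paralog has already been assigned a single sex-bias category: M (male-biased), F (female-biased), or U (unbiased). The helper name `pair_label` and the example pairs are hypothetical and do not come from the datasets analyzed here.

```python
from collections import Counter

# Ordering chosen so that labels match the notation in the text (UM, MF, UF).
CATEGORY_ORDER = "UMF"


def pair_label(bias_a: str, bias_b: str) -> str:
    """Return an order-independent label for a paralog pair, e.g. 'UM'."""
    return "".join(sorted((bias_a, bias_b), key=CATEGORY_ORDER.index))


# Hypothetical sex-bias calls for six paralog pairs.
pairs = [("M", "M"), ("U", "M"), ("F", "M"), ("U", "U"), ("F", "U"), ("M", "U")]

counts = Counter(pair_label(a, b) for a, b in pairs)
# A pair is discordant when its two members fall in different categories.
discordant_fraction = sum(a != b for a, b in pairs) / len(pairs)

print(dict(counts))
print(f"discordant fraction: {discordant_fraction:.2f}")
```

Comparing such observed counts with the counts expected from the overall frequencies of M, F, and U genes is one way to ask whether a pair type is over- or underrepresented; the expected-count model itself is not specified in this section and is left out of the sketch.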
DUPLICATE EXPRESSION IN TISSUES
The degree of sex-biased expression correlates negatively with tissue breadth; sex-biased genes may have pleiotropic consequences limiting their widespread expression (Mank et al. 2008). It is therefore conceivable that duplication can mitigate sexual antagonism by providing extra gene copies that can spatially or temporally specialize in male or female functions without disrupting existing expression networks (Force et al. 1999; Gu et al. 2004; Huminiecki and Wolfe 2004; Gallach and Betran 2011; but see also Hosken 2011). For example, in D. melanogaster, 12 of 33 proteins that make up the proteasome (i.e., protein-degrading machinery) expressed in male testes are paralogs of genes that have much broader expression (Belote and Zhong 2009). We confirm that more duplicate genes and gene families are requisitioned for specialization in the testes (Betran et al. 2002; Bai et al. 2008; Mikhaylova et al. 2008; Belote and Zhong 2009; Vibranovski et al. 2009b), a pattern also found in mammals (Kaessmann 2010). However, more duplicates are also used in the male-soma than in the female-soma and ovaries. If duplication can mitigate the constraints of pleiotropy, it is surprising that the female body does not use duplicates to a greater extent. It may be that the stronger sexual selection on male function in D. melanogaster explains this pattern. These conclusions are tentative and require further functional characterization of duplicates (Gallach and Betran 2011), especially because the female reproductive tract of other Drosophila species (Kelleher and Markow 2009; Kelleher and Pennington 2009) appears to requisition duplicates in the context of sexual antagonism.
Although we are interested in the role of duplications in intralocus sexual conflict, we acknowledge that sexual antagonism is only the most general explanation for expression divergence. Two particular examples of differential sex-specific selection include meiotic sex chromosome inactivation (MSCI) and dosage compensation. During male meiosis, the X chromosome is transcriptionally silenced, discouraging the buildup of male-specific genes on the X chromosome (Wu and Xu 2003; Hense et al. 2007; Vibranovski et al. 2009a). As such, genes whose retrotransposed copies have left the X are common in flies (Betran et al. 2002; Meisel et al. 2009; Vibranovski et al. 2009b; Zhang et al. 2010) and mammals (Emerson et al. 2004; Vinckenbosch et al. 2006). Some species equalize expression products in the heterogametic sex so that X:A = 1 despite having only one X chromosome. Such dosage compensation may deter the influx of newly duplicated genes onto the X if sexual antagonism over expression is found on the X (Mank et al. 2011). Both MSCI and dosage compensation rely upon the particular selective forces acting upon the sex chromosomes. Yet, sexually antagonistic selection over the expression of duplicates can occur whenever male and female optima differ.
Nonadaptive explanations that can account for expression divergence include differences in expression drift, mutation rates, and dosage sensitivities of paralogs. Although we observed that older paralog pairs have more expression differences compared to younger lineage-specific pairs (Table 4), it is difficult to discern how much divergence may have occurred also by expression drift (Khaitovich et al. 2004). Similarly, the excess of male-biased paralogs may simply be a by-product of the higher mutational rates that they experience (Cardoso-Moreira and Long 2010), suggesting that the excess of male-biased duplicates might not require invoking selection. Finally, although duplication may enable sex-biased genes to proliferate, the genome may better tolerate the acquisition of novel sex-bias in duplicates if sex-bias is nonessential (Mank and Ellegren 2009). Such tolerance may be more common in duplicates, which like sex-biased genes show higher expression variance and flexibility (e.g., Gu et al. 2004; Huminiecki and Wolfe 2004). Thus, male-biased duplicate genes may have higher origination or fixation rates, or both (Gnad and Parsch 2006). Disentangling the relative contribution of adaptive and nonadaptive processes to the enrichment of male-biased genes and elucidating the time scales at which such processes operate (Long et al. 2003; Kaessmann et al. 2009; Kaessmann 2010) require further work.
MOLECULAR EVOLUTION OF PARALOG PAIRS
We hypothesized that gene expression differences between paralog members could be explained by underlying differences in sequence divergence. We found little compelling support for this effect. Although several studies have found a correlation between expression divergence and sequence divergence between duplicates, the relationship is not a consistent one and requires additional study (Wagner 2000; Gu et al. 2002; Castillo-Davis et al. 2004; Li et al. 2005). However, we did observe that duplicate pairs found on different chromosomal arms are more likely to have distinct expression patterns. Conversely, duplicate pairs on the same arm tend to have equivalent patterns of sex-bias. These results are consistent with the spatial clustering of coregulated genes and the greater expression divergence among relocated genes found in previous studies (Boutanaev et al. 2002; Miller et al. 2004; Bai et al. 2008; Mezey et al. 2008; Gallach et al. 2010). Yet, because our measure of functional divergence was so coarse (i.e., presence or absence of sex-bias), we still do not know how much divergence exists between genes with concordant expression patterns. Further characterization of individual duplicate pairs is required to assess functional divergence and sex-specific specialization.
We have compared patterns of duplicate use between males and females on a genome-wide scale. Male-biased gene expression is more common than female-biased gene expression among duplicates, opposite to the pattern observed in singletons. Dimorphism can increase through the acquisition or exaggeration of male-bias in paralogs compared to the singleton ortholog. Duplicates are used to a greater extent in the testes and male-soma compared to the ovaries and female-soma. Thus, duplication may aid the evolution of dimorphism primarily by fostering male-biased expression rather than by allocating one gene copy to each sex. Duplication events have the potential to weaken the intersexual genetic correlation, thereby aiding the evolution of dimorphism (Bonduriansky and Chenoweth 2009). However, it is still unclear whether duplication can completely resolve intralocus conflict (Hosken 2011). Furthermore, it is unknown how much sex-bias represents expression optimization. Understanding these processes and factors for duplicated genes may shed light on how dimorphism evolves in more complex polygenic phenotypes (Lande 1980).
Associate Editor: E. H. Morrow
We thank L. Moyle, M.W. Hahn, A.F. Agrawal, M.C. Wyman, and anonymous reviewers for helpful comments and suggestions. We thank G.V. Wilson and J. Montojo for assistance with Python. This work was supported by NSERC grants to L. Rowe and a University of Toronto Connaught Scholarship to M.J. Wyman.