Sexually dimorphic gene expression and transcriptome evolution provides mixed evidence for a fast-Z effect in Heliconius

Sex chromosomes have different evolutionary properties as compared to the autosomes due to their hemizygous nature. In particular, recessive mutations are more readily exposed to selection, which can lead to faster rates of molecular evolution. Here, we report patterns of gene expression and molecular evolution in the sex chromosomes of a group of tropical butterflies. We first improved the completeness of the Heliconius melpomene reference annotation, a neotropical butterfly with a ZW sex determination system. Then we sequenced RNA from male and female whole abdomens and female ovary and gut tissue to identify sex and tissue specific gene expression profiles in H. melpomene. Using these expression profiles we compare sequence divergence and polymorphism, the strength of positive and negative selection and rates of adaptive evolution for Z and autosomal genes between two species of Heliconius butterflies, H. melpomene and H. erato. We show that the rate of adaptive substitutions is higher for Z as compared to autosomal genes, but contrary to expectation it is also higher for male as compared to female biased genes. There is therefore mixed evidence that hemizygosity influences the rate of adaptive substitutions. Additionally, we find no significant increase in the rate of adaptive evolution or purifying selection on genes expressed in ovary tissue, a heterogametic specific tissue. Together our results provide limited support for fast-Z evolution. This contributes to a growing body of literature from other ZW systems that also provide mixed evidence for a fast-Z effect.

Sex chromosomes have different evolutionary properties as compared to the 2 3 autosomes due to their hemizygous nature. In particular, recessive mutations are 2 4 more readily exposed to selection, which can lead to faster rates of molecular 2 5 evolution. Here, we report patterns of gene expression and molecular evolution in 2 6 the sex chromosomes of a group of tropical butterflies. We first improved the 2 7 completeness of the Heliconius melpomene reference annotation, a neotropical 2 8 butterfly with a ZW sex determination system. Then we sequenced RNA from 2 9 male and female whole abdomens and female ovary and gut tissue to identify 3 0 sex and tissue specific gene expression profiles in H. melpomene. Using these 3 1 expression profiles we compare sequence divergence and polymorphism, the 3 2 strength of positive and negative selection and rates of adaptive evolution for Z 3 3 and autosomal genes between two species of Heliconius butterflies, H. 3 4 melpomene and H. erato.

5
We show that the rate of adaptive substitutions is higher for Z as compared to 3 6 autosomal genes, but contrary to expectation it is also higher for male as 3 7 compared to female biased genes. There is therefore mixed evidence that 3 8 hemizygosity influences the rate of adaptive substitutions. Additionally, we find no 3 9 significant increase in the rate of adaptive evolution or purifying selection on 4 0 genes expressed in ovary tissue, a heterogametic specific tissue. Together our 4 1 results provide limited support for fast-Z evolution. This contributes to a growing 4 2 body of literature from other ZW systems that also provide mixed evidence for a 4 3 fast-Z effect. The Classic approach to the MK test is robust to differences in mutation rates 2 8 9 and variation in coalescent histories across genomic locations (McDonald &  2  9  0 Kreitman, 1991). Inference of positive selection using the Classic approach of the 2 9 1 MK test is not robust, however, to the occurrence of slightly deleterious mutations 2 9 2 and demographic change. To account for these confounders, we used a 2 9 3 Modelling approach to estimate the strength of positive and purifying selection in 2 9 4 addition to the Classic approach described above, using the method of Eyre-2 9 5 Walker and Keightley (2009)  The Modelling approach uses the frequency distribution of polymorphism to 2 9 8 assess the distribution of deleterious mutations at functional sites. This 2 9 9 elaborates on the Classic approach of the MK test by modelling the distribution of 3 0 0 fitness effects (DFE) of deleterious non-synonymous mutations as a negative 3 0 1 Gamma distribution. The model is fitted to the synonymous and non-synonymous 3 0 2 site frequency spectra (SFS) and the expected dN/dS under near-neutrality is 3 0 3 inferred. The difference between the observed and expected dN/dS provides an 3 0 4 estimate of the proportion of adaptive non-synonymous substitutions (α). The per 3 0 5 mutation rate of adaptive substitutions is calculated as: and the per mutation rate of non-adaptive substitutions is calculated as: Gene expression level and π n /π s ratios 3 0 9 We calculated reads per kilobase per million (RPKM) as:

‫ݔ‬ ‫ܮ‬
To test whether gene expression level and chromosome type have a significant 3 1 1 effect on π n /π s ratios we used a multiple regression analysis. We established the 3 1 2 linear model: Where N c is the number of reads mapped to the genic feature, 3 1 4 N tot is the total number of reads mapped in the sample, and L c is the length of the 3 1 5 genic sequence in base pairs (Mortavazi et al., 2008). RPKM i is the mean RPKM 3 1 6 of gene i across the 10 individuals. Chromosome type is either autosome or sex 3 1 7 chromosome as assigned in Hmel2 reference genome (Davey et al., 2016) Table S1). We analysed data from two different time points and 3 5 3 from non-sex and sex-specific tissue separately (Treatment: Young and Old).
There is a clear separation of the 25 samples by tissue when we compare gene 3 5 5 expression profiles between them. In total, 51% of the total variance is explained 3 5 6 by the two first principal components. PC1 separates the samples by tissue and 3 5 7 explains 40% of variance. PC2 explains 11% of total variance and separates 3 5 8 samples by age (Supplementary Figure S3). Ovarian tissue clusters by age 3 5 9 more tightly than non-sex specific tissue (Gut) (Supplementary Figure S4A  Coding sequence divergence does not support a significant fast-Z effect 3 6 3 We first compared rates of Z and autosomal sequence divergence using dN/dS 3 6 4 comparisons of 1-1 orthologous genes between H. melpomene and H. erato.
The dN/dS ratio for the Z chromosome genes is not significantly higher than 3 6 6 dNdS for autosomal genes ( More highly expressed genes are more exposed to selection, so in a female 3 7 0 heterogametic organism with a fast-Z effect, genes with female-biased 3 7 1 expression are expected to have higher rates of amino acid substitution if dN/dS 3 7 2 is driven by positive selection. However, the dN/dS ratio of Z female biased 3 7 3 genes ( Significance indicated separately for All and for sex biased (female, male Increased strength of purifying selection on highly expressed genes 4 2 3 Patterns of diversity were however strongly associated with expression levels. 4 2 4 Using a multiple regression approach we found that functional genetic diversity, 4 2 5 π n, was significantly negatively correlated with expression level for both 4 2 6 autosomal (P < 0.01) consistent with increased purifying selection on highly 4 2 7 expressed genes (Supplementary Figure S2). We next explored patterns of adaptive evolution using: 1) the Classic MK test; 4 3 5 and 2) the Modelling approach which accounts for the effect of mildly deleterious 4 3 6 mutations. We computed: 1) the proportion of adaptive non-synonymous the Classic approach (Table 1) biased genes is not consistent with the expectations of fast-Z adaptation, which 4 5 4 would predict faster evolution of female biased genes due to hemizygosity 4 5 5 compared to autosomes, but this could also reflect a lack of power to detect the 4 5 6 signal when the total number of genes is reduced. Similarly, ω α , the rate of adaptive substitution relative to the rate of neutral 4 6 3 divergence, is significantly higher for Z genes (ω

Female ovary-biased and gut-biased genes 4 8 4
Next we explored the expression of genes in female reproductive tissue. Overall 4 8 5 there were a greater number of genes with gut-biased expression (#Gut Auto =153) 4 8 6 than ovary biased expression (#Ovary Auto =40) in the autosomes. However, there 4 8 7 was an over-representation of Z ovary expressed genes than expected by 4 8 8 chance (#Gut Z =6; #Ovary Z =6; chi-square test; P < 0.05). However, the number of 4 8 9 genes in each category is relatively small so these tests should be treated with 4 9 0 caution. 4 9 1 Of the 205 differentially expressed genes between the two tissues only 9 in the 4 9 2 ovaries and 26 in the gut could be used to calculate dN/dS, π n /π s and α . The 4 9 3 other genes either do not have a 1-1 ortholog with H. erato or there were too 4 9 4 many undetermined characters (gaps or Ns) to be able to estimate the 4 9 5 parameters. Of the 35 genes for which molecular evolution statistics could be 4 9 6 calculated, all 9 ovary biased and 25 of 26 gut biased genes are autosomal; and 4 9 7 1 gut biased gene maps to the Z (Gut dN/dS Z =0.365; Gut π n /π s Z =0.093; Gut α 4 9 8 Z =0.295). We did not detect any significant differences in π n /π s ; dN/dS; or α for 4 9 9 autosomal ovary and gut-biased genes. 2016). In Heliconius, although dN/dS was not significantly different between 5 3 5 autosomal and Z genes, we did find evidence for a faster rate of adaptive 5 3 6 substitution. Interestingly, our data also show that dS on the Z chromosome is 5 3 7 higher than dS on the autosomes perhaps indicating a male biased mutation rate, 5 3 8 as Z chromosomes spend more time in males than in females (Miyata et al., 5 3 9 1987). While hemizygosity is expected to expose beneficial mutations to 5 4 0 selection and increase rates of adaptive evolution on the Z chromosome, it is 5 4 1 also expected to increase the efficacy of purifying selection which would act to 5 4 2 reduce evolutionary rates. It may be that the balance between these two forces 5 4 3 differs across lepidopteran species, leading to the mixed pattern of fast-Z 5 4 4 evolution in some taxa but not others. It is important to add, however, that the consequence of reduced effective population size rather than positive selection. 5 5 4 The difference in effective population size between sex chromosomes and 5 5 5 autosomes in female heterogametic systems is predicted to be larger than in 5 5 6 male heterogametic systems due to higher variance of male reproductive 5 5 7 success (Mank, Nam, et al., 2010). Indeed, we estimate that coding regions on 5 5 8 the Z chromosome have an N e 0.44 times that of autosomes. We might therefore 5 5 9 predict a considerable reduction in the efficacy of purifying selection on butterfly 5 6 0 Z chromosomes. This should lead to higher ω na and π n /π s ratios on the Z 5 6 1 compared due to stronger genetic drift. However, as in satyrine butterflies 5 6 2 (Rousselle et al., 2016), in Heliconius, ω na is not higher on the Z relative to 5 6 3 autosomes. dN/dS and π n /π s are higher in the Z relative to autosomes in 5 6 4 Heliconius, but this is not significant. This means that, in contrast to birds, the 5 6 5 difference in the effective population size of the Z relative to autosomes is not 5 6 6 sufficient to reduce the efficacy of purifying selection at a detectable level. 5 6 7 5 6 8 One possible explanation for this difference is the generally much higher effective 5 6 9 population sizes of Lepidoptera, which could allow for efficient selection even on 5 7 0 sex chromosome (Rousselle et al., 2016). Another is that by not using all 5 7 1 genomic sites to estimate the π sZ /π sA diversity ratio, we might be underestimating 5 7 2 its true value due to a stronger effect of background selection. The latter is estimates of the π sZ /π sA diversity ratio are significantly lower than the expected 5 7 7 0.75 and still there is no observable reduction in the efficacy of purifying selection 5 7 8 in H. melpomene. 5 7 9 Another factor that might counteract the fast-Z effect is adaptation from standing 5 8 0 variation. Larger populations are more polymorphic and therefore have an 5 8 1 increased probability of adaption from standing genetic variation. Adaptation from 5 8 2 standing genetic variation is therefore expected to result in faster-autosome 5 8 3 evolution, independent of the dominance of beneficial alleles (Orr and 5 8 4 Betancourt, 2001), which would counteract the fast-Z effect. This may be 5 8 5 especially relevant when overall population sizes are large, as in many 5 8 6 Heliconius species, such that standing variation becomes a comparatively 5 8 7 important source of adaptive variation as compared to de novo mutations. 5 8 8 As sex biased genes tend to be expressed in sex specific tissue such as the 5 8 9 testis and the ovaries we aimed to investigate patterns of molecular evolution in 5 9 0 ovary biased genes. Unfortunately, there are no ovary-biased genes with 1-1 5 9 1 orthologues between H. melpomene and H. erato that are Z. This means we 5 9 2 could not test the effect of hemizygosity on non-sex specific and sex-specific 5 9 3 female expression directly. The lack of 1-1 orthology may mean that these genes 5 9 4 are rapidly evolving, and indeed autosomal ovary-expressed genes do have 5 9 5 higher rates of adaptive evolution than gut expressed genes. 5 9 6 Together these results illustrate the need to study substitution rates in other ZW 5 9 7 systems considering sex biased expression. This genome-wide analysis of 5 9 8 polymorphism, divergence and gene expression data contributes to a growing 5 9 9 body of literature on sex chromosome evolution in ZW systems, and reveals the 6 0 0 complexity of the different evolutionary forces shaping transcriptome evolution in 6 0 1 Heliconius and, consistent with previous work, shows limited evidence of fast-Z 6 0 2 evolution in this taxon. Faster-X Effects in Two Drosophila Lineages, Genome Biology and 6 1 3 Evolution, Volume 6, Issue 10, 1 October 2014, Pages 2968-2982. 6 1 4 6 1 5 Avila V, Campos JL, Charlesworth B (2015). The effects of sex biased gene 6 1 6 expression and X-linkage on rates of adaptive protein sequence evolution in 6 1 7 Drosophila.  to the log2 scale (DESeq2, rlog(blind=FALSE)) separated by tissue. rlog 1 0 3 2 transformed data minimises differences between samples for rows with 1 0 3 3 small counts and normalizes with respect to library size. A. 45% of the 1 0 3 4 variance is explained by PC1 and PC2. PC1 separates young ovary tissue 1 0 3 5 from old ovary tissue and explains 29% of the variance. All the samples 1 0 3 6 cluster by age. B. 39% of the total variance is explained by PC1 and PC2. 1 0 3 7 PC1 separates young gut tissue from old gut tissue and explains 23% of 1 0 3 8 the variance. The samples cluster less tightly by age than ovary 1 0 3 9 expression.