Comparative transcriptomics across 14 Drosophila species reveals signatures of longevity

Summary Lifespan varies dramatically among species, but the biological basis is not well understood. Previous studies in model organisms revealed the importance of nutrient sensing, mTOR, NAD/sirtuins, and insulin/IGF1 signaling in lifespan control. By studying life‐history traits and transcriptomes of 14 Drosophila species differing more than sixfold in lifespan, we explored expression divergence and identified genes and processes that correlate with longevity. These longevity signatures suggested that longer‐lived flies upregulate fatty acid metabolism, downregulate neuronal system development and activin signaling, and alter dynamics of RNA splicing. Interestingly, these gene expression patterns resembled those of flies under dietary restriction and several other lifespan‐extending interventions, although on the individual gene level, there was no significant overlap with genes previously reported to have lifespan‐extension effects. We experimentally tested the lifespan regulation potential of several candidate genes and found no consistent effects, suggesting that individual genes generally do not explain the observed longevity patterns. Instead, it appears that lifespan regulation across species is modulated by complex relationships at the system level represented by global gene expression.

O' Grady, 2007). The entire genus Drosophila contains over 2,000 described species that occupy diverse ecological niches such as forests, deserts, and cosmopolitan areas (Markow & O'Grady, 2005;Schnebel & Grossfield, 1983). With the completion of full-genome sequences of 12 Drosophila species (Clark et al., 2007), researchers are able to explore various aspects of their biology in much greater depth. Examination across multiple evolutionarily related lineages can reveal insights on the unique biology of flies, as well as new themes and biological mechanisms that apply across diverse life forms.
Lifespan, weight, time to maturity, and other life-history traits naturally differ across various Drosophila species as the result of millions of years of natural selection, drift, and adaptation. Since the divergence from a common ancestor, Drosophila lifespan has increased along certain lineages but decreased in others (Schnebel & Grossfield, 1983), indicating that longevity can be modulated in both directions on the evolutionary timescale. The heritability and stability of species lifespan across generations indicate a genetic basis for the longevity determinant(s). By identifying the genes whose expression levels correlate with lifespan across the Drosophila lineage, one may obtain clues about the pathways involved and ultimately the mechanisms through which nature modulates longevity. Such gene expression patterns may also be compared with known lifespanextension strategies to identify commonality in their effects.
Here, we used 14 Drosophila species spanning five taxonomical groups and examined their lifespan, body mass, development time, and gene expression profiles. We explored the relationships among various life-history traits, identified the pathways that diverged significantly across these species, and observed the role of stabilizing selection in gene expression variation. We also identified the genes and pathways with significant positive or negative correlation to longevity (false discovery rate (FDR) < 0.05), after taking into account the influence of phylogeny and body mass differences.
Finally, we analyzed our list of genes against previously published lifespan-extension data and experimentally tested a number of candidates for longevity effects, offering various insights into regulation of longevity.

| Life-history traits in Drosophila
The 14 species surveyed in this study fall into two subgenera (Drosophila and Sophophora; Figure 1a and Table S1). Subgenus Drosophila is represented by D. virilis and D. mojavensis (together forming the virilis-repleta radiation; Markow & O'Grady, 2007), while subgenus  Figure S1 and Table S1). The mean lifespans of the remaining species ranged between 20 and 40 days. They are shorter than the lifespans of flies kept under 20°C (Schnebel & Grossfield, 1983; Table S1). The more rapid aging in flies under higher temperatures was also observed previously (Miquel, Lundgren, Bensch & Atlan, 1976). Within each species, the female flies were generally larger in size than the male flies (Table S1). Furthermore, we observed positive correlation between their body weights and median lifespans on log-scale (p-value = 4.66 9 10 À3 for male and 1.17 9 10 À2 for female), although the relationship was partly driven by D. virilis (excluding D. virilis: p-value = 8.01 9 10 À3 for male and 0.22 for female). The long-lived D. virilis also had the longest developmental time (18 days), whereas the other species took~10-12 days (Table S1).

| Gene expression variation reflects evolutionary relationships
To measure gene expression, we collected young adult male flies of each species for whole-body RNA extraction and sequencing. After normalization by library sizes and removal of genes expressed at low levels (Section 4), log-RPKM values were calculated for 6,510 gene ortholog sets. Overall, the expression profiles were similar to one another, with Spearman correlation coefficients of species pairs ranging between 0.68 and 0.90. Our expression data were also consistent with the previous studies of gene expression across multiple Drosophila species (Chen et al., 2014;Zhang, Sturgill, Parisi, Kumar & Oliver, 2007; Figure S2a).
To determine whether the evolutionary relationship of these species was reflected in their expression variation, we constructed gene expression phylograms using a distance matrix of 1 minus Spearman correlation coefficients and the neighbor-joining method (Brawand et al., 2011;Clark et al., 2007). The resulting topology was largely consistent with their phylogeny (Figure 2a). For example, there was a clear separation between subgenera Drosophila and Sophophora; all nine species of the group melanogaster fell under a single clade; and the two species of the saltans group clustered with each other. Most of the nodes received strong support in 1,000-time bootstrapping.
However, D. biarmipes was placed within the subgroup melanogaster, which might reflect divergence between genetic distance and expression distance. Other methods also produced very similar results ( Figure S2b-d).
Principal component analysis also showed that the species clustered according to their lineages (Figure 2b), with the first three principal components (PCs) accounting for~45% of the total variance. To understand the basis of the segregation pattern, we performed pathway enrichment analysis using DAVID ) on the genes with top 5% contribution to each PC (~300 genes each; Table S2, Figure S2e). In PC1, the top three functional clusters were ribosome/ribosomal proteins; ATP/ nucleotide binding; and mitochondrial electron transport chain and oxidative phosphorylation (Table S2A). Separation along PC2 was largely due to differences in transcription regulation and transcription factor binding (Table S2B), whereas PC3 was related to glycine, serine, and threonine metabolism (Table S2C). In other words, these are the processes that diverge most significantly across the 14 species (without explicitly considering their lifespan differences) and may account for their phenotypic diversities. It should be noted, however, that the underlying phylogenetic structure of the data may render the PCA biased. There is currently no consensus for a multivariate phylogenetic comparative method (Uyeda, Caetano & Pennell, 2015), so the results need to be interpreted with caution.

| Expression variation is best described by stabilizing selection
In the absence of selective pressure, the variation between a pair of species is expected to increase linearly with divergence time and can be modeled by a Brownian motion (BM) process (Felsenstein, 1985); this has been observed for transcription of many genes in mammals (especially among primates; Brawand et al., 2011;Khaitovich et al., 2004). On the other hand, previous studies based on genomes and transcriptomes observed that a large fraction of the genes in Drosophila were likely subjected to stabilizing selection (Bedford & Hartl, 2009;Clark et al., 2007;Kalinka et al., 2010;Rifkin, Kim & White, 2003), such that the increase in gene expression divergence eventually reaches a plateau (Bedford & Hartl, 2009) and may be better described by Ornstein-Uhlenbeck (OU) process (Butler & King, 2004;Martins & Hansen, 1997).
In agreement, we observed a nonlinear relationship with a plateau phase when plotting the average expression variances of the   (Table S1). Error bars indicate 95% confidence intervals (C.I.) by Kaplan-Meier method 63 Mya, Gao, Hu, Toda, Katoh & Tamura, 2011;and 25-40 Mya, Obbard et al., 2012;40 Mya, Russo, Takezaki & Nei, 1995). However, the same plateau feature could still be observed using these other estimates, suggesting it was underpinned mainly by the expression variances among the species pairs (i.e. the vertical axis of Figure 2c). For comparison, a similar analysis was performed using the brain and liver data of nine mammalian species (Brawand et al., 2011), but the plateau feature was not as strong (Figure S3b,c). Since Drosophila diverged more recently than the mammals, it was surprising to see the plateau feature in the former but not the later. Indeed, when amino acid substitutions per site between species pairs were plotted against divergence time, Drosophila produced a much steeper slope than the mammals (Figure 2d). It was previously reported that the substitution rate at silent sites in nuclear genes in Drosophila was about three times higher than that in mammals (Sharp & Li, 1989). This likely reflects the notion that the evolutionary divergence covered by the genus Drosophila equals or exceeds that of the entire mammalian radiation, probably due to the short generation time of fruit flies (Clark et al., 2007;Stark et al., 2007).
To examine the evolutionary models at the individual gene level, phylogenetic signals were measured using two metrics, Pagel's k (Pagel, 1999) and Blomberg's K (Blomberg, Garland & Ives, 2003).
These metrics are usually high for genes that follow BM model, but can be weakened by processes such as stabilizing selection. We found that the phylogenetic signals were low for many genes ( In addition, when we compared the goodness of fit of individual gene expression under BM model against OU models with up to three optima (Butler & King, 2004;Kalinka et al., 2010), we found over 85% of the genes fitted better with one of the OU models than with the BM model ( Figure S3e,f), similar to the percentage previously observed (Kalinka et al., 2010). Together, these data suggest that stabilizing selection likely plays an important role in influencing the gene expression patterns in Drosophila.
F I G U R E 2 Gene expression reflects evolutionary relationships. (a) Phylogram constructed based on gene expression. Drosophila virilis was used as outgroup. Reliability of the branching pattern was assessed by 1,000-time bootstrap across the genes (bootstrap values next to the nodes; green: ≥0.9; yellow: 0.6-0.9). (b) Principal component analysis. Proportion of variance explained by each principal component (PC) is indicated in parentheses. (c) Gene expression divergence reaches a plateau. Each triangle represents a pair of species. The red curve represents the best-fit line based on the model previously described (Bedford & Hartl, 2009), with the following parameters: selection parameter a = 0.0673 (95% C.I.: 0.0639-0.0721); drift parameter r 2 = 0.142 (95% C.I.: 0.136-0.152; Section 4). Orange curves represent the best-fit lines when each individual species was removed, one at a time (a ranged between 0.0530 and 0.0867). (d) Amino acid substitutions per site increase faster in Drosophila than in mammals. Amino acid substitutions per site between species pairs were calculated based on concatenated, gap-free alignment of orthologs

| Gene expression correlating with longevity
To identify the genes that correlate with species longevity, the phylogenetic generalized least-squares approach was employed to adjust for the evolutionary relationship (Felsenstein, 1985;Freckleton, Harvey & Pagel, 2002;Grafen, 1989;Martins & Garland, 1991). Regression was performed between expression values and male median lifespan ("ML"), different models of trait evolution were tested, and the best-fit model was then selected based on maximal likelihood (Table S3). Given that D. virilis was much larger in size and longerlived than the other species, we also performed regression after excluding the D. virilis data (Section 4). We found 384 out of the 6,510 genes with significant regression slope at p-value < .05, 93 of which with p-value < .01. Even after excluding the data point outlier representing longest living species (D. virilis),~60% of the genes remained statistically significant (i.e., p-value < .05; Table S3). However, after adjusting for multiple testing, none of these correlations reached statistical significance (median FDR = 0.79; further discussed below).
Pathway enrichment analysis was performed using DAVID separately for those with positive and negative correlation. Among those with positive correlation with ML, the top terms were related to lipid synthesis and metabolism, including "oxidation-reduction process," "lipid particle," "biosynthesis of antibiotics," "peroxisome" (these four pathways all had FDR < 0.05); and to a weaker extent, "fatty acid biosynthetic process" and "fatty acid metabolism" (Table S3). For negative correlation with ML, the enriched terms included "plasma membrane," "adult behavior," "protein tyrosine kinase activity," "visual behavior," and "neurogenesis," although they did not pass FDR < 0.05 (Table S3).
In addition, we visualized protein-protein interactions among our top hits using STRING version 10.5 (Szklarczyk et al., 2017) and found the network was significantly enriched in interactions (p-value = 9.75 9 10 À4 ; Figures 3 and S4). We also generated 1,000 sets of random genes (each set consisting of the same number of genes as our top hits) and confirmed that the protein-protein interactions in our top hits network had FDR < 0.001.
Among the genes positively correlating with lifespan, those found in lipid metabolic processes or lipid particles clustered together ( Figure S4 for detailed labeling overexpressing genes involved in beta-oxidation were longer-lived and more resistant to oxidative stress induced by paraquat treatment (Lee, Lee, Paik & Min, 2012), while knockout of Thiolase significantly shortened lifespan (Kishita, Tsuda & Aigaki, 2012).
In addition, some of the top genes were previously found to influence longevity in flies. For example, cyclin-dependent kinase Cdk5 requires an activating subunit (p35) for its full biological function, and flies with p35 mutation had significantly shortened lifespan and age-dependent loss of motor function (Connell-Crowley, Vo, Luke & Giniger, 2007). Another gene was Gclm, which codes for the modulatory subunit of glutamate-cysteine ligase (GCL), the rate-limiting enzyme in de novo glutathione biosynthesis. Global overexpression of GCLm in flies extended the mean lifespan by 24%, and neuronal overexpression of the catalytic subunit (GCLc) increased mean and maximum lifespans by up to 50% (Orr et al., 2005).
Many genes negatively correlating with lifespan were involved in nervous system development ( Figure 3). Among them were several cell surface receptors and signaling molecules, such as Fas2, which codes for a cell adhesion protein Fasciclin 2 that interacts with semaphorin (Smad) and connectin to regulate axon fasciculation (Yu, Huang & Kolodkin, 2000); babo, which codes for a type I activin receptor and regulates cell proliferation by stimulating Smad2-dependent pathways (Brummel et al., 1999); and shark, which codes for SH2 ankyrin repeat tyrosine-protein kinase and is required for dorsal closure during development (Fernandez et al., 2000). Also present were component of gap junction (ogre) and TGF-beta receptor (tkv). Downregulation of activin signaling by forkhead transcription factor (FOXO) in muscle tissues of flies has been shown to improve muscle performance, reduce secretion of insulin peptides from brain, and extend lifespan (Bai, Kang, Hernandez & Tatar, 2013). Flies with babo knockdown in muscle lived about 20% longer than wild-type, and pathway analysis of FOXO gene targets revealed enrichment of processes involved in postembryonic development, neuron differentiation, axonogenesis, and regulation of transcription and growth (Bai et al., 2013), similar to the terms we observed here (Table S3).
Another cluster included several genes affecting RNA polyadenylation (cleavage and polyadenylation specificity factor Cpsf160 and Cpsf73) and RNA splicing (e.g., CG6841, CG10333, CG6686, Ars2, CG7564, and l(2)35Df; St Pierre et al., 2014). Many of them were also involved in the nervous system development, as alternative splicing has very important roles in modulating neuronal maturation and functions (Li, Lee & Black, 2007).

| Experimental verification by in vivo RNAi knockdown
The Transgenic RNAi Project (TRiP) has generated and made publicly available a large number of transgenic RNAi fly stocks (Ni et al., 2008). In this resource, the transgenic shRNA is placed under UAS/ and Aut1 hairpin 34,359 increased lifespan in male flies (no effect on females; Figure 4, and Clk hairpin 31,661 in males) led to lifespan reduction, and the other cases showed no effect (Figure 4, Table S4). Differences in gene dosage effects together with gender-specific attributes may have influenced the outcomes. Overall, we observed no consistent effects of manipulation of candidate genes with longevity. The data appear to suggest that the global regulatory programs, rather than changes in the expression of individual genes representing the longevity signature, are responsible for lifespan determination. However, since the transcriptome data capture only a snapshot of the crossspecies gene expression variations which are also influenced by the evolutionary history and environmental factors, further experiments will be needed to validate our findings.

| CONCLUSION S
Fruit flies have contributed significantly to our understanding of genetics and developmental biology and remain a vital tool in the studies on aging and longevity. While most experiments in the aging field have been conducted using single species, we hypothesized that the comparative analysis of closely related species with varying natural lifespan may offer unbiased information on longevity mechanisms. By studying life-history traits and transcriptomes of 14 Drosophila species, we identified the pathways that diverge across these species and confirmed the role of stabilizing selection in influencing their expression patterns. By identifying the genes that correlate with longevity across these species, we found that longer-lived species upregulate genes involved in lipid metabolism and downregulate those involved in neuronal system development and activin signaling.
The dynamics of RNA polyadenylation and splicing also differed across the species.
We noted that none of the genes reached statistical significance after multiple testing correction in phylogenetic regression, suggesting that the expression patterns of many genes may not vary in a longevity-dependent manner. This is perhaps expected, as longevity is only one of the aspects underlying the gene expression differences across these species. The statistical and predictive power was also constrained by the relatively small number of species in the current study. Furthermore, since the transcriptomic data were based on the whole body instead of specific tissues, the signals might be diluted due to gene expression heterogeneity across different tissues. Nevertheless, several enriched pathways showed FDR < 0.05 (Table S3) and the top hits were significantly enriched in proteinprotein interaction (FDR < 0.001, Figure 3), suggesting that even though the genes may not reach statistical significance individually, they can still act together and produce significant changes in gene expression patterns.
For those genes with significant correlation to longevity, do they have the potential to influence lifespan individually or serve as biomarkers of longevity as a group? It appears that while some of these genes might have causative link to longevity, many others were likely mere "passengers" (their correlation might be the result,  (Table 1). However, on the individual gene level, we did not observe significant overlap with the genes previously shown to increase lifespan in other model organisms ( Table 2). The results of the in vivo shRNA knockdown analysis suggested many of the hairpins did not affect lifespan or only showed the effect in one gender. Gene dosage may have played a role too, for example, some genes previously reported to influence lifespan [e.g., Cdk5 in flies (Connell-Crowley et al., 2007) and Clock (ortholog of Clk) in mice (Dubrovsky, Samsa & Kondratov, 2010)] did not show the same effect upon shRNA knockdown. The longevity effects we observed for Gdi and rab5 were consistent in both genders and agree with the prior findings in worms (Samuelson, Carr & Ruvkun, 2007;Tacutu et al., 2012). Also, tkv and PH4al-phaEFB emerged as two candidates for lifespan extension in female flies ( Figure 4).
The lack of consistent effects of knockdown on lifespan may be due to a number of reasons. Since the cross-species gene expression is also influenced by evolutionary history and environmental factors, and our transcriptome data were collected from young healthy flies, the cross-species variation we observed may be confounded by differences in physiology, metabolism, and reproduction. As the lifespan of different Drosophila species may have different sensitivity to temperature, this may introduce further variation in the data. Indeed, several longevity studies on D. melanogaster at 25°C reported mean lifespans that were longer than we observed here: 42 days (Simon, Shih, Mack & Benzer, 2003); 55-65 days (Hercus, Loeschcke & Rattan, 2003); 59-60 days (Promislow & Haselkorn, 2002); 27-36 days on baker's yeast; and 71 days on brewer's yeast (Bass, Weinkove, Houthoofd, Gems & Partridge, 2007). This suggests that other factors such as diet might also have impacted our measured lifespans.
The transcriptome provides only a snapshot of the expression landscape during early adulthood, so it may not completely reflect the dynamics of gene expression across the full lifespan of the species. Furthermore, given the high expression heterogeneity across different tissues, our whole-body transcriptomics may be less powerful in identifying the true longevity determinants due to the weaken signals. Similarly, our whole-body knockdown strategy (instead of a tissue-specific knockdown) could have masked some of true effects of candidate genes, and our UAS/GAL4 RNAi system faces some technical drawbacks such as driver leakage, mosaicisms across organs and tissues, threshold effects of essential genes, and importantly, off-target activity. Further validation will be needed to substantiate our findings here. The data were compiled primarily based on GenAge and GenDR databases and were further supplemented with literature searches. The "matching direction" column indicates whether longevity effects are in the same or opposite direction of correlation in our top list.
Overall, the results suggest that the natural variation of lifespan across closely related species may share some signatures with lifespan modulation by dietary restriction or other interventions.
Although the individual candidate genes were impacted by low statistical power and relatively high numbers of false positives, they can act together to produce significant lifespan-associated expression patterns. It appears that lifespan regulation may depend more on the overall state of the cell as represented by its gene expression, rather than on perturbation of individual genes that correlate with lifespan.
Our data may serve as a starting point for further experimental analysis of gene expression states and longevity interventions that mimic natural changes in lifespan.

| Fly stocks and husbandry
The 14  For the lifespan assays, the newly emerged male and virgin female flies were collected within 18 h at 18°C (lower temperature to reduce first mating) and transferred to fresh corn meal food at density of 35 animals per vial for 2-3 days at 25°C. The flies were then allowed to mate for 1-2 days, collected using CO 2 , sorted by sex, and transferred to the specially designed cages in temperaturecontrolled incubator at 25°C ( Figure S1). The experimental flies were held on the designed diet and transferred to fresh food vials without anesthesia every 3 days. Dead flies were removed by aspiration and counted. Three biological replicate cages were used per gender per species, and the detailed life tables are shown in Table S1. Survival analyses were performed using R package "survival" and Kaplan-Meier method (Kaplan & Meier, 1958;Therneau, 2014).

| RNA sequencing
Since the gene expression patterns in female flies may have greater variation during egg-laying and reproduction, we selected male flies for RNA-seq analysis. Three-day-old male flies were placed in vials F I G U R E 4 Longevity effect of RNAi knockdown of selected genes. Each gene was targeted by one or more hairpins (identified by the stock numbers of the Transgenic RNAi Project in the parentheses). Black dots and red dots refer to the mean lifespan of flies treated with vehicle or with RU486 (inducing shRNA), respectively. The error bars denote standard error. Those with significant lifespan effects (G-rho family tests p-value < .01 and hazard ratio > 1.5) are boxed in green (if the longevity effect was the expected direction) or in blue (if the longevity effect was the opposite direction) and their p-values indicated. See Table S4 for more details on the corn meal diet for 12 days, with three replica vials for each species. Fresh food was supplied every 3 days, and dead flies were removed by aspiration. After 12 days, the flies were subject to total RNA extraction.

| Ortholog set identification
The ortholog sets across the species were identified by reciprocal best hits in BLAST. Briefly, we downloaded the genomes and annotation files for the species from Ensembl and NCBI (Table S1) and extracted all the coding sequences ("Species CDS"). The genomes of D. austrosaltans and D. saltans were based on our unpublished data.
As reference, we extracted the longest open-reading frame for each gene in D. melanogaster ("Dmel ORF"), after excluding those with multiple paralogs (i.e., genes with over 80% identity over 70% of length) or highly repetitive sequences. Mega BLAST (Morgulis et al., 2008) was performed to obtain reciprocal best hits between Dmel ORF and Species CDS. An ortholog set was declared if the orthologs were present in all 14 species. We confirmed our list had over 90% overlap with the curated ortholog list on Flybase (St Pierre et al., 2014; which covered nine of our species). To improve the quality of ortholog sequences, Trinity (Grabherr et al., 2011) was also used to de novo assemble the transcriptomes from the RNA-seq data (Table S1). Poorly annotated Species CDS (e.g., those without proper start or stop codons) were replaced by Trinity (Grabherr et al., 2011) transcripts where applicable, and we ensured that the final list of orthologs contained at least 20% conserved blocks in the multiple sequence alignment. The final ortholog sequences were mapped back to their respective genomes with GMAP (Wu & Watanabe, 2005) to generate customized GFF (General Feature Format) files.

| Data processing
RNA-seq reads alignment was performed using TopHat (Trapnell, Pachter & Salzberg, 2009), and read counting was performed using featureCounts (Liao, Smyth & Shi, 2014). Those ortholog sets with low expression (i.e., <3 counts in three or more species) were removed, and the counts were normalized by total library sizes with We compared our expression data with two previous studies of gene expression across multiple Drosophila species (Chen et al., 2014;Zhang et al., 2007)-GSE99574 is based on RNA-seq and GSE6640 is based on microarrays (herein referred to as "reference studies"). For simplicity, we used only "male replicate 1" in each ref- erence study for whole-body expression. The following steps were applied to each reference study: (i) extract the species and the gene orthologs common to our study and the reference study; (ii) quantile normalization; (iii) for each gene, rank the expression across the samples within our study and within the reference study; (iv) compute a pairwise distance matrix of average absolute rank differences for all sample pairs; (v) perform hierarchical clustering using average linkage.

| Divergence time and phylogram
The Drosophila species divergence time was estimated based on previous estimates (Russo et al., 2013) and the ortholog amino acid sequences. Briefly, the ortholog sequences were aligned with Clustal Omega v1.2.0 (Sievers et al., 2011) and concatenated gapfree with Gblocks v0.91 (Castresana, 2000). The tree was constructed using neighbor-joining method (Saitou & Nei, 1987) in Mega 6.06 (Tamura, Stecher, Peterson, Filipski & Kumar, 2013) and calibrated using the estimates of divergence time in the literature. The Drosophila expression phylogram was based on a distance matrix of 1 minus Spearman correlation coefficient and constructed by neighbor-joining method using R package "ape" (Paradis, Claude & Strimmer, 2004). Reliability of the branching pattern was assessed by a 1,000-time bootstrap across the genes, using "boot.phylo" function of R package "ape" and random sampling of the expression set with replacement. For the mammalian dataset (Brawand et al., 2011), the amino acid sequence alignments of eight mammalian species (human, gorilla, chimpanzee, orangutan, macaque, mouse, opossum, and platypus) were extracted from the 46-way multiple alignment in UCSC genome browser (Kuhn, Haussler & Kent, 2013). The species divergence time was based on TimeTree database (Hedges, Dudley & Kumar, 2006).

| Expression divergence
The expression divergence was measured as average expression variance in the standardized expression values across all the ortholog sets between the species pairs. The points were fitted by the model

| Mode of evolution of individual genes
The phylogenetic signals (Pagel's k and Blomberg's K) were calculated using R package "phytools" (Revell, 2012). To model the gene evolution, individual genes were fitted to a BM model (null hypothesis) and an OU model with one to three optima (alternative hypothesis), using R package "ouch" (King & Butler, 2009
Briefly, the null model assumes 0 covariance between the species (i.e., independent evolutionary trajectory). The BM model assumes the amount of changes in a trait is proportional to time, and the residual covariance between a pair of species is proportional to the shared branch lengths (i.e., shared evolutionary trajectory). Both the Lambda and OU models are variations of the BM model: The Lambda model scales the covariance by a factor lambda ranging between 0 and 1, where the OU model includes a selection strength parameter and an optimal trait value, such that the trait values of the species will be pulled toward the optimal trait value. For the Lambda and OU models, the parameters were estimated simultaneously with the coefficients using maximum likelihood. The best-fit model was selected based on maximum likelihood. The strength of correlation was based on the p-value of regression slope.

| Comparison with GenAge/GenDR database and microarray data
Our gene list was examined against GenAge/GenDR databases (de Magalhães et al., 2009;Plank et al., 2012) to determine how the longevity effects reported in model organisms relate to our gene dataset. To analyze microarray datasets, data were downloaded from Gene Expression Omnibus (GEO) database. We used the search term "Drosophila melanogaster AND dietary restriction" and study type "Expression profiling by array" to identify nine datasets initially. Among them, we excluded three time-course studies, one study focusing on expression in fat body, and one study where the RNA was preprocessed by sucrose gradient separation. For the remaining datasets (GSE37537, GSE26724, GSE48145, GSE26726), relevant comparisons of treatment versus control were selected and differentially expressed (DE) genes were identified using R package "limma" (Smyth, 2005). These DE genes were then compared with our list of top hits to determine if the direction of correlation was consistent.
Two methods were employed to calculate p-values. The first relied on binomial distribution, counting the number of match (i.e. same direction of correlation) and the number of mismatch (i.e., opposite direction of correlation), assuming equal probability of obtaining a match and a mismatch by chance. The second method relied on simulation, where the direction of correlation in our top list was sampled without replacement and compared with microarray experiments to calculate p-values (by binomial distribution); this procedure was repeated 1,000 times to generate an empirical distribution. The original p-value was then compared to the empirical distribution. Both methods produced very similar results.

S U P P O R T I N G I N F O R M A T I O N
Additional Supporting Information may be found online in the supporting information tab for this article.