Great Lakes Institute for Environmental Research, University of Windsor, Windsor, ON, Canada
Department of Biological Sciences, University of Windsor, Windsor, ON, Canada
Correspondence: Daniel D. Heath, Great Lakes Institute for Environmental Research, University of Windsor, 401 Sunset Ave., Windsor, ON N9B 3P4, Canada. Tel.: +1 519 253 3000 ext. 3762; fax: +1 519 971 3613; e-mail: firstname.lastname@example.org
Fine-scale population structure has been widely described for salmonid populations using neutral genetic markers, but whether that structure reflects adaptive differences among the populations remains of interest to evolutionary biologists and conservation managers alike. The use of transcriptomics to quantify population differences in genetically controlled functional gene expression traits holds promise for investigating this divergence associated with possible local adaptation. We use custom microarrays to characterize population divergence in transcription at functionally relevant (metabolic and immune function) genes among tributary populations of rainbow trout from Babine Lake, BC and compare it to neutral divergence estimated from microsatellite markers. Transcriptional divergence (PST) was determined at resting state and in response to metabolic and immune challenges, two major sources of mortality and thus selective forces on juvenile salmonids. Results indicate that the majority of selected genes [56 genes (65%), 64 genes (63%) and 38 genes (78%) under control, temperature and immune challenges respectively] show transcriptional divergence (PST > FST) that is consistent with the action of divergent selection. Patterns of pairwise PST among populations are inconsistent with evolution by drift. In general, it appears that the magnitude and pattern of population divergence in transcription reflect the action of natural selection and identify selection on transcription as a mechanism for local adaptation. These results reinforce the need to conserve salmonids on a tributary basis and provide insight into genetic mechanisms that facilitate local adaptation.
Genetic population structure arises when gene flow is reduced and populations begin to evolve independent of one another. The two major forces that drive divergence among populations are genetic drift and natural selection. Neutral markers (e.g. microsatellite DNA) have been used to define population structure, and thus evolutionarily significant units, for species from a broad taxonomic range (e.g. plants – Bottin et al., 2007; Salas-Leiva et al., 2009; invertebrates – Sekino & Hara, 2007; amphibians – Chiari et al., 2006; reptiles – Mockford et al., 2007; fish – Beacham et al., 2004; Winans et al., 2004). The goal of such studies is to infer adaptive genetic variation within and among populations and advocate conservation actions based on these genetic groupings (Fraser & Bernatchez, 2001). The development of highly variable markers and an increase in throughput ability has allowed population structure to be defined at increasingly fine scales (e.g. Pearse et al., 2007; Narum et al., 2008; Wellband et al., 2012); however, the degree to which population structure reflects adaptive differences among populations remains largely unaddressed. Indeed, the existence of population structure may allow, or reflect, the evolution of local adaptations by natural selection, but it does not indicate that it has occurred (Kawecki & Ebert, 2004). A critical understanding of the processes governing population divergence requires knowledge of the role selection plays in determining divergence among populations at functional loci.
Pacific salmonids exhibit an astonishing degree of variation for many life-history traits (Groot & Margolis, 1991; Quinn, 2005), which makes them an ideal system for exploring the evolutionary basis of population divergence. This life-history variation is believed to be the result of selection for traits that maximize fitness for individuals spawning in specific rivers or locations within rivers. This process, known as local adaptation, is often invoked to explain differences in life-history traits, for example timing of spawning runs, juvenile rearing strategy, morphology and developmental rates (reviewed by Taylor, 1991; Garcia de Leaniz et al., 2007). Despite their long-distance migrations, adult salmon have specific homing behaviours (Quinn, 2005) that facilitate the formation of local populations and limit gene flow among populations. Population structure within a species across heterogeneous environments provides some of the essential conditions necessary for local adaptations to develop (Kawecki & Ebert, 2004). The evolution of structured salmonid populations results from a complex interaction of gene flow (low rates of straying by spawning adults), genetic drift (small effective population sizes) and forces of selection (heterogeneous environments across species' ranges). Local adaptation appears to be context-, trait- and population-specific (Fraser et al., 2011), and thus, salmon are an ideal study system for investigating the relative roles of drift and selection in the determination of population structure.
The comparison of variation within and among populations using neutral traits (drift) vs. functional traits (drift and selection) is a powerful approach for providing evidence of local adaptation (Whitlock, 2008) as well as quantifying the relative roles of evolutionary forces in population divergence. Divergence estimates based on additive genetic variance (QST) or total phenotypic variance (PST) for functional traits can be directly compared with neutral genetic divergence (FST) and inferences can be made about the magnitude and mode of selection affecting those traits (Whitlock, 2008). In salmonids for example, comparisons of QST and FST have been used to infer diversifying selection acting on growth-related traits in Coho salmon, Oncorhynchus kisutch (McClelland & Naish, 2007) and growth and survival traits in grayling, Thymallus thymallus (Koskinen et al., 2002). A promising new approach to the study of functional population divergence is the use of transcriptomics. Population-specific gene transcription profiles have been demonstrated for a variety of fish species including Atlantic salmon, Salmo salar (Tymchuk et al., 2010) and killifish, Fundulus heteroclitus (Whitehead & Crawford, 2006). QST has also been used on gene transcription data to examine functional divergence among populations of Atlantic salmon, S. salar (Roberge et al., 2007) and steelhead trout, Oncorhynchus mykiss (Aykanat et al., 2011). This approach has great potential for dissecting the relative contribution of drift and selection to population divergence for known-function gene loci.
Gene transcription at candidate loci has been implicated as a mechanism facilitating the local adaptation of rainbow trout populations from Babine Lake (Wellband & Heath, 2013). Here, we present the results of a study testing the relative contributions of drift and selection to transcriptional divergence among juvenile rainbow trout populations from Babine Lake tributaries. We use custom microarrays to assay transcription at functionally relevant (metabolic and immune) genes and compare the levels of transcriptional divergence to neutral genetic divergence estimated using microsatellite markers. We show that despite the influence of drift on population divergence at functional traits, selection also plays an important role in explaining population divergence. This study thus provides evidence for a substantial role of transcriptional variation in the evolution of local adaptation in salmonids.
Materials and methods
Sampling sites and protocol
Freshwater-resident rainbow trout (O. mykiss) spend their life as adults in the body of Babine Lake, British Columbia but return to spawn in the lake's tributary streams and rivers during the spring. As a result, populations of rainbow trout are genetically structured based on microsatellite population genetic analyses, indicating reduced gene flow amongst otherwise geographically proximate and physically connected populations (global FST: 0.053, pairwise range: 0–0.103, Koehler, 2010). The physical attributes of tributary streams to Babine Lake vary from large systems that are stable year-round to small dynamic systems that experience high flows during spring freshets and low or negligible flows during late summer and fall (Bustard, 1989). Juvenile rainbow trout spend up to 3 years rearing in these tributary environments before descending to the lake (Bustard, 1989). During this time, juvenile trout experience high levels of mortality consistent with other salmonids (> 90%, Quinn, 2005). Given that populations experience reduced gene flow, local environmental conditions vary and juveniles experience high mortality, it is reasonable to predict that some level of local adaptation is present among Babine tributary populations of rainbow trout. We sampled six tributaries of Babine Lake (Fig. 1) known to have rainbow trout spawning populations (Bustard, 1989). Tributaries were chosen to represent the geographic extent of resident rainbow trout producing tributaries, a variety of stream environmental conditions (Maximum stream temperature: 10.5–14.5 °C, bacterial diversity: Shannon index = 1.78–3.28; Wellband & Heath, 2013), as well as a range of neutral genetic distances among rainbow trout populations (pairwise FST: 0.026–0.094, Koehler, 2010). Approximately 50 young-of-the-year rainbow trout were collected between 23 August 2010 and 25 August 2010 from each tributary by dip netting and electroshocking (Smith-Root BP-15 backpack shocker, Vancouver, WA, USA). 
Fish were transferred to the Department of Fisheries and Oceans' Fulton River Spawning Channel facility (2–6 h travel time) on ice in heavy plastic bags (60 × 120 cm) containing ambient water charged with oxygen from the collection tributary. Tributary populations were held in separate cages in common conditions in a 3 m round tank with water flow-through from Fulton Lake (15 ± 0.5 °C). Fish were held for a minimum of 5 days under starvation to acclimate to captive conditions and recover from any stress related to capture and transportation. Several fish from Tachek Creek died after transfer; however, the presence of many dead fish at the Tachek Creek sampling location indicated that this population was likely acutely stressed (likely due to low water flows) at the time of sampling.
We subjected a subset of fish from each population to: (i) a temperature stress challenge and (ii) an immune challenge (Tachek Creek was excluded from the immune challenge due to insufficient samples). These challenges allowed us to evaluate the generality of transcriptional divergence among populations experiencing different physiological states. Pathogen-mediated mortality is known to be an important selective force on young-of-the-year salmonids (e.g. de Eyto et al., 2011), and temperature is a major stressor for juvenile fish that has been shown to drive selection on early life-history traits in other salmonids (Hendry et al., 1998; Haugen & Vøllestad, 2000; Jensen et al., 2008). The immune challenge utilized a vaccine for a common salmonid pathogen from marine systems (Vibrio spp.). This pathogen was chosen to be novel to the freshwater-resident populations we studied, eliminating potential differences due to previous exposure (adaptive immune response). The use of this vaccine to elicit a transcriptional response in Oncorhynchus spp. has previously been demonstrated (Aykanat et al., 2012b). Briefly, the immune challenge consisted of a 1-min incubation in a 10% Vibrogen 2 vaccine bath containing formalin-inactivated cultures of Vibrio anguillarum serotypes I and II and Vibrio ordalii (Novartis Animal Health, Mississauga, ON, Canada). The temperature challenge consisted of holding the fish for 1 h in a water bath raised 5 °C above ambient (ambient = 15 ± 0.5 °C). The specific temperature challenge was chosen to be high enough to cause metabolic stress, but not to exceed the thermal maximum for rainbow trout. It represents a realistic temperature challenge that is experienced by at least some of the populations studied (Wellband & Heath, 2013). Fry were returned to a holding cage following the challenge and allowed to recover for 24 h before tissues were sampled.
Control fish (t = 0) were sampled directly from the holding tanks prior to the challenge. These individuals allow us to assess among-population differences in resting state in addition to challenge-induced transcriptional differences. Although our challenged fish experienced handling stress in addition to the challenge, we used a gentle handling procedure not believed to affect gene transcription 24 h after handling. An overdose solution of clove oil (250 ppm) was used to humanely euthanize all fish. Gill tissues were dissected immediately and preserved in RNAlater (Invitrogen, Burlington, ON, Canada) at 4 °C. Samples were frozen at −20 °C within 5 days and stored at that temperature until further analysis. Caudal fin clips were also taken and preserved in 95% ethanol for genotype analysis.
Microsatellite genotype analysis
DNA was extracted from fin clips for 30 individuals from each tributary population using a column-based extraction protocol (Elphinstone et al., 2003). Individuals were genotyped at eight microsatellite loci (Table 1). Polymerase chain reaction (PCR) was used to amplify microsatellites in a 12.5 μL reaction containing 1× reaction buffer [10 mM Tris–HCl (pH 8.3), 50 mM KCl], 2 mM each dNTP, 800 μM dye-labelled forward primer and 2 μM reverse primer, 0.125 U AmpliTaq DNA polymerase (Applied Biosystems, Burlington, ON, Canada) and 0.5 μL of DNA. PCR fragments were analysed using a Li-Cor 4300 DNA analyser (Li-Cor Biosciences, Lincoln, NE, USA) and alleles scored with Gene ImagIR software (Scanalytics Inc., Fairfax, VA, USA). Departures from Hardy–Weinberg equilibrium (HWE) and linkage disequilibrium were tested using 10 000 permutations in GenePop 4.0 (Raymond & Rousset, 1995; Rousset, 2008), and significance was assessed using sequential Bonferroni correction (Rice, 1989). Weir & Cockerham's (1984) unbiased estimator of FST was used to assess population differentiation in MSA 4.0 (Dieringer & Schlötterer, 2003), and significance was tested by bootstrapping 10 000 replicates. The distribution of FST values has been shown to approximate a chi-squared distribution with (n − 1) degrees of freedom, where n is the number of populations (Lewontin & Krakauer, 1973), and this assumption has been shown to be robust for a wide variety of population structure models (Beaumont & Nichols, 1996). Analysing transcriptional differences for multiple genes (as in our microarray experiment) and comparing them to FST increases the risk of falsely identifying genes under selection if significance is evaluated against a 95% confidence interval for FST. To be conservative, we calculated a range of confidence intervals (95%, 99.9% and 99.95%) for FST using the program Fdist2 (Beaumont & Nichols, 1996) and the average heterozygosity for all loci.
Finally, we tested for isolation by distance using a Mantel test to correlate linearized pairwise genetic distance [FST/(1 − FST)] with linear water distance between sites, measured using digital 1 : 250 000 scale topographic maps.
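The isolation-by-distance test described above can be sketched in a few lines. This is a minimal illustration of a permutation-based Mantel test, not the software used in the study; the function names (`linearize`, `mantel`), the number of permutations and the random seed are assumptions introduced here, and the inputs are expected to be square, symmetric distance matrices supplied by the reader.

```python
import random
from math import sqrt


def linearize(fst):
    """Linearized genetic distance, FST / (1 - FST)."""
    return fst / (1.0 - fst)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order=None):
    """Off-diagonal upper-triangle entries, optionally with relabelled sites."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(genetic, geographic, permutations=9999, seed=1):
    """One-tailed Mantel test: returns (r, p) for a positive association."""
    rng = random.Random(seed)
    geo = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo)
    n = len(genetic)
    hits = 1  # the observed labelling counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # relabel sites in the genetic matrix only
        if pearson(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)
```

Permuting site labels, rather than individual matrix cells, preserves the dependence among pairwise distances that share a site, which is why the Mantel test is used here instead of an ordinary correlation test.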
Table 1. Microsatellite marker loci used to determine neutral genetic population structure among six Babine Lake tributary populations of rainbow trout. FST, Weir & Cockerham (1984) θ; Tm, annealing temperature for PCRs.
We used a functionally annotated custom probe set (367 probes, each 45 bp long) developed for use in salmonids (http://www.uwindsor.ca/glier/system/files/ChinookMicroarray.txt). The genes in this probe set were chosen to represent major metabolic pathways, genes involved in both innate and adaptive immune responses, xenobiotic processing and cell structure, as well as genes widely used as endogenous controls in quantitative PCR studies. Probes were spotted onto poly-L-lysine-coated slides (Thermo Scientific, Waltham, MA, USA) using a SpotArray 24 microarray printing system (Perkin Elmer, Waltham, MA, USA) equipped with Stealth Micro spotting pins (ArrayIt, Sunnyvale, CA, USA). Gene probes were printed in triplicate within each array, and the array itself was replicated three times per slide. Following printing, probes were cross-linked to the slide with UV light and blocked with succinic anhydride following Massimi et al. (2002).
RNA extraction and cDNA synthesis
RNA was extracted from one whole gill arch for four individuals per population per treatment. We chose gill tissue because of its metabolic importance as the primary site of gas and ion exchange, its direct exposure to the environment and its important role in both adaptive and innate immune responses (e.g. Olsen et al., 2011). Gill tissue was homogenized in 0.8 mL of TRIzol (Invitrogen) and total RNA extracted following the method of Chomczynski & Sacchi (1987). Quality of total RNA was assessed by the presence of clear 28S and 18S rRNA bands using gel electrophoresis. Total RNA was assessed for purity and concentration using UV spectrophotometry in a Victor 3V plate reader (Perkin Elmer). All total RNA preparations had purity values of 1.9–2.1 (A260/A280). Total RNA (10 μg) was reverse transcribed into complementary DNA (cDNA) using anchored oligo dT primers (2.5 μg; Invitrogen) in a reaction containing 1× RT buffer [50 mM Tris–HCl (pH 8.3), 75 mM KCl, 3 mM MgCl2], 5 mM DTT, 400 U of Superscript III (Invitrogen), 40 U RNaseOUT (Invitrogen) and dNTPs including amino-allyl and amino-hexyl modified nucleotides (Invitrogen). Reactions were incubated at 46 °C for 3 h and were terminated by adding 1 μL of 0.5 M EDTA and heating at 95 °C for 3 min. RNA was degraded by adding 15 μL of 1 M NaOH and heating at 70 °C for 10 min, and the reaction was neutralized with 15 μL of 1 M HCl. cDNA was precipitated overnight in a solution of 0.3 M sodium acetate and 75% ethanol at −20 °C.
cDNA labelling and microarray hybridization
Amino-allyl and amino-hexyl modified cDNA was labelled with either Alexa Fluor 555 or 647 (30 μg; Invitrogen) in a freshly prepared coupling buffer (0.3 M NaHCO3, pH 9.0) in the dark for 2 h. Dyes were assigned randomly among individuals but at equal proportions within populations and treatment groups. Labelled cDNA was purified using the PureLink PCR clean-up system (Invitrogen) following the manufacturer's directions, except that elution was performed twice into 40 μL of 10 mM KPO4 (pH 8.5). Labelled cDNA was hybridized to our custom low-density salmonid microarray in a 2× buffer [25% HiDi formamide (Applied Biosystems), 4× SSPE, 0.1% SDS, 4× Denhardt's solution] in the dark for 16 h at 42 °C. Slides were washed once in 2× SSC/0.1% SDS at 42 °C for 3 min followed by one 2 min wash in each of 2× SSC/0.1% SDS, 1× SSC and 0.1× SSC at room temperature. Slides were scanned immediately using a ScanArray Express scanner (Perkin Elmer) and quantified using ScanArray Express Microarray Analysis System software version 4.0 (Perkin Elmer).
Microarray data analysis
We analysed the microarray data as a one-colour (channel) experiment. Individual spots were first excluded if their signal-to-noise ratio was < 2; these spots were not used in the following corrections and normalizations of the data. Spots were background corrected using the ‘normexp’ algorithm with an offset of 50 following Ritchie et al. (2007) in the limma package (Smyth, 2005) in the statistical software R (R Development Core Team, 2011). Between-array normalization was performed using quantile normalization (Smyth & Speed, 2003). We then excluded genes that did not possess at least two data points for an average of at least two individuals per population. This criterion allowed us to analyse 100 genes and 56 genes for response to temperature and immune challenges, respectively. To test for population-specific response to the challenges, we fit the following linear mixed model using maximum likelihood methods in the R package lme4 (Bates et al., 2011).
Here, xajkl is the normalized intensity over replicate spots in the kth block, nested within the jth individual (both random effects); the ath population, the bth treatment and their interaction are fixed effects. Population-specific responses were determined by excluding the interaction term from a second model and testing for a change in model fit using a likelihood ratio test. Maximum likelihood estimation of parameters was used for this test because likelihoods obtained by restricted maximum likelihood are not comparable between models that differ in their fixed effects. To account for multiple tests, we also calculated false discovery rates by randomly permuting the data 100 times and determining the number of genes with significant population-specific responses.
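The likelihood ratio test described above can be written out explicitly. What follows is the standard form of the test rather than a formula taken from the source: the symbols Λ, ℓ, A and B are notation introduced here, and the degrees of freedom assume that the interaction is the only term dropped, with A populations and B treatment levels in a given model.

```latex
\Lambda = 2\left[\ell_{\mathrm{ML}}\big(\text{population} + \text{treatment} + \text{population}\times\text{treatment}\big)
          - \ell_{\mathrm{ML}}\big(\text{population} + \text{treatment}\big)\right],
\qquad
\Lambda \;\overset{\text{approx.}}{\sim}\; \chi^{2}_{(A-1)(B-1)} \quad \text{under } H_0 .
```

Under these assumptions, a model contrasting control and challenged fish (B = 2) across six populations would be tested on 5 degrees of freedom. Both log-likelihoods must come from maximum likelihood fits, since the two models differ in their fixed effects.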
To estimate divergence among populations for transcription at the functional genes we assayed, we again quality filtered the data such that each treatment (environment) only included genes that had on average at least two individuals per population. Thus, we analysed 86, 101 and 49 genes for the control, temperature challenge and immune challenge, respectively (Table S1). We fit separate linear random-effect models (one for each treatment group) implemented in the R package lme4 (Bates et al., 2011). For each gene, we partitioned the variance observed in signal intensity using the following mixed model:
Here, xajkl is the normalized intensity over replicate spots in the kth block, nested within the jth individual, nested within the ath population, all as random effects. These models were fitted using restricted maximum likelihood to provide an unbiased estimate of model parameters. We then used the estimated parameters from those models as priors to calculate the highest posterior density (HPD) for the parameters using Markov chain Monte Carlo simulation (1000 reps) in the R package languageR (Baayen et al., 2008; Baayen, 2011). Median HPD values were used as parameter estimates for the variance components to calculate phenotypic (or functional) divergence estimates. Variance explained by the random population term was taken as the among-population variance component, whereas the residual variance was taken as the within-population variance component. The measure QST, strictly speaking, implies the additive component of genetic variation among groups. In this study, we cannot separate possible environmental influences on phenotype; thus, we denote our phenotypic (transcriptional) divergence as ‘PST’, following Whitlock (2008). PST was calculated using the formula following Whitlock (2008). Loci were assigned to one of three groups: (i) those with PST less than the lower bound of the confidence interval for FST (indicative of stabilizing selection), (ii) those within the confidence interval for FST (indicative of neutral drift) and (iii) those that exceed the upper bound of the confidence interval for FST (indicative of divergent selection).
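The three-way assignment of loci described above amounts to a simple comparison of each PST estimate against the bounds of the FST confidence interval. A minimal sketch follows; the function names and gene labels are hypothetical, and the PST values are assumed to have been estimated beforehand from the variance components.

```python
def classify_pst(pst, fst_lower, fst_upper):
    """Assign one locus to a category by comparing PST with the FST interval."""
    if pst < fst_lower:
        return "stabilizing selection"
    if pst > fst_upper:
        return "divergent selection"
    return "neutral drift"


def summarize(pst_by_gene, fst_lower, fst_upper):
    """Count how many loci fall in each category."""
    counts = {
        "stabilizing selection": 0,
        "neutral drift": 0,
        "divergent selection": 0,
    }
    for pst in pst_by_gene.values():
        counts[classify_pst(pst, fst_lower, fst_upper)] += 1
    return counts


# Hypothetical gene labels and PST values, compared against an interval
# of 0.004-0.134 chosen for illustration.
example = {"gene_a": 0.029, "gene_b": 0.30, "gene_c": 0.002}
print(summarize(example, fst_lower=0.004, fst_upper=0.134))
```

Because the category depends only on the interval bounds, widening the confidence interval (for example from 95% to 99.95%) can only move loci from the two selection categories into the drift category, which is what makes the wider intervals the conservative choice.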
We tested for biases in patterns of divergent selection among functional groups of genes using gene ontology information for the annotated genes on our array. We first classified the biological function (metabolic function, immune response or other; Fig. 2) of each gene using the BLAST mapping and annotation functions in the software package Blast2GO (Conesa et al., 2005). We then used a Kruskal–Wallis test to test for biases in the rank position of PST divergence for metabolic and immune genes in each of the three challenges. If selection on metabolic (or immune) genes were driving the divergence among populations, then we would expect those functional groups to have higher PST values relative to the other functional groups. We also used Kruskal–Wallis tests to determine whether the genes demonstrating population-specific transcriptional response were more highly differentiated than expected by chance.
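The rank-based comparison described above can be illustrated with a small, dependency-free sketch of the Kruskal–Wallis statistic. It assumes that the three functional classes (metabolic, immune, other) are compared in a single test, which the source does not state explicitly; the function names are introduced here, tied values receive average ranks, and no tie correction is applied to H.

```python
from math import exp


def average_ranks(values):
    """1-based ranks of the pooled values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for three groups; p uses the chi-square tail with 2 d.f."""
    groups = [a, b, c]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    h = max(h, 0.0)  # guard against tiny negative values from rounding
    p = exp(-h / 2.0)  # exact upper tail of chi-square with 2 degrees of freedom
    return h, p
```

With exactly three groups, the chi-square approximation has 2 degrees of freedom, for which the upper-tail probability has the closed form exp(−H/2); a comparison with a different number of groups would need a general chi-square routine instead.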
We tested for a pattern of isolation by distance among populations for transcriptional traits, under each treatment, by correlating pairwise PST values and geographic distance using Mantel tests. If gene transcription differences among populations are solely a result of genetic drift then we expect patterns of divergence to correlate with geographic distance because neutral genetic divergence (FST) follows this pattern for these populations (Koehler, 2010). In contrast, if transcriptional differences are a result of selection then it is unlikely that transcription would correlate with geographic distances. Pairwise PST values were calculated for each gene in each pairwise comparison of populations using the same linear model as previously used for PST calculations except each model only included the data for two populations at a time. Finally, we correlated pairwise PST values with pairwise FST values to test whether the magnitude of neutral divergence represented the magnitude of functional divergence.
Results
Tests for departures from HWE detected 19 (40%) locus-by-population comparisons that deviated from expectations after Bonferroni correction. One population (Tachek Creek) accounted for many of these departures, with seven of eight loci deviating from HWE expectations. This was likely the result of nonrandom sampling, as these individuals had to be netted from small pools in a nearly dry Tachek Creek. Otherwise, no particular loci accounted for the departures from HWE expectations. Tests for linkage disequilibrium identified 10 (6%) locus pairs by population that showed evidence of linkage. All loci demonstrated highly significant (P < 0.001) population structure (FST) following permutation tests. Estimates for global FST ranged from 0.025 (Omy325) to 0.082 (OtsG243) with an overall average value of 0.051 (Table 1). The 95% confidence interval for mean FST was determined to be 0.017–0.080, the 99.9% interval was 0.006–0.126 and the 99.95% interval was 0.004–0.134. These additional confidence intervals were calculated to act as a Bonferroni correction for the number of genes for which we calculated PST (described in detail below). Values of PST subsequently determined to be outside the relevant interval indicate divergence among populations that cannot be explained solely by genetic drift. The Mantel test indicated that population divergence at these microsatellite (neutral) loci follows a pattern of isolation by distance (R2 = 0.53, P = 0.039).
Population-specific responses were demonstrated for eight genes (14% of 56 genes) in response to immune challenge and 22 genes (22% of 100 genes) in response to temperature challenge (Fig. 3, Tables S2 and S3). False discovery rates were determined to be 0.037 for the immune challenge and 0.065 for the temperature challenge indicating our results do indeed reflect population-specific differences in gene transcription. Genes demonstrating significant population-specific responses to the immune and temperature challenges reflect genes that are involved in response to the specific challenges we conducted under the timeframe that we sampled. Nonresponsiveness of the remaining genes does not reflect their irrelevance to population divergence in transcription but rather that the signature of selection they demonstrate is related to either selective pressures we have not utilized in our study or differences in the timing of response from what we measured. As such, we report divergence estimates for all genes in each treatment that met our data quality criteria (see 'Materials and methods').
PST values for divergence among populations ranged from 0.029 to 0.30 for 101 genes among the temperature challenge groups, 0.037–0.29 for 49 genes among the immune challenge groups and 0.029–0.30 for 86 genes among the control groups (Fig. 4, Table S1). Because every PST value exceeded the lower bound of the confidence interval for FST, no genes in any treatment group appeared to be under stabilizing selection. For the temperature challenge and control group PST value comparisons with FST, we used a conservative threshold of the 99.95% confidence interval of FST to assess which genes may be evolving by diversifying selection. For the immune challenge PST values (fewer genes and hence comparisons), we used a threshold of the 99.9% confidence interval of FST. These thresholds were chosen to account for the possibility of false positives in PST calculations and are analogous to Bonferroni corrections for the number of loci analysed. We found the divergence estimates to be consistent with expectations associated with divergent selection for 38 (78%), 64 (63%) and 56 (65%) genes in the immune challenge, temperature challenge and control states, respectively. This suggests that selection has contributed to transcriptional divergence among populations for these genes. The remaining 11, 37 and 30 genes analysed for each treatment, respectively, have divergence estimates that are consistent with evolution by drift. Among-population variation accounted for 16.5% (range: 3–30%) of the total transcriptional variance. Partitioning that variation showed that drift explained, on average, 6.4% (range: 3–14%) of total transcriptional variation, whereas selection was more important, explaining on average roughly 1.6 times as much (10.1%, range: 0–24%) of the transcriptional variation.
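The correspondence between these confidence levels and a Bonferroni correction can be checked with a short calculation. The family-wise error rate of 0.05 used below is an assumption made for illustration, as the source does not state it; the gene counts are those reported for each treatment.

```latex
\begin{aligned}
1 - \frac{0.05}{101} &\approx 0.99950 && \text{(temperature challenge, 101 genes)} \\
1 - \frac{0.05}{86}  &\approx 0.99942 && \text{(control, 86 genes)} \\
1 - \frac{0.05}{49}  &\approx 0.99898 && \text{(immune challenge, 49 genes)}
\end{aligned}
```

Under that assumption, the per-gene confidence levels come out close to 99.95% for the temperature challenge and control comparisons and close to 99.9% for the immune challenge, matching the intervals used.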
Results of the Kruskal–Wallis test indicated that neither metabolic genes (P = 0.11) nor immune genes (P = 0.34) were more highly divergent (i.e. higher rank of PST values) than random expectations in their respective challenges, nor were they in the control treatment. In contrast, when we investigated the rank distribution of just the genes demonstrating population-specific transcriptional responses, we found them to be more highly divergent than random expectations in response to the immune challenge (P = 0.02) but not in response to the temperature challenge (P = 0.28). The divergence in gene transcription among populations followed a pattern of isolation by distance for 2 (3.6%), 6 (8.3%) and 1 (2.3%) gene in the control, temperature challenge and immune challenge groups, respectively (Fig. 4). None of the pairwise divergence estimates for genes (PST) tested in any treatment showed significant correlations with FST.
Discussion
We have demonstrated that population divergence in gene transcription is mediated by both drift and selection for Babine Lake tributary juvenile rainbow trout. Among-population variation accounted for 16.5% of the total transcriptional variance, which we partitioned into drift (6.4%) and selection (10.1%). Phenotypic divergence (QST) has previously been used to identify gene transcription under selection (e.g. Roberge et al., 2007; Aykanat et al., 2011). Roberge et al. (2007) utilized a transcriptome scan approach where they identified genes potentially under directional selection as outliers in the upper 1.5% of the QST distribution. They identified 3% of genes putatively evolving by divergent selection. The low percentage of genes they detected reflects the authors' assumption that transcription of most genes is under stabilizing selection and may also reflect the lower level of neutral divergence among their populations (FST = 0.028–0.036; Roberge et al., 2007). When compared with our study, their QST estimates for all genes were, on average, much lower than those presented here; however, when considering only the genes identified in Roberge et al. (2007) as divergent, the range of QST values (0.07–0.19) was comparable to our PST values. Our study is unique in that we have simultaneously estimated functional (PST) divergence of transcriptional traits and neutral (FST) divergence of microsatellites among populations. QST or PST values for gene transcription that exceed the range of FST values for neutral loci indicate that transcription of these genes is more divergent than expected based on genetic drift alone (Whitlock, 2008). This implies that other evolutionary forces are driving the population divergence for transcriptional traits we observed and may reflect the development of local adaptation.
Population-specific transcription profiles have been demonstrated in comparisons among genetically divergent populations of wild Atlantic salmon (Tymchuk et al., 2010). Tymchuk et al. (2010) found that hierarchical clustering of transcription profiles reliably separated populations of Atlantic salmon from the Bay of Fundy and, similar to our results, reported no significant correlation between transcriptional divergence and neutral genetic divergence. The lack of correlation likely reflects the different evolutionary pressures and histories of transcription vs. neutral loci. The direction and strength of selection on transcriptional traits are likely to reflect the environmental conditions of habitats specific to each population (Giger et al., 2006). As such, the hierarchical clustering of transcription traits better reflects environmental differences among populations and thus should not necessarily vary with geographic distance as neutral loci often do. Our results showed that, for the majority of genes, transcription does not follow a pattern of isolation by distance. Because isolation by distance is predicated upon genetic drift – gene flow equilibrium, our isolation by distance analyses provide another avenue of support that the transcriptional divergence pattern is not likely to be primarily due to evolution by drift.
The importance of transcriptional adaptation to speciation processes is highlighted by adaptive differences in gene transcription among salmonid species pairs with different energetic requirements (Derome et al., 2006; St-Cyr et al., 2008). Those studies demonstrated a transcriptional trade-off model for survival and growth-related genes that explains life-history divergence among incipient species. Additional transcriptional adaptations among species utilizing benthic and limnetic niches have been identified using a candidate gene approach to test hypotheses about the energetic requirements for each life history (Jeukens et al., 2009), and evolutionary loss-of-function of transcription has been demonstrated in land-locked, formerly anadromous, steelhead (Aykanat et al., 2011). Those studies highlight the role selection can play in the evolution of transcriptional regulation of metabolic genes. Although our analysis did not identify functional groups of genes that were over- or under-represented in comparisons of drift vs. selection among populations, our array is enriched for genes associated with immune and metabolic functions that have previously been implicated in response to metabolic (e.g. Wiseman et al., 2007) and immune challenges (e.g. Raida & Buchmann, 2008). It is possible that we do not have sufficient power to detect enrichment of these groups of genes in our study because we have a biased sample of genes that have a higher random likelihood of being under selection. We did, however, demonstrate that genes with a population-specific response to immune challenge were more highly differentiated than expected by chance. These genes may represent ideal candidate genes for a more detailed study of local adaptation and possible eventual speciation among the populations in this study.
A large portion (approximately 60%) of the genes we studied appeared to be influenced by divergent selection. Our inability to partition the observed transcriptional variance into genetic and environmental variance components leaves the possibility that previous environmental exposure differences among the sampled populations are influencing our estimates of transcriptional divergence. Environmental contributions to transcriptional variation of metabolic genes in wild-caught F. heteroclitus have been shown to have little or no influence on differences among groups (Scott et al., 2009). To reduce the effects of previous environment in our study, we allowed the fish time (> 5 days) to acclimate to hatchery conditions; in addition, we primarily assayed genes that are known to respond rapidly to stress and return to baseline levels within the time period allowed for acclimation. Multiple studies have demonstrated additive genetic variation for gene transcription (Brem & Kruglyak, 2005; Roberge et al., 2007; Aykanat et al., 2012b), suggesting that the transcriptional differences we report represent primarily functional additive genetic differences among populations, indicative of local adaptation. Nevertheless, a number of studies have documented substantial nonadditive genetic variation contributing to transcriptional trait variation in salmonids (Roberge et al., 2008; Mavarez et al., 2009; Normandeau et al., 2009) and other species (reviewed by Gibson & Weir, 2005). Nonadditive genetic effects have the potential to be adaptive in salmonids (Aykanat et al., 2012a), so even if our phenotypic divergence estimates (PST) reflect nonadditive genetic variation, they may still represent adaptive differences among populations. It is also possible that we measured acclimation effects, but acclimation ability may be adaptive (e.g. Marais & Chown, 2008) for the local environments we studied, and the fact that we demonstrated differences for a large number of genes with a diversity of putative functions suggests that it is unlikely all of the differences observed are simply the result of acclimation effects. Ideally, future work would include a controlled breeding experiment among tributary populations to assess the relative contributions of additive genetic and environmental variation to observed transcriptional differences among populations.
The large discrepancy in sample size used for estimating PST and FST reflects the large number of individuals required to reliably estimate allele frequencies for highly polymorphic genetic markers such as the microsatellites used in this study (e.g. Hale et al., 2012). Nevertheless, we acknowledge that the ideal number of samples for accurately estimating the variance of a continuous trait, such as gene transcription, is larger than four. To explore possible biases associated with the sample size discrepancy, we randomly sampled four individuals from each of our populations and calculated global FST for this reduced sample size. We replicated this analysis 1000 times, which resulted in a mean FST (±SD) of 0.049 (0.019) that is very similar to the global FST for the whole data set (mean = 0.051, 95% CI = 0.017–0.080), suggesting that the different sample sizes should not be biasing our results. As the cost of quantifying gene transcription is reduced, assaying larger sample sizes will become more economical and provide increased power for detecting population-level differences in transcription. We believe that the approach presented here allows a broader assessment of the evolutionary processes governing gene transcription evolution and aids in determining the relationship between neutral genetic population structure and variation in functional traits.
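The resampling check described above can be sketched as follows. This is a minimal illustration, not the analysis actually run: it swaps in a simple heterozygosity-based estimator, FST = (HT − HS)/HT summed over loci, whereas the estimator and software used in this study are not stated in this section. The function names, the genotype array layout and the toy data are hypothetical, and this simple estimator is biased upward at small sample sizes, so it is not expected to reproduce the reported values.

```python
import numpy as np

def global_fst(genotypes):
    """Hypothetical helper: global FST as (HT - HS) / HT summed over loci.

    genotypes: list with one integer array per population, each shaped
    (n_individuals, n_loci, 2) and holding allele codes 0, 1, 2, ...
    """
    n_alleles = max(int(g.max()) for g in genotypes) + 1
    n_loci = genotypes[0].shape[1]
    hs_sum = 0.0
    ht_sum = 0.0
    for locus in range(n_loci):
        freqs = []
        for g in genotypes:
            counts = np.bincount(g[:, locus, :].ravel(), minlength=n_alleles)
            freqs.append(counts / counts.sum())
        freqs = np.array(freqs)
        # Mean within-population expected heterozygosity.
        hs_sum += np.mean(1.0 - np.sum(freqs ** 2, axis=1))
        # Expected heterozygosity of the pooled (mean) allele frequencies.
        ht_sum += 1.0 - np.sum(freqs.mean(axis=0) ** 2)
    return (ht_sum - hs_sum) / ht_sum if ht_sum > 0 else float("nan")

def subsampled_fst(genotypes, n_sub=4, n_rep=1000, seed=0):
    """Draw n_sub individuals per population without replacement, compute
    global FST, repeat n_rep times and return the mean and SD."""
    rng = np.random.default_rng(seed)
    values = np.empty(n_rep)
    for r in range(n_rep):
        sub = [g[rng.choice(g.shape[0], size=n_sub, replace=False)]
               for g in genotypes]
        values[r] = global_fst(sub)
    return values.mean(), values.std(ddof=1)

# Toy usage: three simulated populations, 30 fish, 5 loci, 6 alleles per locus.
rng = np.random.default_rng(1)
pops = [rng.integers(0, 6, size=(30, 5, 2)) for _ in range(3)]
print(global_fst(pops), subsampled_fst(pops, n_sub=4, n_rep=200))
```

Comparing the subsampled mean with the full-data estimate, as done in the text, indicates whether the reduced sample size alone shifts the neutral baseline against which PST is judged.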
In summary, we demonstrated transcriptional divergence among tributary populations of juvenile rainbow trout in Babine Lake, despite no physical barrier to dispersal among the tributaries, and adults from all tributaries residing together in the lake. Comparisons of the patterns and magnitude of transcriptional divergence for genes with estimates of neutral divergence highlight the roles of both drift and selection in driving population structure for functional traits. Furthermore, the action of selection on the transcriptome demonstrated in this study identifies a possible genetic mechanism for the process of local adaptation. Divergence among natural populations for functional traits gives real weight to neutral marker population structure for conservation and management of species. This work reinforces the need to conserve salmonids at the population or tributary level to preserve diversity and adaptive potential in functional traits and also highlights the importance of transcriptional evolution to the processes of local adaptation and speciation.
We thank Dana Atagi, Jeff Lough, Joe De Gisi and Paddy Hirschfield at the BC Ministry of Forests, Lands and Natural Resource Operations in Smithers, BC for their logistical support in the field and assistance in the collection of samples. We thank Brad Thompson for providing access to DFO's Fulton River spawning channels for space to conduct the experiments. We also thank Matt Ouellette for collaborating in the development of the microarray and the Environmental Genomics Facility at the University of Windsor for technical assistance. The comments of two anonymous reviewers greatly improved this manuscript.