Parallel (or convergent) evolution provides strong evidence for a deterministic role of natural selection: similar phenotypes evolve when independent populations colonize similar environments. In reality, however, independent populations in similar environments always show some differences: some nonparallel evolution is present. It is therefore important to explicitly quantify the parallel and nonparallel aspects of trait variation, and to investigate the ecological and genetic explanations for each. We performed such an analysis for threespine stickleback (Gasterosteus aculeatus) populations inhabiting lake and stream habitats in six independent watersheds. Morphological traits differed in the degree to which lake–stream divergence was parallel across watersheds. Some aspects of this variation were correlated with ecological variables related to diet, presumably reflecting the strength and specifics of divergent selection. Furthermore, a genetic scan revealed some markers that diverged between lakes and streams in many of the watersheds and some that diverged in only a few watersheds. Moreover, some of the lake–stream divergence in genetic markers was associated with some of the lake–stream divergence in morphological traits. Our results suggest that parallel evolution, and deviations from it, are primarily the result of natural selection, which corresponds in only some respects to the dichotomous habitat classifications frequently used in such studies.
Although parallel or convergent evolution is commonly reported, it is not all pervasive. For example, independent “replicates” of a particular ecotype or ecomorph often show substantial differences in phenotype, as is clear in the above-cited studies and also in more recent work (Landry and Bernatchez 2010; Ożgo 2011; Romero 2011; Rosenblum and Harmon 2011). In the more extreme cases, mean phenotypes of different populations in two environment types show considerable overlap: that is, divergence is sometimes in the “wrong,” meaning unexpected, direction. These nonparallel or nonconvergent aspects to evolution should not be surprising for a number of reasons that we here group into four broad categories: ecological, genetic, functional, and sexual. Although only the first two will be considered in the present study, we introduce them all so as to provide a general overview.
The ecological explanation for phenotypic nonparallelism or nonconvergence is that the classification of environments into discrete types (e.g., high vs. low predation, benthic vs. limnetic, cave vs. surface) ignores selective factors that differ among replicates of a given environment “type.” This explanation can be evaluated by relating trait differences among populations to quantitative differences in important ecological variables, such as prey availability (e.g., Landry and Bernatchez 2010), diet (e.g., Schluter and McPhail 1992; Berner et al. 2008), or the type and abundance of predators (e.g., Endler 1978; Millar et al. 2006). For example, a given predation “regime” (high vs. low) might differ among locations in the type and number of predators (Endler 1978; Reimchen 1994; Reznick et al. 1996), as well as in other environmental factors (Endler 1978; Grether et al. 2001).
The genetic explanation for phenotypic nonparallelism or nonconvergence is that variation among populations in genetic architecture can bias responses to selection. In some cases, this bias might make the difference between evolution toward alternative fitness peaks on a phenotypic adaptive landscape (Schluter 2000; Losos 2009). The potential genetic effects are several. First, different founding populations might have different genetic architectures that favor different evolutionary trajectories (Cohan and Hoffmann 1989; Schluter 1996; de Brito et al. 2005; Simoes et al. 2008). Second, evolution in different populations could occur via different new mutations at the same gene (e.g., Chan et al. 2010) or different genes (e.g., Steiner et al. 2009), either of which could have different phenotypic effects. Third, some populations might be influenced by genetic drift after their founding (Barton and Charlesworth 1984; de Brito et al. 2005). Fourth, replicate populations in a given environment might experience different levels of gene flow from populations in other environments (Hendry and Taylor 2004; Bolnick and Nosil 2007). Fifth, populations can differ in the extent to which phenotypes are plastic (Wund et al. 2008), which could increase or decrease trait parallelism or convergence beyond that expected from genotypes. These possibilities can be assessed by relating phenotypic differences to various types of genetic variation.
The main functional explanation for phenotypic nonparallelism or nonconvergence is that fitness is more closely related to integrated aspects of organismal “performance,” such as swimming speed or foraging ability, than it is to any of the traits that contribute to performance (Arnold 1983; Walker 2007; Irschick et al. 2008). Hence, when different trait combinations can generate equivalent performance, parallelism or convergence in function need not entail parallelism or convergence in traits (Alfaro et al. 2005; Wainwright et al. 2005). Indeed, this is one potential source of the alternative fitness peaks mentioned earlier (Schluter 2000; Losos 2009). This explanation can be assessed by relating performance variation to trait variation (e.g., Alfaro et al. 2004, 2005).
The sexual explanation for phenotypic nonparallelism or nonconvergence recognizes that some traits under natural selection are also influenced by sexual selection. These two aspects of selection sometimes can be closely aligned, in which case the contributions of sexual selection could simply reflect and enhance ecological differences. In other cases, however, sexual selection might diverge among populations for reasons not directly related to ecology, such as in sexual conflict (Chapman et al. 2003). The result could be divergence in traits in ways that are not easily predictable from ecological differences. This explanation can be assessed by relating phenotypic differences to measures of, or surrogates for, both natural and sexual selection (e.g., Svensson et al. 2006).
The above explanations for nonparallelism or nonconvergence are not mutually exclusive. Consequently, although parallel or convergent evolution is a strong hallmark of adaptation, nonparallelism or nonconvergence could reflect adaptation, neutrality, or maladaptation. The typical overemphasis on parallel or convergent aspects of phenotypic evolution has thus led to missed opportunities for valuable insights into evolutionary process. Fortunately, recent studies that specifically consider nonparallelism or nonconvergence are beginning to generate such insights (e.g., Berner et al. 2008; Landry and Bernatchez 2010; Rosenblum and Harmon 2011; Ożgo 2011).
The present study focuses on lake and stream stickleback from six watersheds on Vancouver Island, British Columbia, Canada. Taking advantage of these independent instances of divergence between lake and stream habitats, we quantify the extent and nature of divergence in ecology (diet), morphology (body shape, armor, and gill rakers), and genetics (microsatellite markers). Although all of our morphological data come from wild-caught individuals, we have reason to believe that the observed patterns have a strong genetic basis, as will be outlined below. In addition, we examine how these different aspects of variation are correlated with each other, with an eye toward informing the drivers of phenotypic divergence and something of its genetic basis.
Material and Methods
POPULATIONS AND SAMPLING
In May 2008, we collected threespine stickleback from paired lake and stream sites in each of six independent watersheds on Vancouver Island, British Columbia, Canada (12 sites in total). The specific sites (Fig. 1 and Appendix S1) were chosen based on prior evidence of (1) strong lake–stream divergence in morphology, and (2) independent origins from marine ancestors (Hendry and Taylor 2004; Berner et al. 2008, 2009). For each site, we used unbaited minnow traps to capture and retain 40 individuals (Appendix S1). These fish were euthanized with MS222, and their left side was photographed with a digital camera (Nikon D100; Nikon Inc., New York). We then weighed each fish (±0.01 g) and preserved part of the pectoral fin in 95% ethanol. To enable isotope analyses (see below), white muscle tissue from the back of each fish was preserved on ice and later dried in an oven for 36 h at 72˚C. Each fish was also dissected to determine sex and to remove the stomach for diet analysis. The stomach (and its contents) and the remaining carcass were preserved separately in 95% ethanol.
We estimated ecological divergence between lake and stream fish based on their diet. This indirect approach to inferring divergent selection was chosen because comparisons of prey “availability” are compromised by the necessarily different sampling procedures in lakes versus streams. Our use of this indirect method is supported by studies showing strong associations between prey availability, diet, and morphological traits in lake stickleback (Gross and Anderson 1984; Schluter and McPhail 1992; Robinson 2000; Ingram et al. 2011). In short, diets are a good way of capturing the divergent selection pressures that ultimately drive morphological evolution in threespine stickleback (Schluter and McPhail 1992; Bolnick et al. 2008; Snowberg and Bolnick 2008; Berner et al. 2008, 2009). To consider short-term diets, food items from each stomach were categorized as either limnetic or benthic following Schluter and McPhail (1992). For each individual, we then estimated the “proportion of limnetic prey” (PLP) as the number of limnetic prey items divided by the total number of identified prey items (limnetic plus benthic) (Schluter and McPhail 1992). Following previous studies (Berner et al. 2008, 2009), we expected limnetic prey to be more important (relative to benthic prey) for lake populations than for stream populations.
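As a minimal sketch of the PLP calculation described above, the ratio can be written as a small Python function. The function name and the example counts are hypothetical and are not taken from the study's data.

```python
def proportion_limnetic_prey(n_limnetic: int, n_benthic: int) -> float:
    """PLP = limnetic items / (limnetic + benthic) identified items."""
    total = n_limnetic + n_benthic
    if total == 0:
        # An empty stomach (or one with no identifiable prey) has no defined PLP.
        raise ValueError("no identified prey items; PLP is undefined")
    return n_limnetic / total


# Hypothetical stomach with 6 limnetic and 2 benthic items.
print(proportion_limnetic_prey(6, 2))  # 0.75
```

Stomachs without identified prey are rejected here rather than scored as zero; how such individuals were actually handled is not stated in the text.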
To consider long-term diets and ecological niches (i.e., resource and habitat), we used stable isotopes (Post 2002; Newsome et al. 2007). We specifically estimated (1) the relative importance of different sources of primary production for each individual (δ13C), and (2) the trophic position of each individual (δ15N). This inferential method is commonly applied in studies of stickleback diet divergence (Bolnick et al. 2008; Reimchen et al. 2008; Snowberg and Bolnick 2008; Matthews et al. 2010). Weighed and dried muscle samples were sent to the UC Davis Stable Isotope Facility, where isotopes were measured with a PDZ Europa ANCA-GSL elemental analyzer, interfaced to a PDZ Europa 20–20 isotope ratio mass spectrometer (Sercon Ltd., Cheshire, UK). The resulting δ13C and δ15N values were expressed relative to international standards: PDB (PeeDee Belemnite) for carbon and air for nitrogen. Lipid often has a lower δ13C than do other biochemical tissues (Post et al. 2007; Logan et al. 2008), and so we lipid-corrected our δ13C values using Post et al. (2007, eq. 3): $\delta^{13}\mathrm{C}_{\text{corrected}} = \delta^{13}\mathrm{C}_{\text{untreated}} - 3.32 + 0.99 \times \mathrm{C{:}N}$, where C:N is the ratio of the mass of carbon to nitrogen.
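The lipid correction above is a simple linear adjustment, sketched here in Python. The function name and the example values are hypothetical and serve only as illustration.

```python
def lipid_correct_d13c(d13c_untreated: float, c_to_n: float) -> float:
    """Lipid-corrected d13C following the linear form quoted from Post et al. (2007, eq. 3).

    d13c_untreated: measured d13C of the untreated sample (per mil)
    c_to_n: mass ratio of carbon to nitrogen in the sample
    """
    return d13c_untreated - 3.32 + 0.99 * c_to_n


# Hypothetical muscle sample: d13C = -28.0 per mil, C:N = 3.4
corrected = lipid_correct_d13c(-28.0, 3.4)
print(round(corrected, 3))
```

Because the correction depends on C:N, the adjustment grows with lipid content and approaches zero when C:N is close to 3.35.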
Using the isotope data, we estimated the trophic position of each individual and the mean trophic position for each population. We controlled for environmental sources of variation in carbon and nitrogen by analyzing snails and mussels and then applying the appropriate formula in Post (2002). Specifically, $T_{\text{pos}} = \lambda_{\text{base}} + [\delta^{15}\mathrm{N}_{\text{fish}} - (\alpha_{\text{littoral}} \times \delta^{15}\mathrm{N}_{\text{snails}} + (1 - \alpha_{\text{littoral}}) \times \delta^{15}\mathrm{N}_{\text{mussels}})] / \Delta\mathrm{N}$, where $\alpha_{\text{littoral}}$ is the proportion of littoral carbon for an individual, $\alpha_{\text{littoral}} = (\delta^{13}\mathrm{C}_{\text{fish}} - \delta^{13}\mathrm{C}_{\text{mussels}}) / (\delta^{13}\mathrm{C}_{\text{snails}} - \delta^{13}\mathrm{C}_{\text{mussels}})$, $\lambda_{\text{base}}$ is the trophic position of the baseline primary consumer ($\lambda_{\text{base}} = 2$), and $\Delta\mathrm{N}$ is the trophic fractionation ($\Delta\mathrm{N} = 3.4$‰, from Post 2002). In the few sites where mussels or snails were unavailable (mussels: Misty Inlet; snails: Misty Lake and Inlet and Beaver Lake), we applied a correction based on slopes and intercepts obtained from the other populations (see Appendix S2 for details). Given that trophic position is essentially a linear transformation, or “corrected” measure, of δ15N, we present the results for trophic position but not δ15N. Given the expectation of different stickleback diets in lakes and streams (see above), we expected consistent divergence in stable isotopes between lake and stream stickleback.
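The two-baseline trophic position formula above can be sketched in Python as follows. The function names and all isotope values in the example are hypothetical; the defaults for the baseline trophic position (2) and the fractionation (3.4‰) are the ones stated in the text. The site-specific correction for missing baselines (Appendix S2) is not reproduced here.

```python
def littoral_proportion(d13c_fish: float, d13c_snails: float, d13c_mussels: float) -> float:
    """Proportion of littoral carbon: (fish - mussels) / (snails - mussels)."""
    return (d13c_fish - d13c_mussels) / (d13c_snails - d13c_mussels)


def trophic_position(
    d15n_fish: float,
    d13c_fish: float,
    d15n_snails: float,
    d13c_snails: float,
    d15n_mussels: float,
    d13c_mussels: float,
    lambda_base: float = 2.0,  # trophic position of the baseline primary consumers
    delta_n: float = 3.4,      # trophic fractionation of d15N (per mil)
) -> float:
    """Trophic position from a littoral (snail) and pelagic (mussel) baseline."""
    alpha = littoral_proportion(d13c_fish, d13c_snails, d13c_mussels)
    # Baseline d15N is a mixture of the two primary consumers, weighted by alpha.
    baseline_d15n = alpha * d15n_snails + (1.0 - alpha) * d15n_mussels
    return lambda_base + (d15n_fish - baseline_d15n) / delta_n


# Hypothetical individual: halfway between the two carbon baselines (alpha = 0.5).
tpos = trophic_position(
    d15n_fish=10.1, d13c_fish=-27.0,
    d15n_snails=4.0, d13c_snails=-24.0,
    d15n_mussels=6.0, d13c_mussels=-30.0,
)
print(round(tpos, 2))  # 3.5
```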
We chose several morphological traits frequently subject to divergent selection between freshwater environments: body shape, armor traits (plates and spines), and trophic traits (gill raker number and length). Although plasticity can contribute to these traits (e.g., Sharpe et al. 2008; Wund et al. 2008), the patterns seen in wild-caught fish are expected on several grounds to have a strong genetic basis. First, three independent sets of experiments with stickleback from the Misty watershed have shown that lake–stream differences in body shape, spine length, and gill raker traits are almost perfectly maintained through multiple generations in the laboratory (Lavin and McPhail 1993; Hendry et al. 2002, 2011; Sharpe et al. 2008; Berner et al. 2011). Second, a large genetic component for differences in these same traits has been documented in a number of other threespine stickleback comparisons, such as benthic versus limnetic (e.g., Hatfield 1997) and freshwater versus anadromous (e.g., Schluter et al. 2004). Third, QTL studies have found strong genetic determinants for population differences in these same traits (e.g., Peichel et al. 2001; Colosimo et al. 2004; Cresko et al. 2004; Shapiro et al. 2004; Albert et al. 2008). We will thus proceed under the assumption that among-population patterns have a strong genetic basis, while acknowledging that we cannot provide formal confirmation of this beyond the Misty system.
We counted lateral plates on the left side of each fish and used digital calipers to measure the lengths of the left pelvic spine and the first and second dorsal spines along the anterior side of the spine (Fig. 2). Each trait was measured three times, and repeatability was high (correlation between pairs of measurements was ≥ 0.98). To further ensure accuracy, the mean of the three measurements was used for analysis.
Gill raker traits
Gill raker traits are closely related to diet: gill raker length and number are often lower for fish foraging more often on benthic (macro-invertebrate) prey in littoral habitats than for fish foraging more often on limnetic (zooplankton) prey in the open water of lakes (Bentzen and McPhail 1984). These differences are thought to arise because longer and more numerous gill rakers are better suited for feeding on smaller food items, whereas fewer and shorter gill rakers are better suited for feeding on larger food items (Hagen and Gilbertson 1972; Bentzen and McPhail 1984; Schluter and McPhail 1992; Bolnick 2004; Matthews et al. 2010). Given that lake and stream stickleback differ in their degree of limnetic versus benthic foraging, it is not surprising that they often show divergence in gill raker traits (Hendry and Taylor 2004; Berner et al. 2008, 2009). Based on this previous work, we expected consistent lake–stream divergence in gill raker traits, although previous work suggested it would be stronger for gill raker number than for gill raker length (Hendry and Taylor 2004; Berner et al. 2008, 2009).
Following Berner et al. (2008, 2009), we counted the gill rakers on the left gill arch and measured the length of the second to fourth gill raker from the epibranchial-ceratobranchial joint on the ceratobranchial. Each gill raker was measured three times at 45× magnification on a stereomicroscope with a micrometer (maximal precision of 0.01 mm). The mean was calculated from the three measurements per gill raker and across the three gill rakers. Each trait was measured three times, and repeatability was very high (correlation between pairs of measurements was ≥ 0.98). To further ensure accuracy, the mean of the three measurements was used for analysis.
Trait standardization and divergence metrics
Some of the measured traits are correlated with body size, and so allometric size standardizations were applied. For body shape, this was done by including centroid size as a covariate in the analyses. For the other traits, this was done by calculating $M_{\text{STD}} = M_0 \, (63.53 / CS_0)^{b}$ (Reist 1986), where b is the common within-group slope from ANCOVA of log10-transformed trait size on log10-transformed centroid size, $M_0$ is the measured trait size, $CS_0$ is body (centroid) size, and 63.53 is the mean centroid size across all individuals in the dataset. The response variables in ANCOVA were log10-transformed trait sizes, and the predictor variables were population, centroid size, and their interaction. The interactions were nonsignificant in all cases and so were removed to calculate b (Reist 1986).
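The standardization above can be sketched in Python. The sketch assumes that, once the interaction is dropped, the common within-group slope equals the pooled within-population sum of cross-products divided by the pooled within-population sum of squares of log centroid size, which is the least-squares slope of an ANCOVA with a population factor and one covariate. Function names and the example data are hypothetical.

```python
import math
from collections import defaultdict


def common_within_group_slope(groups, log_size, log_trait):
    """Pooled within-group slope of log trait on log centroid size."""
    by_group = defaultdict(list)
    for g, x, y in zip(groups, log_size, log_trait):
        by_group[g].append((x, y))
    sxy = 0.0
    sxx = 0.0
    for points in by_group.values():
        mean_x = sum(x for x, _ in points) / len(points)
        mean_y = sum(y for _, y in points) / len(points)
        # Deviations are taken from each group's own means, so intercepts may differ.
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in points)
        sxx += sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


def size_standardize(trait, centroid_size, b, mean_centroid_size=63.53):
    """M_STD = M_0 * (mean centroid size / CS_0) ** b."""
    return trait * (mean_centroid_size / centroid_size) ** b


# Hypothetical data: two populations sharing an allometric slope of 0.8.
groups = ["A", "A", "A", "B", "B", "B"]
sizes = [50.0, 60.0, 70.0, 55.0, 65.0, 75.0]
traits = [0.1 * s ** 0.8 for s in sizes[:3]] + [0.2 * s ** 0.8 for s in sizes[3:]]

b = common_within_group_slope(
    groups,
    [math.log10(s) for s in sizes],
    [math.log10(t) for t in traits],
)
standardized = [size_standardize(t, s, b) for t, s in zip(traits, sizes)]
print(round(b, 3), [round(v, 3) for v in standardized])
```

In this constructed example the standardized values become identical within each population, because all within-population variation was generated by size alone.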
Statistically, we first examined how the total morphological variation could be partitioned among effects of watershed, habitat type (lake or stream), body size (centroid), and all possible interactions. Habitat type and watershed were both fixed effects, as this allowed more direct comparison of effect sizes and because the specific sites were chosen based on a priori information (see above). The analysis involved two multivariate analyses of covariance (MANCOVAs), one for body shape and one for the linear measurements. For body shape, response variables were the two uniform components and the 16 partial warps. For linear measurements, response variables were gill raker size, gill raker number, pelvic spine length, and plate number. We then also used two-way ANOVA to estimate the effect of watershed and habitat on several key traits: relative warp 1 (RelW1) from the body shape data (more details below), gill raker number, gill raker length, plate number, and dorsal and pelvic spine length. In all of these analyses, the traits were size-standardized as described above.
We next estimated the proportion of the variance among populations relative to the total variation. This estimate was similar to Qst (Lande 1992; Spitze 1993), but was based on phenotypic measurements and so is referred to as Pst. The calculation follows $P_{\text{st}} = \delta^2_{GB} / (\delta^2_{GB} + \delta^2_{GW})$, where $\delta^2_{GB}$ and $\delta^2_{GW}$ are the between-population and within-population variance components for a given trait. Pst has particular value because it allows direct comparisons across traits and population types. First, it is directly analogous to the common genetic divergence metric Fst. Second, it is a unit-less proportional measure and so is easy to compare across variables and populations with very different units, measurement scales, and levels of within-population variance. Estimation of variance components was done in R using the lmer function where the responses were the phenotypic traits and population was a random effect.
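To make the Pst ratio concrete, the following Python sketch estimates the two variance components with the classical one-way random-effects ANOVA (method-of-moments) estimator for a balanced design. This is a swapped-in technique for illustration, not the lmer fit used in the study, and the function name and example values are hypothetical.

```python
from statistics import mean


def pst_balanced(samples):
    """Pst = var_between / (var_between + var_within) for equal-sized samples.

    samples: dict mapping population name -> list of trait values.
    Variance components come from one-way ANOVA mean squares.
    """
    groups = list(samples.values())
    k = len(groups)
    n = len(groups[0])
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 populations with equal sample sizes >= 2")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)

    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))

    # E[MS_between] = var_within + n * var_between; truncate negative estimates at 0.
    var_between = max((ms_between - ms_within) / n, 0.0)
    var_within = ms_within
    total = var_between + var_within
    if total == 0:
        raise ValueError("no variation in the data; Pst is undefined")
    return var_between / total


# Hypothetical lake-stream pair with three fish per population.
pair = {"lake": [1.0, 2.0, 3.0], "stream": [5.0, 6.0, 7.0]}
print(round(pst_balanced(pair), 4))  # 0.8846
```

A mixed-model fit would normally be preferred for unbalanced samples; the moment estimator is shown only because it keeps the arithmetic behind the ratio visible.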
DNA was extracted from the fin samples using the Wizard® SV 96 Genomic DNA Purification System (Promega) kit. DNA concentration was then measured using a Nanodrop spectrophotometer and diluted to 4 ng·μl−1. To first identify candidate chromosomal regions that might be linked to phenotypic differences between lake and stream populations, we pooled the DNA of all individuals within a given population (12 pools) and screened for genetic variation at 192 microsatellite loci distributed across the stickleback genome (Peichel et al. 2001). From these screens, we selected 12 loci that showed little overlap in allele size distributions within at least two of the six population pairs (see Gow et al. 2006 for further explanation and rationale of this method). We then confirmed these hypothesized differences by genotyping eight individuals from each population at those loci. Six loci (Stn45, Stn168, Stn232, Stn246, Stn321, and Stn386; Appendix S3) were found to have little overlap in allele size distributions and were then genotyped on all the individuals for all populations. Chromosome regions singled out in this type of scan are likely under divergent selection (Stinchcombe and Hoekstra 2008), but this method will not detect all such regions, particularly because of the relatively small number of loci (192). As a result, we consider this a first step to quantifying the genetic basis of divergence in lake–stream stickleback.
For ease of presentation, we refer to the above loci as “selected loci,” while remembering that they are probably not themselves under selection but rather could be linked to a chromosomal region under selection. We next established a null expectation for genetic divergence by screening all individuals at six microsatellite loci (Stn34, Stn67, Stn87, Stn159, Stn199, and Stn234; Appendix S3) not tightly linked to any known quantitative trait locus (QTL) and thus likely to experience relatively weak selection. For ease of presentation, we refer to these loci as “neutral loci”, while remembering that it remains possible they might be linked to as yet unknown genes or QTL subject to disruptive selection.
For both selected and neutral loci, we used FSTAT (Goudet 1995) to estimate allelic richness for each population. We then used GENEPOP version 4 (Rousset 2008) to test each locus within each population for deviations from Hardy–Weinberg equilibrium (heterozygote deficiency), and to test each locus pair within each population for linkage disequilibrium. Using Genetix (Belkhir et al. 2004), we estimated average heterozygosity of the neutral loci in each population and the overall average for each watershed. Also, between the lake and stream populations within each watershed, we estimated the average θ (multilocus Fst) across all neutral loci (Weir and Cockerham 1984). As a final conservative screen for those loci most likely under the influence of divergent selection between lake and stream populations, we examined whether θ for a given “selected” locus in a given watershed fell outside the 95% CI for the average θ at neutral loci within that watershed. Despite the abundance of Fst outlier tests in the literature, the small number of loci used in this study precludes methods with more complex neutral models, such as Dfdist (Beaumont and Balding 2004).
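The final screen described above reduces to comparing a single-locus θ with the neutral 95% CI for the same watershed. A minimal Python sketch follows; the function name, the labels it returns, and the numbers in the example are hypothetical, and the CI itself is taken as given rather than recomputed.

```python
def classify_locus(theta: float, ci_low: float, ci_high: float) -> str:
    """Compare one locus's theta with the neutral-locus 95% CI for a watershed."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if theta > ci_high:
        return "higher"   # candidate for divergent selection
    if theta < ci_low:
        return "lower"
    return "within"       # not distinguishable from the neutral expectation


# Hypothetical watershed with a neutral 95% CI of 0.05 to 0.15.
for theta in (0.30, 0.10, 0.01):
    print(theta, classify_locus(theta, 0.05, 0.15))
```

The screen loses power when the neutral interval is already wide or high, because fewer single-locus values can exceed its upper bound.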
ECOLOGICAL, MORPHOLOGICAL, AND GENETIC CORRELATIONS
The above analyses yielded, for each watershed, estimates of lake–stream divergence based on Pst for each of six morphological traits (RelW1, gill raker length and number, dorsal and pelvic spine length, and plate number) and Fst for each of 12 loci. To estimate a comparable ecological divergence metric, we estimated “Est” within each watershed as the proportion of variance in three ecological variables (proportion of limnetic prey, δ13C, and trophic position) attributable to differences between lake and stream environments. As traditionally used, these variance ratios (Pst, Fst, and here also Est) do not indicate the “direction” of divergence. We therefore added a sign to Pst and Est to indicate whether the direction of divergence within a given lake–stream pair was the same as (positive) or the opposite of (negative) the typical direction of divergence across watersheds. For example, gill rakers are generally more numerous in lake than stream fish, and so Pst values for pairs diverging in that (vs. the other) direction would be positive (vs. negative).
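The sign convention described above can be sketched in Python. The sketch takes the "typical" direction to be the most common sign of the lake-minus-stream difference across watersheds and resolves ties in favor of lake greater than stream; that tie rule is an assumption made here, as are the function name, the watershed labels, and the values.

```python
def signed_divergence(unsigned, lake_minus_stream):
    """Attach a sign to Pst (or Est) values by watershed.

    unsigned: dict watershed -> unsigned variance ratio (0 to 1)
    lake_minus_stream: dict watershed -> lake mean minus stream mean for the trait
    Returns a dict watershed -> signed value (negative = atypical direction).
    """
    signs = {w: (1 if d >= 0 else -1) for w, d in lake_minus_stream.items()}
    # Majority direction across watersheds; a tie defaults to lake > stream (assumption).
    typical = 1 if sum(signs.values()) >= 0 else -1
    return {
        w: value if signs[w] == typical else -value
        for w, value in unsigned.items()
    }


# Hypothetical trait: two watersheds diverge as lake > stream, one diverges the other way.
pst = {"W1": 0.4, "W2": 0.2, "W3": 0.1}
diff = {"W1": 1.2, "W2": 0.5, "W3": -0.3}
print(signed_divergence(pst, diff))  # {'W1': 0.4, 'W2': 0.2, 'W3': -0.1}
```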
We next calculated the correlation across watersheds (N = 6 datapoints for each correlation) between each measure of ecological divergence and each measure of morphological divergence (18 correlations), each measure of ecological divergence and each measure of genetic divergence (36 correlations), and each measure of morphological divergence and each measure of genetic divergence (72 correlations). With 126 correlations to consider, each with relatively low power, tests of statistical significance are questionable and corrections for multiple comparisons are not feasible. We therefore treat these correlations not as hypothesis tests, but rather as an investigative tool to identify promising associations that could be the subject of future research. We suggest that a reasonable way to select these associations is to consider those where at least half of the variance in one could be explained by the other ($r^2 = 0.5$, $|r| \approx 0.707$). For completeness, we do also report individual P values for these correlations.
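The bookkeeping above (three ecological variables, six morphological traits, and 12 loci) and the selection threshold can be sketched in Python. The Pearson correlation is written out by hand to keep the block dependency-free; function names and the example vectors are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


THRESHOLD = math.sqrt(0.5)  # |r| at which r^2 reaches 0.5


def is_promising(x, y):
    """True when at least half of the variance in one variable is explained by the other."""
    return abs(pearson_r(x, y)) >= THRESHOLD


# Number of pairwise correlations among the three sets of divergence measures.
n_eco, n_morph, n_loci = 3, 6, 12
counts = (n_eco * n_morph, n_eco * n_loci, n_morph * n_loci)
print(counts, sum(counts))  # (18, 36, 72) 126

# Hypothetical divergence values across six watersheds.
eco = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
morph_a = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]      # perfectly correlated with eco
morph_b = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]   # weakly correlated with eco
print(is_promising(eco, morph_a), is_promising(eco, morph_b))  # True False
```

With only six datapoints per correlation, the threshold is a descriptive filter rather than a significance test, which matches how the correlations are treated in the text.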
All ecological variables differed among watersheds and between lake and stream habitats, with a significant interaction between watershed and habitat (Table 1; Fig. 3). In each case, at least 50% of the total variation could be explained jointly by these three terms, with the strongest contributions from the watershed and habitat main effects (Table 1). Relative to stream stickleback, lake stickleback ate more limnetic prey, had higher δ13C, and had a higher trophic position. The interaction term was not as strong because lake–stream divergence in these ecological variables was in the same direction in all watersheds (Table 1 and Fig. 3). Instead, any deviations from parallel patterns of divergence were the result of variation among watersheds in the “magnitude” of lake–stream differences.
Table 1. Watershed, habitat, and interaction effects on ecological and morphological variables in general linear models. For morphological traits correlated with body size (gill raker length and dorsal and pelvic spine lengths), results are for allometrically standardized trait sizes. Estimates of variance explained are the proportion of the total variance based on sums of squares (SS): SS effect/SS total.
[Table 1 column: Total var. explained. Rows: Proportion of limnetic prey; Relative warp 1 (footnote 1); Gill raker length; Gill raker number; Pelvic spine length; Dorsal spine length.]
Footnote 1: When including centroid size as a covariate, the full model explains 71% of the variation and all terms were significant. The variance explained by the habitat and watershed terms did not change, but variation explained by the watershed×habitat interaction was reduced to 2.7%. In addition, centroid size explained 11% of the variation and the centroid size by watershed interaction explained 1% of the variation.
Although the diet and trophic position data are easily interpretable in the context of our earlier expectations, the δ13C data require further explanation. In particular, previous studies within lakes documented a lower δ13C for limnetic than benthic environments (France 1995; Bolnick et al. 2008; Snowberg and Bolnick 2008; Matthews et al. 2010), whereas we documented higher δ13C in lakes (more limnetic) as compared to streams (more benthic). The likely reason for this difference is that water flow increases the availability of CO2 to benthic algae, which may increase fractionation of 13CO2 (Calder and Parker 1973; Pardue et al. 1976; Finlay et al. 1999). In short, the direction of δ13C divergence in our study could be an effect of water velocity (Finlay et al. 1999) and/or food type (Bolnick et al. 2008), but we are unable to disentangle these alternatives at present. Thus, although we report results for δ13C, we do not make much of their interpretation.
Significant watershed, habitat, and watershed-by-habitat interactions were present in the multivariate analyses (Table 2). For the analysis combining univariate traits, the three predictor variables (watershed, habitat, and interaction) explained roughly similar amounts of variation. For the analysis using geometric morphometric body shape variables, the watershed term explained about twice as much of the variation as did the habitat term and the watershed-by-habitat interaction. Overall, much more of the total variation could be explained for geometric morphometric body shape than for the univariate traits. We next describe the results trait-by-trait, given that different traits had different expectations.
Table 2. Results of multivariate analysis of covariance (MANCOVA) for linear measurements (gill raker number and size, spine lengths, and plate number; size standardized as needed) and total body shape (all relative warps and uniform components combined). Partial variance explained is $\eta^2 = \mathrm{SS}_{\text{effect}} / (\mathrm{SS}_{\text{effect}} + \mathrm{SS}_{\text{error}}) = 1 - \Lambda$, where $\Lambda$ is Wilks' lambda, as appropriate for multivariate analyses (Langerhans and DeWitt 2004). Partial variance explained can sum to greater than 100%. In a MANCOVA, $\eta^2$ is also the effect size.
[Table 2 column: Partial variance explained (%). Surviving row labels: Body (centroid) size; Total body shape; Body (centroid) size.]
The first relative warp for body shape (RelW1) explained 27.9% of total body shape variance, and higher scores were associated with shallower bodies along the entire length of the body (Fig. 4). This relative warp was very similar to that extracted in other studies of lake–stream stickleback (e.g., Sharpe et al. 2008; Berner et al. 2009, 2010b; 2011). The other relative warps explained considerably less of the variation (RelW2 = 15.5%, RelW3 = 11.7%, and RelW4 = 9.3%) and did not have clear functional interpretations. These relative warps are not analyzed further. RelW1 differed strongly among watersheds and between lake and stream habitats, with a significant interaction (Table 1). The full model explained 61% of the variance, and habitat was by far the most important effect (47% of the total variance). Specifically, lake fish have shallower bodies than do stream fish across and within all watersheds (Table 3; Figs. 4 and 5). The small interaction effect (4%) is the result of differences among systems only in the magnitude of lake–stream divergence. For instance, divergence in body depth was greatest in the Pye system and least in the Boot system (Table 3; Fig. 5).
Table 3. Pst for morphological traits and Est for ecological variables between lake and stream populations within each watershed. Negative signs indicate divergence opposite to the most common direction between lakes and streams.
[Table 3 rows: Proportion limnetic prey; Relative warp 1; Gill raker length; Gill raker number; Pelvic spine length; Dorsal spine length.]
Gill raker traits differed among watersheds and between lake and stream habitats, with a significant interaction in each case (Tables 1 and 3; Fig. 5). These terms explained much more of the total variation for gill raker number than for gill raker length, possibly because the latter tends to be one of the most variable traits within populations (Berner et al. 2008, 2010a). On average, lake fish had more and longer gill rakers than did stream fish, with the direction of divergence always being the same for gill raker number but not for gill raker length (Table 3; Fig. 5). Thus, for gill raker number, parallelism in lake–stream divergence was high and the interaction term was the result of variation in the magnitude of lake–stream divergence (highest for Pye, Village, and Boot; Table 3; Fig. 5). For gill raker length, however, parallelism was very weak, and the significant interaction term reflected differences among watersheds in both the magnitude and direction of divergence (Table 3; Fig. 5).
The three armor traits differed among watersheds, but consistent differences were not evident between lake and stream habitats, and the watershed-by-habitat interaction was significant in each case (Table 1). Not much of the total variation could be explained by these terms for pelvic or dorsal spine lengths, but 40% could be explained for lateral plate number (Table 1). The importance of watershed and the interaction were roughly equivalent for each of the traits. Parallelism was thus very low for armor traits, with the direction of lake–stream divergence frequently differing among watersheds (Table 3; Fig. 5).
The magnitude (here ignoring the direction) of phenotypic divergence can be qualitatively compared among traits and contrasts (lake-lake, lake-stream, stream-stream) by reference to average pairwise Pst values (Table 3 and Fig. 6). First, gill raker number and RelW1 showed the greatest lake–stream divergence: 2.8 times higher than lake–lake and 5.3 times higher than stream–stream divergence. Second, armor traits showed relatively high lake–lake divergence, whereas gill raker length showed similarly low divergence in all three contrasts. Third, stream–stream divergence for each trait was lower than lake–lake or lake–stream divergence. Overall, then, morphological traits that show the highest level of parallelism in lake–stream divergence (gill raker number and RelW1, as described above) also showed the greatest overall magnitude of lake–stream divergence.
Average lake–stream Fst at the six “neutral” loci ranged from 0.045 (Roberts) to 0.192 (Beaver), with an average of 0.108 (Table 4). By contrast, average lake–stream Fst at the “selected” loci identified by our scan (Stn45, Stn168, Stn232, Stn246, Stn321, Stn386) ranged from 0.071 (Village Bay) to 0.395 (Boot), with an average of 0.218 (Table 4). The six neutral markers only rarely showed evidence of departures from Hardy–Weinberg equilibrium: nine of 76 possible tests were individually significant, involving four different loci in six different populations. Similarly, only 17 of 180 tests for linkage disequilibrium were individually significant, and these were not associated with particular locus pairs or populations. By contrast, 29 of 76 possible tests for departures from Hardy–Weinberg equilibrium were individually significant for the selected loci, further supporting the ongoing action of selection. In particular, Stn321 and Stn246 showed strong departures from Hardy–Weinberg equilibrium in eight and nine populations, respectively. Similar to neutral loci, however, only 18 of 180 tests for linkage disequilibrium were individually significant for the selected loci, which is expected given that these loci are on different linkage groups (Appendix S3).
Table 4. Fst for each “selected” locus (see text) between lake and stream populations within each watershed. Average values and 95% confidence intervals (CI) are shown for both selected and neutral loci. Also indicated are Fst values that are higher (in bold) or lower (in italics) than the 95% CI from Fst values based on neutral loci.
95% CI: 0.059–0.181; 0.083–0.331; 0.029–0.063; 0.041–0.098; 0.018–0.084; 0.080–0.302
Comparisons of Fst at selected loci to the distribution of Fst at neutral loci (Table 4) yielded several key observations. First, two of the selected loci exhibited high Fst in most watersheds: Stn321 in all six watersheds and Stn168 in five watersheds. The other selected loci appeared to be under divergent selection in only two or three watersheds each. Second, some watersheds (especially Pye) showed evidence of divergent selection at multiple loci, whereas others showed evidence of divergent selection at only one or two loci (i.e., Village Bay or Beaver). Note, however, that this approach to confirming divergent selection is limited in situations where neutral-locus Fst values are already very high (e.g., Beaver).
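The comparison described above (flagging a locus whose Fst falls above or below the 95% CI from neutral loci) can be sketched as follows. How the neutral CI was actually computed is not stated in this section, so the percentile bootstrap over loci used here is an assumption, as are the function names and all Fst values, which are hypothetical.

```python
import random
from statistics import mean


def neutral_ci(neutral_fst, n_boot=10000, seed=1):
    """Percentile bootstrap 95% CI for mean Fst across neutral loci."""
    rng = random.Random(seed)
    k = len(neutral_fst)
    boot_means = sorted(
        mean(rng.choices(neutral_fst, k=k)) for _ in range(n_boot)
    )
    lower = boot_means[int(0.025 * n_boot)]
    upper = boot_means[int(0.975 * n_boot) - 1]
    return lower, upper


def classify_locus(fst, ci):
    """Label a locus relative to the neutral interval."""
    lower, upper = ci
    if fst > upper:
        return "higher"
    if fst < lower:
        return "lower"
    return "within"


# Hypothetical Fst values for six neutral loci in one watershed.
neutral = [0.03, 0.05, 0.04, 0.06, 0.05, 0.04]
ci = neutral_ci(neutral)
for name, fst in {"locus_A": 0.30, "locus_B": 0.045, "locus_C": 0.01}.items():
    print(name, classify_locus(fst, ci))
```

The sketch also makes the caveat in the text concrete: when the neutral values are themselves high and variable, the upper bound rises and fewer loci can be labeled "higher".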
As noted earlier, the correlations identified here do not represent hypothesis testing but rather hypothesis generation. Five “ecology–morphology” correlations were identified (Table 5; Appendix S4). First, stickleback from watersheds with greater lake–stream divergence in diet (PLP) also showed greater lake–stream divergence in body depth (RelW1) and gill raker number. Second, stickleback from watersheds with greater lake–stream divergence in δ13C showed lower divergence in body depth (RelW1), although this is difficult to interpret in light of the previously mentioned effects of stream flow on δ13C values. Finally, stickleback from watersheds with greater lake–stream divergence in trophic position showed greater divergence in gill raker number and lateral plate number. Three “morphology–genetic” correlations were identified. Two of these correlations involved selected loci. First, stickleback from watersheds with greater lake–stream divergence in Stn321 showed greater divergence in gill raker number. Second, stickleback from watersheds with greater lake–stream divergence in Stn45 showed greater divergence in body depth (RelW1). Four “ecology–genetic” correlations were identified, and one of these involved a selected locus. Specifically, stickleback from watersheds with greater lake–stream divergence in diet (PLP) had greater divergence in Stn45.
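For readers who want to see the shape of such an across-watershed correlation, the sketch below computes signed lake-minus-stream divergence for two variables in each of six watersheds and then the Pearson correlation between the two sets of divergences. It is only an illustration: all values are hypothetical, the helper names are invented here, and with six data points any coefficient should be read as hypothesis generating, as noted above.

```python
from math import sqrt
from statistics import mean


def signed_divergence(lake_means, stream_means):
    """Lake minus stream mean per watershed, keeping the sign (direction)."""
    return [lake - stream for lake, stream in zip(lake_means, stream_means)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-watershed means (six watersheds) for one ecological
# variable and one morphological trait.
diet_lake = [0.8, 0.7, 0.9, 0.6, 0.5, 0.7]
diet_stream = [0.2, 0.4, 0.1, 0.5, 0.4, 0.3]
trait_lake = [22.0, 21.0, 23.0, 20.5, 20.0, 21.5]
trait_stream = [19.0, 19.5, 18.5, 20.0, 19.8, 19.4]

diet_div = signed_divergence(diet_lake, diet_stream)
trait_div = signed_divergence(trait_lake, trait_stream)
print(round(pearson_r(diet_div, trait_div), 3))
```

Keeping the sign of each divergence, rather than its absolute value, means that a watershed diverging in the unexpected direction lowers the correlation instead of being treated as equivalent to one diverging in the expected direction.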
Table 5. The strongest observed correlations between ecological, morphological, and genetic variables. The coefficients calculated take into account the direction of lake–stream divergence in each system, as indicated in Table 3. Locus names with an asterisk are expected to be neutral.
Pairs of variables
Ecology vs. Morphology
Proportion limnetic prey
Relative warp 1
Gill raker number
Relative warp 1
Proportion limnetic prey
Gill raker length
Morphology vs. Genetics
Relative warp 1
Gill raker number
Ecology vs. Genetics
Proportion of limnetic prey
We quantified the extent to which divergence between lake and stream stickleback within watersheds was parallel (or convergent) versus nonparallel (or nonconvergent) across those watersheds. Ecological variables related to foraging (diet and stable isotopes) generally showed a reasonable degree of parallelism, with more benthic diets in streams than in lakes: 21–28% of the total diet variation among all individuals could be explained by the lake–stream contrast. Morphological traits showed varying degrees of parallelism, ranging from very low (∼0% by the above index for armor traits) to very high (47% for body depth, RelW1), and some of the nonparallelism could be explained by ecological variables. (Although these patterns were based on wild-caught fish, previous work suggests they have a strong genetic basis—see Methods.) Finally, we found varying degrees of similarity in lake–stream divergence at genetic markers, some of which was associated with divergence in morphological traits.
Several lines of evidence indicate that lake–stream divergence in body depth and gill raker number is strongly parallel. First, lake–stream divergence was in the same direction in all watersheds (Table 3 and Fig. 5). Second, the habitat term explained a substantial amount of the total variation among all individuals: 47% for body depth and 30% for gill raker number. Third, the habitat term was considerably more important than the watershed or interaction terms (Table 1). The strong parallelism shown by these traits is consistent with previous work on lake and stream populations, and is thought to reflect natural selection related to foraging mode and diet (Reimchen et al. 1985; Lavin and McPhail 1993; Walker 1997; Hendry and Taylor 2004; Berner et al. 2008, 2009; Aguirre 2009). Our analysis further supports this interpretation because lake–stream divergence in both body depth and gill raker number was positively associated with lake–stream divergence in diet as assayed by stomach contents and stable isotopes (Tables 3 and 5).
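The idea of asking how much of the total variation each model term accounts for can be illustrated with a small two-way partition. The sketch below assumes a balanced watershed by habitat design and reports each term's share of the total sum of squares; whether the percentages in Table 1 were obtained in exactly this way is not stated here, so this is an illustration only, and the function name and data are hypothetical.

```python
from statistics import mean


def variance_explained(data):
    """data[w][h] is a list of trait values for watershed w, habitat h.

    Assumes a balanced design (same n in every cell). Returns the share of
    the total sum of squares attributed to each term.
    """
    n_w, n_h = len(data), len(data[0])
    n = len(data[0][0])
    values = [x for row in data for cell in row for x in cell]
    grand = mean(values)

    w_means = [mean(x for cell in row for x in cell) for row in data]
    h_means = [mean(x for row in data for x in row[h]) for h in range(n_h)]

    ss_total = sum((x - grand) ** 2 for x in values)
    ss_w = n_h * n * sum((m - grand) ** 2 for m in w_means)
    ss_h = n_w * n * sum((m - grand) ** 2 for m in h_means)
    ss_cells = n * sum((mean(cell) - grand) ** 2 for row in data for cell in row)
    ss_inter = ss_cells - ss_w - ss_h
    ss_resid = ss_total - ss_cells

    return {
        "watershed": ss_w / ss_total,
        "habitat": ss_h / ss_total,
        "interaction": ss_inter / ss_total,
        "residual": ss_resid / ss_total,
    }


# Hypothetical data: two watersheds x (lake, stream) x two fish per cell.
example = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[1.0, 3.0], [5.0, 7.0]],
]
shares = variance_explained(example)
print({term: round(share, 2) for term, share in shares.items()})
```

In this toy case all of the explained variation falls on the habitat term, which is the pattern expected under perfect parallelism; a large interaction share would instead indicate that the magnitude or direction of lake–stream divergence differs among watersheds.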
Despite generally high parallelism for these two traits, some nonparallelism was still present and calls for explanation. In particular, both the watershed and interaction terms were highly significant (4–11% of the total variation), and at least 39% of the variation remains unexplained (Table 1). The interaction term reflects differences among watersheds in the extent to which stream fish have fewer gill rakers and deeper bodies than do lake fish, and these differences are likely the result of differences in the strength of divergent selection related to diet. Supporting this contention, lake–stream divergence in body depth was closely associated with lake–stream divergence in the proportion of limnetic prey (r = 0.922) and lake–stream divergence in gill raker number was closely associated with lake–stream divergence in trophic position (r = 0.810). These associations match those reported in other studies of lake–stream divergence in stickleback (Berner et al. 2008), benthic–limnetic divergence in stickleback (Schluter and McPhail 1992), and benthic–limnetic divergence in other fishes (Landry and Bernatchez 2010). As for the unexplained variation, this likely arises because both habitats, but particularly lakes, have both limnetic and benthic resources, which maintains variation in foraging traits through disruptive selection and individual specialization (Berner et al. 2008; Snowberg and Bolnick 2008; Bolnick and Paull 2009; Berner et al. 2010a). In short, the nonparallelism for generally parallel traits can be explained by variation in the ecological factors thought to influence divergent selection.
Although we have not yet deployed the full power of genomic tools available for stickleback, our initial scan suggests some interesting patterns. Specifically, some marker loci showed high divergence in multiple lake–stream pairs, whereas others showed high divergence in only a few pairs (Table 4). In the first case, Stn321 was considerably more divergent than neutral loci in all six lake–stream pairs, and divergence at this locus has also been recorded for another lake–stream pair (Bolnick et al. 2009). In addition, Stn168 was highly divergent in five watersheds. In the second case, shared patterns of genetic divergence were less evident for Stn246 (three watersheds) and Stn232, Stn45, and Stn386 (two watersheds each). These results suggest that divergent selection between lakes and streams drives divergence in some similar and some different genetic regions. Note, also, that our analysis targeted only loci showing signs of high lake–stream divergence in at least two watersheds (see Methods).
Some of the above genetic markers are likely associated with the traits we measured because (1) some are linked to QTL identified in other studies (Appendix S3), and (2) some showed levels of lake–stream divergence that were associated with lake–stream divergence in morphological traits (Table 5). Most notably, Stn45 has been associated with QTL for body shape (Albert et al. 2008), and its divergence here appeared to be positively correlated with divergence in body depth (r = 0.830) and diet (r = 0.718). (The latter correlation was probably indirect, reflecting the association of diet with body depth and body depth with the marker.) In addition, Stn321 has been associated with QTL for lateral plate number and body shape (Colosimo et al. 2004; Albert et al. 2008), and its divergence here appeared to be positively correlated with divergence in gill raker number (r = 0.782). Whether this last result reflects a novel genetic association, pleiotropy, or simply coincidence remains to be seen, but it does suggest the value of rapid genetic screens such as ours for suggesting novel associations between genotype and phenotype. Our results also highlight more generally the fact that markers identified in linkage mapping studies of particular population pairs are sometimes associated with divergence between population pairs in other contexts (see also Hohenlohe et al. 2010). That the same QTL appear to be important in multiple environmental contrasts suggests that adaptation at least sometimes draws on a shared genetic tool kit.
Larger scale genomic studies are required to comprehensively assess associations between genetic and phenotypic divergence in the lake–stream stickleback system. Any such associations could arise in two ways: (1) differences in genetic architecture could cause phenotypic divergence to deviate from that expected solely under divergent selection; or (2) the strength of divergent selection jointly determines the extent to which both phenotypic and genetic divergence proceed. The first (bias) possibility invokes the “genetic explanation for phenotypic nonparallelism or nonconvergence” mentioned in the Introduction. The second (no bias) possibility is that phenotypic divergence proceeds as dictated by selection, and genetic divergence occurs to the corresponding extent. We cannot conclusively separate these two alternatives, but suggest the second is more likely given that morphological divergence was so predictable based on divergence in the relevant ecological variables.
Lake–stream divergence in armor traits and gill raker length showed very low parallelism: the habitat term explained essentially none of the variation (Table 1) and the direction of lake–stream divergence differed among watersheds (Fig. 5). For armor traits, differences among populations generally arise owing to differences in selective factors such as predation (Hagen and Gilbertson 1972; Gross 1977; Reimchen 1994; Marchinko and Schluter 2007; Marchinko 2009) and ionic concentration (Giles 1983; Reimchen and Nosil 2006). Presumably reflecting the same factors, we too found strong among-population variation in armor traits. This variation was sometimes manifest as divergence between lake and stream populations within a watershed—but not in a consistent direction or magnitude across watersheds. This result confirms and extends the previous observation of Hendry and Taylor (2004) that “relative to stream fish, lake fish have shorter pelvic spines and more lateral plates in the Mayer and Drizzle watersheds (Moodie 1972; Reimchen et al. 1985) but longer pelvic spines and fewer lateral plates in the Misty watershed (Lavin and McPhail 1993; Hendry et al. 2002).” We suggest that selection from predators does sometimes differ between lakes and streams—just not in a consistent fashion. In short, strong trait parallelism is only expected when selective regimes are very similar.
To test the above hypothesis, we would ideally quantify the difference in predation between lakes and streams and relate this to differences in armor traits. This test is not yet possible, owing to the difficulty of accurately assessing predator-based selection in nature. Interestingly, however, lake–stream divergence in one armor trait (lateral plates) appeared positively correlated with lake–stream divergence in trophic position (r = 0.936). Perhaps this association arose because trophic position reflects ecological niche (e.g., benthic vs. limnetic) and different ecological niches cause different exposure to predators. For instance, Reimchen et al. (2008) documented associations between trophic position and plate number within lakes. Another possibility is that the different foraging environments require different aspects of swimming performance (Hendry et al. 2011), which are influenced by plate number (Bergstrom 2002; Hendry et al. 2011).
In contrast to body depth and gill raker number, we found no suggestive associations between lake–stream divergence in armor traits and lake–stream divergence in putatively selected genetic markers: that is, markers showing lake–stream divergence in multiple watersheds (see Methods). We did, however, find a suggestive but puzzling negative association between lake–stream divergence at a putatively neutral marker (Stn199) and lake–stream divergence in lateral plate number (r = −0.791). Perhaps this is just a spurious association resulting from the large number of tests, or perhaps Stn199 is linked to an undiscovered QTL. Additional work will be needed to distinguish these possibilities. As discussed above for parallel traits, any such association would likely reflect the action of selection rather than the “genetic explanation for nonparallelism or nonconvergence,” because divergence in both lateral plates and Stn199 was associated with ecological divergence (trophic position; Table 5).
Parallelism (or convergence) was strong for body depth and gill raker number: 47% and 30% of the total phenotypic variation among all individuals could be explained by the simple lake–stream habitat classification (Table 1), and divergence was always in the same direction. These patterns are likely genetic (see Methods), which implies that selection has played a predominant role in shaping the morphological patterns. We can therefore conclude that when selection is strong and consistently divergent, the relevant traits diverge in a consistent and predictable fashion. Most studies stop with the above positive assertion, but it is just as important to understand aspects of nonparallelism. First, parallelism was not overwhelming even for the above traits. Instead, understanding the degree of lake–stream divergence required quantitative measures of relevant ecological variables: watersheds with greater lake–stream divergence in diets showed greater lake–stream divergence in body depth and gill raker number. Second, armor traits and gill raker length did not diverge consistently between lake and stream environments, and this variation could not be explained by the measured ecological variables. These traits are certainly sometimes under divergent selection, just not consistently so between lakes and streams. We can therefore infer that divergent selection often involves multiple ecological variables that do not always map easily onto a simple habitat contrast, such as lake versus stream, benthic versus limnetic, or high predation versus low predation.
We also provided an initial assessment of the extent to which lake–stream genetic divergence was similar across watersheds, and the extent to which genetic and morphological divergence were correlated. We found some instances of the same marker diverging in multiple watersheds, particularly for markers putatively associated with QTL for the traits we measured. We also found some markers that diverged strongly in only some watersheds, suggesting that unique aspects of genetic divergence are involved in adaptive divergence in different watersheds. These conclusions are preliminary, however, owing to the need for a denser array of markers and a more objective screen for genomic divergence. We hope that our analysis, along with other recent studies (e.g., Berner et al. 2008; Landry and Bernatchez 2010; Rosenblum and Harmon 2011; Ożgo 2011), reinforces the value of quantifying the degree of trait parallelism or convergence and nonparallelism or nonconvergence, and then exploring ecological and genetic correlates thereof.
Associate Editor: U. Candolin
We are grateful to J.S. Moore for his help in the field and Western Forest Products for accommodation in Port McNeil. We also thank S. Barrette, C. Macnaugthon, and S. Muttalib for their help in morphological measurements, and the entire Peichel lab for their help with the genetic analyses. D. Berner had many insightful comments on the manuscript and gave permission to use his stickleback drawings. Additional improvements were made based on comments by M. Bell and five other referees. This work was financially supported by an FQRNT postdoctoral fellowship (RK), an NIH grant P50 HG002568 (CLP), a David and Lucille Packard Foundation fellowship (DIB), the Howard Hughes Medical Institute (DIB), and NSERC (APH).