GLOBAL ANALYSIS OF GENES INVOLVED IN FRESHWATER ADAPTATION IN THREESPINE STICKLEBACKS (GASTEROSTEUS ACULEATUS)


Abstract

Examples of parallel evolution of phenotypic traits have been repeatedly demonstrated in threespine sticklebacks (Gasterosteus aculeatus) across their global distribution. Using these as a model, we performed a targeted genome scan, focusing on physiologically important genes potentially related to freshwater adaptation, to identify genetic signatures of parallel physiological evolution on a global scale. To this end, 50 microsatellite loci, including 26 loci within or close to (<6 kb) physiologically important genes, were screened in paired marine and freshwater populations from six locations across the Northern Hemisphere. Signatures of directional selection were detected in 24 loci, including 17 physiologically important genes, in at least one location. Although no loci showed consistent signatures of selection in all divergent population pairs, several outliers were common in multiple locations. In particular, seven physiologically important genes, as well as the reference ectodysplasin gene (EDA), showed signatures of selection in three or more locations. Hence, although these results give some evidence for consistent parallel molecular evolution in response to freshwater colonization, they suggest that different evolutionary pathways may underlie physiological adaptation to freshwater habitats within the global distribution of the threespine stickleback.

Understanding the genetic processes involved in adaptation has challenged biologists for decades (Storz 2005). Determining whether adaptations have arisen from standing variation or new mutations, and whether single major genes of large effect or several minor genes of small effect shape adaptive pathways, has become the focus of much evolutionary genetic work (Orr 2005; Mackay et al. 2009). Numerous studies have addressed these questions with investigations of parallel evolution (that is, the independent development of similar phenotypic traits in different lineages exposed to similar environmental conditions) and have begun to elucidate the genetic underpinnings of adaptive responses in several organisms (Baxter et al. 2008; Feldman et al. 2009; Gross et al. 2009). The threespine stickleback (Gasterosteus aculeatus) has proven to be an attractive candidate for studies of this nature, as ancestral populations of marine sticklebacks independently invaded and colonized freshwater habitats across the Northern Hemisphere (Bell and Foster 1994). These postglacial colonizations have been accompanied by obvious parallel changes in morphological characteristics (e.g., Colosimo et al. 2005; Miller et al. 2007). The availability of genomic resources for this species is now allowing a closer examination of the molecular mechanisms that underlie these changes (Kingsley and Peichel 2007). For instance, a recent study demonstrated that selection on standing genetic variation at the ectodysplasin gene (EDA) is responsible for the parallel evolution of armor plate reduction (Colosimo et al. 2005). Similarly, repeated regulatory mutations at the Kit ligand (Kitlg) gene are suggested to be responsible for the parallel evolution of changes in skin and gill pigmentation in several North American freshwater populations (Miller et al. 2007). Furthermore, independent evolution of pelvic reduction is caused by the pituitary homeobox 1 gene (PITX1), not only in threespine sticklebacks but also in other species of sticklebacks and distantly related vertebrate lineages (Shapiro et al. 2006).

Alternatively, similar morphological or physiological systems can evolve in independent lineages through the interaction of several genes, each of small effect (Wainwright et al. 2005). Quite often, these genes are individual parts of developmental networks (Arendt and Reznick 2008). As such, similar adaptive changes may be achieved through selection on different genes and/or different genetic pathways. Theoretical and experimental studies continue to uncover examples of beneficial mutations of both small and large effect (Orr 2005). Despite these advances, many of the molecular mechanisms driving adaptation remain poorly understood, and descriptions of patterns of genomic variation continue to provide important information on the genes responsible for such changes.

Hitchhiking mapping has become a popular tool to identify genes and genomic regions involved in adaptive population differentiation (Nosil et al. 2009). By screening large numbers of markers across the genome, those “outlier” loci that likely underlie selected traits can be identified (Storz 2005). Traditionally, studies have used neutral or random markers that fall within noncoding regions of the genome (Bonin 2008) and therefore tend to uncover low levels of differentiation (Schlötterer 2002). Some studies have used phenotypic information to identify signatures of selection in genes that underlie quantitative traits (QTLs; Cano et al. 2006; Raeymaekers et al. 2007). An alternative approach is the targeted genome scan, which screens functionally important genes and is starting to demonstrate an increased efficiency in detecting genes involved in local adaptation (e.g., Shikano et al. 2010; Shimada et al. 2011). A recent study by Shimada et al. (2011) applied the targeted genome scan approach to identify candidate genes involved in physiological freshwater adaptation of sticklebacks by screening 157 genes with known physiological functions. Although this strategy of screening functionally important genes uncovered a high proportion of signatures of directional selection (Shimada et al. 2011), inferences of adaptive divergence were nonetheless restricted to a regional geographic area where all populations belong to the same evolutionary lineage (Mäkinen and Merilä 2008). Hohenlohe et al. (2010) also identified several candidate genes involved in adaptive evolution using next-generation sequencing technology; however, these results were similarly confined to a small geographic scale. Identifying parallel patterns of selection on genes at a broader spatial scale provides the opportunity to further explore the evolutionary consistency and distribution of adaptive divergence at the molecular level, and allows for comparisons of local adaptation on various hierarchical levels.

In this study, we screened 26 microsatellite loci for physiologically important genes and 24 reference genomic regions in paired marine and freshwater populations from six locations across the global distribution of threespine sticklebacks—encompassing both divergent Atlantic and Pacific stickleback clades (Orti et al. 1994)—to explore genes involved in freshwater colonization on a global scale.

Materials and Methods

Paired marine and freshwater populations from six locations across the Northern Hemisphere, encompassing both divergent Atlantic and Pacific stickleback clades (Orti et al. 1994), were used in this study (Table 1; Fig. 1). Populations were chosen to reflect independent replicates of freshwater colonization events in all major oceans within the stickleback distribution range. Twenty-six microsatellite markers within or close to (<6 kb) physiologically important genes, previously indicated to be under directional selection in European populations (Shimada et al. 2011; Table S1), were screened using two outlier analyses (see below). Half of these markers were associated with genes with osmoregulatory functions, previously identified through extensive fish physiology literature searches (Shimada et al. 2011) and assumed to play important roles in freshwater adaptation. The remaining gene-based markers were associated with genes involved in growth, thermal response, and other functions potentially associated with freshwater colonization (Shimada et al. 2011). In addition, a QTL-based marker for EDA (Stn365; Colosimo et al. 2005) was screened as a positive control in outlier analyses (cf. Mäkinen et al. 2008a). Because outlier tests identify putatively neutral and selected loci based on the distributions of genetic parameters in the set of markers used, outlier analyses with the gene-based markers alone can overlook footprints of selection. Therefore, 24 reference microsatellite markers distantly located from any gene (>10 kb; Table S1) were also included in the outlier analyses. In total, 50 microsatellite markers were genotyped for 24 individuals per population.

Table 1.  Information on sampling sites and basic genetic parameters estimated from 48 loci.
Code | Location | Population | Habitat | Ecotype | Coordinates | n | AR | HE | FIS | FST
NA1 | Pacific North America | Little Campbell | Marine | Anadromous/marine | 49°01′N, 122°47′W | 24 | 10.1 | 0.719 | 0.064 | 0.246
NA1 | British Columbia, Canada | Misty Lake | Lake | Freshwater | 50°36′N, 127°28′W | 24 | 6.0 | 0.561 | 0.042 |
NA2 | Atlantic North America | Penescot Bay | Marine | Anadromous/marine | 44°22′N, 68°54′W | 24 | 8.1 | 0.641 | 0.098 | 0.040
NA2 | Quebec, Canada | St. Lawrence River | River | Freshwater | 46°40′N, 71°51′W | 24 | 7.5 | 0.613 | 0.059 |
EU1 | North Sea | Island of Texel | Marine | Anadromous/marine | 53°05′N, 04°50′E | 24 | 8.3 | 0.641 | 0.088 | 0.112
EU1 | Germany | Hunte, Weser | River | Freshwater | 52°53′N, 08°26′E | 24 | 5.7 | 0.599 | 0.051 |
EU2 | Atlantic Western Europe | Orrevannet | Marine | Anadromous/marine | 58°44′N, 05°31′E | 24 | 8.1 | 0.635 | 0.042 | 0.157
EU2 | Norway | Myrdalsvannet | Lake | Freshwater | 60°19′N, 05°22′E | 24 | 4.0 | 0.513 | 0.005 |
EU3 | Barents Sea | Barents | Marine | Marine/anadromous | 71°38′N, 29°47′E | 24 | 6.9 | 0.603 | 0.052 | 0.125
EU3 | Finland | Pulmankjärvi | Lake | Freshwater | 69°58′N, 27°58′E | 24 | 6.4 | 0.586 | 0.064 |
JA1 | Pacific Japan | Shiomi | Marine | Anadromous/marine | 43°02′N, 144°51′E | 24 | 7.3 | 0.662 | 0.073 | 0.112
JA1 | Japan | Nishikitappu | River | Freshwater | 42°37′N, 141°29′E | 24 | 4.0 | 0.525 | 0.064 |
AR = allelic richness; HE = expected heterozygosity; FIS = fixation index. Divergence estimates (FST) between population pairs are indicated in the first row for each pair.

Figure 1. Map of sampling locations and population codes. Blue points indicate freshwater locations; black points indicate marine locations. Black segments in pie charts indicate gene-based markers under selection; gray segments indicate neutral gene-based markers; red segments indicate reference markers under selection; pink segments indicate neutral reference markers.

All molecular work was performed following Shimada et al. (2011). Basic genetic parameters (heterozygosity, FIS and FST) were calculated using FSTAT 2.9.3.2 (Goudet 2001). Pairwise population differentiation was estimated with Weir and Cockerham's θ (Weir and Cockerham 1984) using 10,000 permutations. Allelic richness was determined using HP-RARE 1.1 (Kalinowski et al. 2005) based on a sample size of 24 individuals. Outlier tests were performed in marine and freshwater populations from each location in a pairwise fashion. Loci affected by directional selection are expected to show higher interpopulation differentiation and lower intrapopulation variability than neutral loci (Storz 2005); therefore, identification of putative loci under directional selection was achieved with the FST- (DETSEL; Vitalis et al. 2003) and heterozygosity-based (ln RH; Kauer et al. 2003) methods designed for pairwise comparisons. These tests were performed following Shimada et al. (2011).
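As an illustration of the heterozygosity-based test, the following minimal Python sketch computes ln RH for each locus from the expected heterozygosities of the two populations in a pair and then standardizes the values across loci. It assumes the form of the statistic commonly attributed to Kauer et al. (2003), ln RH = ln{[(1/(1 − H1))^2 − 1] / [(1/(1 − H2))^2 − 1]}. The heterozygosity values, the function names, and the ±1.96 cutoff are hypothetical and are not taken from this study or from the software it used.

```python
import math
from statistics import mean, stdev


def ln_rh(h_pop1, h_pop2):
    """ln RH for one locus; both heterozygosities must lie strictly between 0 and 1."""
    def theta_term(h):
        # Stepwise-mutation-model term (1 / (1 - H))^2 - 1
        return (1.0 / (1.0 - h)) ** 2 - 1.0
    return math.log(theta_term(h_pop1) / theta_term(h_pop2))


def standardize(values):
    """Center by the mean and scale by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical (freshwater, marine) expected heterozygosities for ten loci.
het_pairs = [
    (0.60, 0.65), (0.55, 0.58), (0.10, 0.70), (0.62, 0.60), (0.58, 0.63),
    (0.70, 0.72), (0.50, 0.52), (0.66, 0.64), (0.45, 0.50), (0.61, 0.66),
]

raw = [ln_rh(fw, mar) for fw, mar in het_pairs]
z = standardize(raw)

# Assumed cutoff: loci outside +/-1.96 standardized units are flagged.
outliers = [i for i, value in enumerate(z) if abs(value) > 1.96]
print(outliers)
```

In this toy data set only the third locus, whose freshwater heterozygosity is strongly reduced, is flagged. Because the standardization uses the mean and standard deviation of the marker set itself, the result depends on which loci are included, which is why reference markers matter for such tests.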

Signatures of selection for Stn365 (a locus associated with EDA; Colosimo et al. 2005) were expected to be detected in four locations (NA1 and EU1–3) where freshwater populations had low or partially plated morphs, differentiated from their fully plated marine pairs. Because the remaining two freshwater populations (NA2 and JA1) were fully plated morphs, signatures of selection for this locus were not expected in these locations. Detection of directional selection at Stn365 in the expected populations was therefore used as a threshold for remaining tests. The observed ln RH values were standardized by the mean and the standard deviation of their respective population pair (Kauer et al. 2003). Following the convention of similar explorative genome scan studies (e.g., Vasemägi et al. 2005; Bonin et al. 2006; Oetjen et al. 2010), significance levels of outlier tests were not adjusted with Bonferroni corrections. Furthermore, because we tested multiple pairs of populations independently, we relied on the criterion of “repeated outliers,” which has been suggested to substantially decrease the likelihood of false positives due to type I error (Bonin et al. 2006; Egan et al. 2008; Nosil et al. 2008). Briefly, the probability of a locus being falsely detected is substantially reduced when it is detected repeatedly in independent pairwise population tests (Nosil et al. 2008).
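The logic of the repeated-outlier criterion can be made concrete with a short calculation. The sketch below is an illustration only: it assumes that the six pairwise tests are fully independent and that a neutral locus has a 5% chance of being flagged in any one pair, neither of which is established for these data, and the function name is hypothetical.

```python
from math import comb


def prob_at_least(k_min, n_tests, alpha):
    """P(X >= k_min) for X ~ Binomial(n_tests, alpha)."""
    return sum(
        comb(n_tests, k) * alpha**k * (1 - alpha) ** (n_tests - k)
        for k in range(k_min, n_tests + 1)
    )


# Chance that a neutral locus is flagged in at least one of six pairs.
p_once = prob_at_least(1, 6, 0.05)

# Chance that the same neutral locus is flagged in three or more pairs.
p_repeated = prob_at_least(3, 6, 0.05)

print(round(p_once, 4), round(p_repeated, 5))
```

Under these assumptions a neutral locus is flagged at least once in roughly a quarter of cases, but in three or more pairs in only about 0.2% of cases. Shared ancestry among populations or the use of two tests per pair would change these figures.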

Results

Across the 50 loci, the number of alleles ranged from 2 to 73 and the expected heterozygosity from 0.004 to 0.918 (Table S1). Deviations from Hardy–Weinberg equilibrium were detected in the loci CLCN4 and MKP1a in some populations after Bonferroni corrections, suggesting the presence of nonamplifying alleles. Hence, these two loci were excluded from further analyses. Allelic richness and expected heterozygosity at the remaining 48 loci were lower in freshwater populations (4.0–7.5 and 0.513–0.613) than in marine populations (6.9–10.1 and 0.603–0.719; Wilcoxon test, P = 0.027 and P = 0.028, respectively; Table 1). Average FST estimates between population pairs ranged from 0.040 (NA2) to 0.246 (NA1) among locations (Table 1).
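The marine versus freshwater comparison of expected heterozygosity can be retraced from the HE values in Table 1. The text states only that a Wilcoxon test was used; the sketch below assumes a paired signed-rank test across the six locations with a two-sided normal approximation and no continuity or tie correction, and the function name is hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Paired signed-rank test; returns (W+, two-sided P by normal approximation)."""
    # Round to avoid spurious float differences; drop zero differences.
    diffs = [round(a - b, 6) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, p_value


# Expected heterozygosity (HE) from Table 1, ordered NA1, NA2, EU1, EU2, EU3, JA1.
he_marine = [0.719, 0.641, 0.641, 0.635, 0.603, 0.662]
he_fresh = [0.561, 0.613, 0.599, 0.513, 0.586, 0.525]

w_plus, p_value = wilcoxon_signed_rank(he_marine, he_fresh)
print(w_plus, round(p_value, 3))
```

Marine heterozygosity exceeds freshwater heterozygosity in all six pairs, so the rank sum takes its maximum value, and the approximation gives a P-value close to the reported 0.028. With only six pairs an exact test cannot yield a two-sided P below 2/64, so the choice of approximation matters at this sample size.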

Outlier analyses with DETSEL and ln RH identified signatures of selection at 24 and 11 loci in at least one location, respectively (Table 2). Among the DETSEL results, 17 markers for physiologically important genes and seven for reference genomic regions (including Stn365) were detected as outliers. Eight markers for physiologically important genes and three for reference genomic regions were detected as outliers by the ln RH method. Signatures of selection for Stn365 were detected in the expected four locations (i.e., NA1 and EU1–3) with at least one of the outlier detection methods. The highest incidence of selection (10 loci) was found in NA1 and the lowest (five loci) in NA2 (Table 2). Although none of these loci were common to all locations, 13 loci, including 10 loci associated with physiologically important genes, were indicated to be under selection in multiple locations (Table 2). In particular, eight of these loci (seven physiologically important genes and Stn365) showed signatures of selection in three or four locations. Of the seven genes, five are involved in osmoregulation (AQP3, ATP1A1, Kir2.2, NCC, and RHOGTP8), one is involved in thermal response (FERH1), and one encodes the nest-building glue protein spiggin (SPG1; Table 2). Among the reference markers, only two (Stn73 and Stn254) showed signatures of selection in two locations (Table 2). There were no clear geographic (i.e., Atlantic, Pacific, or EU-specific) patterns of the incidence of selection in these eight outlier loci; however, two loci (AQP3 and Kir2.2) showed freshwater habitat-specific patterns of selection (lake and river, respectively). Signatures of selection within the nine genomic regions corresponding to the major peaks of parallel differentiation discovered by Hohenlohe et al. (2010) were compared between the two studies, along with those identified by Shimada et al. (2011; Table S2). Only three common regions were screened by all three studies. Signatures of selection were identified in two of these regions (linkage group I; 21543–21711 kb and linkage group IV; 19899–21038 kb) by all studies (Table S2). None of the remaining six genomic regions screened by Shimada et al. (2011) and Hohenlohe et al. (2010) contained common signatures of selection (Table S2).

Table 2.  List of loci showing signatures of selection in at least one location. *P < 0.05, **P < 0.01.
Physiological function | Locus | NA1 | NA2 | EU1 | EU2 | EU3 | JA1
DETSEL | ln RH | DETSEL | ln RH | DETSEL | ln RH | DETSEL | ln RH | DETSEL | ln RH | DETSEL | ln RH
Osmoregulation AQP3 *-----**-*---
Osmoregulation ATP1A1   - - -   - - -   - - -
Osmoregulation ATP2C1 **---------*-
Osmoregulation CASR - - - - - -    - - - -
Osmoregulation GTF2B ------*-----
Osmoregulation Kir2.2 - -   -   - - - - -   
Osmoregulation MKP8 --*-*-------
Osmoregulation NCC -     - - - -   - - -
Osmoregulation RHOGTP8 *---**-----*-
Growth MYOD   - - - - - - - - - - -
Growth NPYP ----------*-
Thermal response FERH1   -   - - -    - - - -
Thermal response HSP90B --**--*-----
Maturation ESR1 - - - - - -   - - - - -
Pigmentation DCT --------****--
Smelling TAAR - - - -    - - - - - -
Spiggin SPG1 **---****-----
Plate morph (reference) Stn365   - - -   -   - -   - -
Reference marker Gac4174PBBE ------*-----
Reference marker Stn3 - - - - - - - - - -   -
Reference marker Stn73 ----****--****--
Reference marker Stn132 - - - - - - - - - -   -
Reference marker Stn254 ***------*---
Reference marker Stn321 - - - - - - - -   - - -

Discussion

This study revealed that a large proportion of markers within or close to physiologically important genes, previously demonstrated to be under directional selection in European populations of threespine sticklebacks (Shimada et al. 2011), also showed signatures of selection in populations across their global distribution. Although an investigation of freshwater adaptation in Alaskan threespine stickleback populations (Hohenlohe et al. 2010) identified partly different sets of genes under selection, two genomic regions common to both studies showed signatures of selection (Table S2). Considering the diversity that exists between different freshwater habitats (Wootton 1984), a wide range of selective pressures can be expected to shape the evolution of adaptive genes in different environments. Conversely, similar selection pressures in different localities may lead to the evolution of different physiological pathways if the initial pool of genetic variation available for selection to act on in the founder populations (e.g., Arendt and Reznick 2008) differed among localities for historical/phylogenetic (cf. Orti et al. 1994) reasons. This may be particularly relevant in the context of complex physiological systems, which often involve many genes of small effect. Accordingly, within the large number of functionally important genes screened in this study, including those responsible for osmoregulation, growth, thermal response, and other functions potentially related to the process of freshwater colonization (Shimada et al. 2011), no gene showed a consistent signature of selection in all population pairs. This likely reflects the variability and complexity of adaptation to freshwater environments, and suggests that selective pressures and/or their primary targets involved in freshwater colonization are not homogeneous.

Nevertheless, our analysis of physiologically important genes demonstrated that several outliers were detected in multiple locations. Although these outliers were detected by both analytical methods used (DETSEL and ln RH), a higher incidence of selection was observed with DETSEL (Table 2). This discordance is likely attributable to the fact that, whereas the ln RH method solely measures reductions of diversity, DETSEL also takes into account the degree of differentiation among populations. Therefore, loci with few alleles showing a high degree of differentiation but little reduction in diversity would not be detected as outliers with the ln RH method. The loci detected as outliers with both methods in three or more population pairs (“repeated outliers”; Egan et al. 2008; Nosil et al. 2008) were strictly gene-based (33% of outliers), strongly suggesting that these loci are likely to play a role in freshwater colonization of threespine sticklebacks. Based on their biological functions (e.g., Evans et al. 2005; Cutler et al. 2007), five of these genes (AQP3, ATP1A1, Kir2.2, NCC, and RHOGTP8) are likely to be responsible for osmotic adaptation. All of these genes code for transport proteins that influence the capacity of teleost fish to respond to changes in environmental salinity (McCormick 2001; Pritchard 2003). The sodium pump (ATP1A1; McCormick 2001), sodium/chloride co-transporter (NCC; McCormick 2001), potassium inward rectifier (e.g., Kir2.2; Suzuki et al. 1999), and Rho GTPase (e.g., RHOGTP8; Evans et al. 2005) are all major ion transporters that play key roles in the active secretion of ions across fish gills, as is the case with marine fish. AQP3 is an aquaporin involved in the active uptake of ions (e.g., Cutler et al. 2007), an important role in freshwater fish osmoregulation. The two other repeat outliers (FERH1, SPG1) are known to be involved in acclimation to cold temperature in teleosts (FERH1; Yamashita et al. 1996) and nest-building behavior in threespine sticklebacks (SPG1; Kawahara and Nishida 2006). In a recent investigation of genome-wide diversity and differentiation in marine and freshwater populations of Alaskan sticklebacks, Hohenlohe et al. (2010) found several genomic regions with genes involved in freshwater colonization showing consistent patterns of selection in three independent freshwater populations. Two repeat outliers identified in our study (ATP1A1 and SPG1) fell within two of these genomic regions (linkage group I; 21543–21711 kb and linkage group IV; 19899–21038 kb, respectively; Table S2). It is important to note that, although the freshwater populations used in Hohenlohe et al. (2010) represented independent colonizations, they were nevertheless geographically close to one another and presumably colonized by the same ancestral marine populations. Hence, the selected alleles were likely generated from the standing genetic variation common to Alaskan marine populations.

Assuming similar genetic mechanisms evolved in response to the uniform selective pressure of salinity on osmoregulatory systems, detection of selected loci at genes with putative osmoregulatory functions would be expected to be repeated in all independent pairs of marine and freshwater populations. Although we found a high incidence of repeated outliers within the osmoregulatory genes screened, no gene fully conformed to this expectation. Similarly, Hohenlohe et al. (2010) uncovered several “private signatures” of differentiation in single populations, suggesting the involvement of different physiological systems and genes in adaptation to freshwater environments. Furthermore, the complex processes involved in response to osmotic changes are known to require the functions of multiple transport proteins and signaling pathways (Pritchard 2003). As such, osmotic adaptation can be achieved with alternative genes contributing to the various processes of osmoregulatory function. Our findings of several key genes with osmoregulatory roles lend support to the theory that complex physiological processes are likely to be influenced by several genes, each with a small contribution to the ultimate function (Orr 2005; Wainwright et al. 2005). Thus, the diversity between marine and freshwater environments—and even within different freshwater habitats—may pose selective pressures on multiple genes with a range of biological functions. Aside from the differences in osmoregulatory responses between fish of marine and freshwater habitats, anadromous fish that migrate between environments of varying salinity face additional osmoregulatory challenges that require slower responses such as differentiation of transport epithelia and synthesis of new transport proteins (McCormick 2001). Similar results of different genes producing analogous adaptive responses have been demonstrated both within and between species (Hoekstra and Nachman 2003). 
This is particularly relevant in the context of the freshwater invasions of threespine sticklebacks, which have occurred independently (Bell and Foster 1994): different genes are likely to have been involved in adaptive differentiation in different geographic regions, and in different phylogenetic clades. Even in the event that the same gene is involved, adaptive responses can evolve by the fixation of different alleles (Hohenlohe et al. 2010). These results differ from those of Colosimo et al. (2005) and Miller et al. (2007), who each found a major gene responsible for the parallel evolution of a morphological trait—which both arose from selection on standing genetic variation. Those findings allowed Schluter and Conte (2009) to propose that much of the parallel variation of phenotypic traits in threespine sticklebacks might be generated by this mode of evolution. Although this is a feasible suggestion, supported by their “transporter” hypothesis (Schluter and Conte 2009), it might be worth considering that parallel evolution by selection on standing genetic variation is more likely to occur when the adaptive response is controlled by a single major gene—as is the case with several “simple” phenotypic traits (bony armor: Colosimo et al. 2005; Shapiro et al. 2006; pigmentation: Miller et al. 2007). However, with complex physiological systems such as those involved in freshwater adaptation (Pritchard 2003), it is probable that more genes of smaller effect are involved, which reduces the likelihood of selection acting on the same genes in multiple locations.

The difference between this mode of adaptive evolution and that in which the same allele comes to fixation in independently derived populations represents the categorical discrimination between “hard” and “soft” selective sweeps (Pennings and Hermisson 2006). Although examples of hard sweeps have been found in multiple populations on regional scales (Hohenlohe et al. 2010; Shimada et al. 2011), examining natural populations on a broader spatial scale allows for deeper exploration of the consistency of genomic patterns and can provide valuable information about how different genomic regions in different populations are responding to selection. In fact, the comparison of selective sweeps in linkage group VIII found in Alaskan and Fennoscandian populations revealed a nonparallel sweep: elevated levels of differentiation between freshwater populations suggest different modes of adaptive evolution in different geographic areas (Mäkinen et al. 2008b; Hohenlohe et al. 2010; Table S2).

In general, the results of our study suggest that freshwater diversification of threespine sticklebacks across their global distribution is likely generated by heterogeneous selection pressures among freshwater habitats, acting on multiple key genes each with small effect. As such, the results conform to the view that evolutionary adaptation to similar challenges posed by the environment can be achieved through multiple genetic pathways.


Associate Editor: M. Hellberg

ACKNOWLEDGMENTS

We thank T. Bakker, T. Kitamura, T. Leinonen, A. Levsen, H. Mäkinen, A. Nolte, S. McCairns, C. Macnaughton, A. Hendry, and D. Schluter for help in obtaining samples. Thanks are also due to two anonymous reviewers who provided helpful comments on an earlier version of the manuscript. This study was supported by the Academy of Finland (JM and TS), Finnish Centre of Excellence in Evolutionary Genetics and Physiology, Japan Society for the Promotion of Science (TS and YS), LUOVA graduate school (JD), and BONUS+ program for BaltGene consortium (JM and JD).
