Keywords:

  • Adaptation;
  • fitness;
  • genomics;
  • mechanistic biology;
  • population genomics;
  • positive selection

Abstract

  1. Top of page
  2. Abstract
  3. Functional Inferences About Adaptation at Specific Loci
  4. Population Genetic Inferences About Adaptation at Specific Loci
  5. Case Studies that Integrate Evolutionary and Functional Analyses of Natural Variation
  6. Integrating Evolutionary and Functional Approaches
  7. ACKNOWLEDGMENTS
  8. LITERATURE CITED

Inferences about adaptation at specific loci are often exclusively based on the static analysis of DNA sequence variation. Ideally, population-genetic evidence for positive selection serves as a stepping-off point for experimental studies to elucidate the functional significance of the putatively adaptive variation. We argue that inferences about adaptation at specific loci are best achieved by integrating the indirect, retrospective insights provided by population-genetic analyses with the more direct, mechanistic insights provided by functional experiments. Integrative studies of adaptive genetic variation may sometimes be motivated by experimental insights into molecular function, which then provide the impetus to perform population genetic tests to evaluate whether the functional variation is of adaptive significance. In other cases, studies may be initiated by genome scans of DNA variation to identify candidate loci for recent adaptation. Results of such analyses can then motivate experimental efforts to test whether the identified candidate loci do in fact contribute to functional variation in some fitness-related phenotype. Functional studies can provide corroborative evidence for positive selection at particular loci, and can potentially reveal specific molecular mechanisms of adaptation.

Conclusive inferences about genetic adaptation ultimately require a mechanistic explanation of fitness differences among alternative genotypes. To ascertain the adaptive significance of polymorphism at a specific locus, it is necessary to document functional differences between the products of alternative alleles (e.g., catalytic properties of enzyme variants) and to document that the observed functional differences impinge on whole-organism performance in a way that affects fitness. Feder and Watt (1992) characterized the process of adaptive evolution as a recursion of four successive, pairwise links: (1) genes → design (the effect of allelic variants on a particular aspect of biological design); (2) design → performance (how design affects some ecologically relevant measure of performance such as locomotor capacity or thermal tolerance); (3) performance → fitness (how organismal performance affects individual fitness components or lifetime reproductive success); and finally, (4) fitness → genes (how variation in rates of survival and fecundity among different genotypes affects the genetic composition of the gene pool in the following generation). Noteworthy successes in establishing causal connections between successive links in this adaptive recursion come from studies of single-locus protein polymorphisms that have clearly defined effects on fitness-related measures of physiological performance (Gillespie 1991; Powers et al. 1991, 1993; Watt 1991; Eanes 1999; Watt and Dean 2000; Schulte 2001; Dean and Thornton 2007; Dalziel et al. 2009). The challenge of such studies is to document how functional polymorphism at individual genes translates into differences in quantitative trait values, which in turn translate into fitness differences that exceed the threshold at which the stochastic effects of genetic drift dominate the deterministic effects of natural selection.
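The threshold between drift and selection mentioned above can be illustrated with Kimura's diffusion approximation for the fixation probability of an allele. The following is a minimal sketch, assuming genic (additive) selection and an effective population size equal to the census size; the function name and the parameter values are illustrative assumptions and are not taken from the studies cited here.

```python
import math

def fixation_probability(s, n_e, p0=None):
    """Diffusion approximation for the fixation probability of an allele.

    s   : selection coefficient of the heterozygote (genic selection assumed)
    n_e : effective population size (assumed equal to census size)
    p0  : initial allele frequency; defaults to a single new copy, 1/(2*n_e)
    """
    if p0 is None:
        p0 = 1.0 / (2 * n_e)
    if s == 0:
        return p0  # neutral limit: fixation probability equals initial frequency
    # (1 - exp(-4*Ne*s*p0)) / (1 - exp(-4*Ne*s)), written with expm1 for accuracy.
    # Strongly deleterious alleles in large populations can overflow here.
    return math.expm1(-4 * n_e * s * p0) / math.expm1(-4 * n_e * s)

if __name__ == "__main__":
    n_e = 1000
    neutral = fixation_probability(0.0, n_e)
    for s in (1e-5, 1e-4, 1e-3, 1e-2):
        ratio = fixation_probability(s, n_e) / neutral
        print(f"4*Ne*s = {4 * n_e * s:6.2f}  fixation probability / neutral = {ratio:.2f}")
```

Under these assumptions, an advantageous allele behaves almost like a neutral one when 4Nes is well below 1, and its fixation probability approaches 2s when 4Nes is much larger than 1.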

In many cases, selection differentials that are large enough to account for long-term directional trends in phenotypic evolution may be too small to produce experimentally detectable variation in fitness among alternative genotypes in nature (Gillespie 1991; Lewontin 2002). However, because the cumulative effects of selection on genetically based trait variation may be expected to leave an imprint on levels and patterns of nucleotide variation at the underlying loci, analytical results of population genetic theory provide a basis for making indirect, historical inferences about positive selection and adaptive evolution. In principle, DNA sequence data integrate the effects of fitness variation over long periods of time and this has motivated the development of numerous statistical tests to detect the historical imprint of positive selection on patterns of DNA polymorphism and divergence.

In efforts to assess the adaptive significance of polymorphism at specific loci, there is much to be gained by integrating the direct, mechanistic insights provided by functional experiments with the indirect, retrospective insights provided by population genetic analysis. The two approaches are reciprocally illuminating. However, all too often, detecting the signature of positive selection in patterns of DNA sequence variation is regarded as an investigative endpoint in efforts to identify the genetic basis of adaptation. Instead, population-genetic evidence of positive selection at a specific locus should serve as a stepping-off point for experimental studies to elucidate the functional significance of the putatively adaptive variation.

In the broader enterprise of evolutionary biology, we argue that population-genetic tests of selection are more useful for generating functional hypotheses than for drawing firm conclusions about adaptation. If the results of population-genetic tests suggest that particular genetic changes may be of adaptive significance, it is then possible to design the appropriate experiments to measure the resultant effects on protein function (in the case of amino acid mutations) or protein expression (in the case of regulatory mutations or changes in gene copy number). In addition to the potential for providing corroborative evidence of positive selection, functional experiments are required to provide insights into specific genetic mechanisms of adaptation. These insights into mechanism may shed light on features of adaptive mutations (e.g., dominance, epistatic interactions, and pleiotropic effects) that can exert a strong influence on evolutionary trajectories. Moreover, identifying the specific mutations involved in adaptive changes and ascertaining the biochemical/physiological mechanisms by which they exert their effects on organismal fitness are critically important for testing theories about the genetics of adaptation (Phillips 2005).

Here, we discuss the merits of combining polymorphism-based tests of selection with functional analyses of natural variation to make inferences about adaptation at specific loci. We first discuss the role of functional approaches in efforts to identify genetic mechanisms of adaptation. We then review population genetic approaches for making retrospective inferences about positive selection and adaptation. Specifically, we review recent theoretical studies that have shed light on how the genetic architecture of adaptive traits affects our ability to infer a past history of positive selection at the underlying loci. These results reveal some of the limitations and interpretative challenges associated with polymorphism-based tests of selection, and point out the need for independent lines of evidence to make robust inferences about adaptation at specific loci. Finally, we highlight several recent studies that have successfully integrated population genetic analyses with functional studies to gain mechanistic insights into the process of adaptive evolution. We focus on the analysis of intraspecific polymorphism because comparisons among actually or potentially interbreeding individuals provide the greatest experimental power for assessing genotypic differences in performance and fitness (Bock 1980; Arnold 1983). Furthermore, the study of population-level variation, especially in a geographic context, provides the empirical and theoretical foundation for much of our understanding of microevolution as a process. As stated by Arnold (1981:510): “… geographic variation is ordinarily the smallest amount of evolution that can be detected in nature and … evolutionary theory, in its strongest form, applies only to small evolutionary change.”

Functional Inferences About Adaptation at Specific Loci

At the most proximal level of phenotypic variation, the product of a particular allele can be considered a mechanism of adaptation if its biochemical properties alter a particular physiological or developmental pathway in a manner that leads to increased fitness under a particular set of environmental circumstances (e.g., low temperature, low oxygen availability, etc.). The challenge of documenting mechanistic connections between gene function and fitness can be illustrated by considering a hypothetical example of how functional polymorphism at two or more enzyme-encoding genes contributes to adaptive variation in physiological performance (Koehn et al. 1983). The simplest case is shown in Figure 1A, where alternative alleles at gene A have different catalytic efficiencies (=biochemical phenotypes) that translate into differences in physiological performance, which in turn translate into fitness differences. Functional variation in the enzyme encoded by gene A is unaffected by genetic background; there is no epistasis with other loci such as gene B, which encodes an enzyme that catalyzes a related reaction in the same metabolic pathway. Consequently, there is a simple mapping of variation at gene A to biochemical phenotype, physiological performance, and ultimately, fitness. Although certainly oversimplified, the scheme illustrated in Figure 1A may provide a reasonable approximation to several empirical case studies of protein polymorphism where alternative single-locus genotypes have measurably distinct effects on fitness-related physiological performance under natural conditions (Eanes 1999).
Some of the most celebrated examples include studies of lactate dehydrogenase polymorphism in the killifish, Fundulus heteroclitus, aminopeptidase I polymorphism in the marine bivalve, Mytilus edulis, glutamate-pyruvate transaminase polymorphism in the intertidal copepod, Tigriopus californicus, alcohol dehydrogenase and glucose-6-phosphate dehydrogenase polymorphism in Drosophila melanogaster, and phosphoglucose isomerase polymorphism in Colias butterflies (reviewed by Koehn et al. 1983; Zera et al. 1985; Gillespie 1991; Powers et al. 1991; Watt 1991; Mitton 1997; Eanes 1999; Schulte 2001). In many or most of these cases, the simplicity may be more apparent than real, and additional layers of complexity would likely be revealed by a more thorough dissection of the physiological pathways that underlie the observed trait variation. Nonetheless, case studies of enzyme polymorphism in killifish, bivalves, copepods, fruitflies, butterflies, and a number of other organisms demonstrate the feasibility of establishing mechanistic connections between genotype, phenotype, and specific fitness components under natural conditions.

Figure 1. A hypothetical example that illustrates the challenge of documenting mechanistic connections between allelic variation in gene function and fitness. (A) The simplest case in which alternative alleles at an enzyme-encoding gene A have different catalytic efficiencies (=biochemical phenotypes) that translate into differences in physiological performance (=physiological phenotype), which in turn translate into fitness differences. Functional variation in the enzyme encoded by gene A is unaffected by genetic background; there is no epistasis with other loci such as gene B. Consequently, there is a simple mapping of variation at gene A to biochemical phenotype, physiological performance, and ultimately, fitness. In the example shown in (B), the molecular basis of fitness variation is slightly more complex as the biochemical phenotype is determined by the epistatic interaction between the enzymes encoded by genes A and B. Nonetheless, each two-locus genotype is associated with a distinct phenotype that ultimately determines its fitness ranking. In the example shown in (C), the performance phenotype is continuously distributed due to contributions of many genes of individually small effect. In such cases, it will be far more difficult to relate variation at specific genes to specific fitness rankings. Modified from Koehn et al. (1983).

Because enzymes do not function in isolation but are enmeshed within complex metabolic networks, the scheme illustrated in Figure 1B may represent a slight improvement in biological realism. In this case, the molecular basis of fitness variation is slightly more complex as the biochemical phenotype is determined by the epistatic interaction between enzymes encoded by genes A and B. Nonetheless, each two-locus genotype is associated with a distinct phenotype that ultimately determines its fitness ranking. A good empirical example of this scheme is provided by enzyme polymorphisms in the pentose shunt pathway of D. melanogaster, where fitness differences between alternative G6pd genotypes depend on activity variation at the 6-Pgdh gene (Cavener and Clegg 1981; Eanes 1984). Similarly, fitness differences between kinetically distinct 6-Pgdh allozymes of E. coli are only manifest in a genetic background that does not provide alternative pathways for metabolizing 6-phosphogluconate (Dykhuizen and Hartl 1980; Hartl and Dykhuizen 1981). Despite the added complexity and biological realism illustrated in Figure 1B, this scheme is surely a grossly oversimplified depiction of the genetic architecture of most fitness-related traits. In the case of many behavioral, physiological, and morphological traits, it may be rare to find any single locus that explains more than a small fraction of the observed variance in trait values, and the phenotypic effects of individual loci may often depend on the allelic state at many other loci. When the more distal physiological phenotype is continuously distributed due to contributions of many genes of individually small effect, as illustrated in Figure 1C, it will be far more difficult to relate variation at specific genes to specific fitness rankings.

Although it may often be exceedingly difficult to measure fitness variation among alternative single-locus genotypes under natural conditions, it will typically be much more feasible to document causal connections between genotype and phenotype, at least at the level of proximal biochemical phenotypes. The existence of fitness differences among alternative genotypes obviously requires allelic differences in gene function and/or gene dosage. The important point is that these allelic differences often can be measured even if their net effects on organismal fitness lie below the detection limits of our experimental methods.

Population Genetic Inferences About Adaptation at Specific Loci

Population genetic analyses can provide valuable retrospective insights into the role of selection and other evolutionary processes in shaping observed patterns of DNA sequence variation at specific loci. One popular class of methods for detecting a history of positive selection in protein-coding genes involves a comparison of polymorphism and/or divergence at synonymous and nonsynonymous sites (Yang and Bielawski 2000; Fay et al. 2001, 2002; Bustamante et al. 2005; Nielsen 2005; Nielsen et al. 2005a; Eyre-Walker 2006; Jensen et al. 2007a; Li et al. 2008). These methods are useful for identifying adaptive changes that involve recurrent amino acid substitutions, but they are less useful for identifying adaptive modifications of protein function that involve a small number of substitutions. This class of methods is also generally less applicable to the analysis of noncoding sequences, but in principle it can be extended to any comparison between discrete site classes that are interleaved in the same sequence (Wong and Nielsen 2004; Andolfatto 2005; Hahn 2007).
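The comparison of polymorphism and divergence across synonymous and nonsynonymous sites can be sketched as a 2 × 2 contingency analysis in the style of the McDonald-Kreitman test. The sketch below is only one simple variant of this class of methods, not the procedure of any particular study cited above; the function names and the counts in the example are hypothetical, and zero counts are not handled.

```python
from math import comb

def mk_summary(dn, ds, pn, ps):
    """Neutrality index and alpha from site counts (all counts must be > 0).

    dn, ds : fixed nonsynonymous / synonymous differences between species
    pn, ps : nonsynonymous / synonymous polymorphisms within a species
    """
    ni = (pn / ps) / (dn / ds)   # NI < 1 suggests an excess of amino acid fixations
    alpha = 1.0 - ni             # rough estimate of the adaptive fraction of fixations
    return ni, alpha

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

if __name__ == "__main__":
    # Hypothetical counts, for illustration only
    dn, ds, pn, ps = 20, 30, 5, 40
    ni, alpha = mk_summary(dn, ds, pn, ps)
    p_value = fisher_exact_two_sided(dn, ds, pn, ps)
    print(f"NI = {ni:.3f}, alpha = {alpha:.3f}, P = {p_value:.4f}")
```

Because the test pools all substitutions in a gene, a single adaptive amino acid change rarely shifts these counts enough to reach significance, which is consistent with the limitation noted above.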

Population genetic theory indicates that the effects of a single adaptive substitution can be detected using polymorphism data, provided that the substitution occurred within a time interval defined by a small multiple of Ne generations from the present (where Ne is the effective size of the total population). When a newly arisen mutation is driven to fixation by positive selection, variation at closely linked sites may be dramatically reduced because some fraction of the ancestral haplotype in which the mutation originated will also become fixed. The strength of this “hitchhiking effect” is determined by the ratio of the local recombination rate to the selection coefficient and the time to fixation (Maynard-Smith and Haigh 1974; Kaplan et al. 1989). In addition to the locally reduced level of nucleotide polymorphism, the fixation of an adaptive allele at one site is generally expected to produce a transient increase in the level of linkage disequilibrium (LD) and a skew toward low- and high-frequency derived variants at linked neutral sites (Stephan et al. 1992, 2006; Fay and Wu 2000; Kim and Stephan 2002; Kim and Nielsen 2004; Innan 2006; Jensen et al. 2007b; McVean 2007), although these predicted effects on the site-frequency distribution do not necessarily apply in the case of recurrent fixations (Wiehe and Stephan 1993; Przeworski 2002; Li and Stephan 2006; Andolfatto 2007; Macpherson et al. 2007; Jensen et al. 2008). Other distinct patterns of nucleotide variation and LD are characteristic of ongoing or partial selective sweeps, where the adaptive variant is still segregating at intermediate frequency in the population, or geographically restricted sweeps associated with local adaptation (Sabeti et al. 2002, 2007; Weir et al. 2005; Voight et al. 2006; Wang et al. 2006; Barreiro et al. 2008; Innan and Kim 2008; Coop et al. 2009; Excoffier et al. 2009; Pickrell et al. 2009). 
These characteristic signatures of positive selection can provide suggestive evidence that a given gene has contributed to recent adaptive changes, and can also provide the basis for screening genome-wide patterns of DNA polymorphism to identify candidate loci for adaptation at the level of local populations or the species as a whole. With fine-scale geographic sampling in a well-defined ecological and historical context, it is also possible to detect cases of spatially varying selection that involve subtle shifts in allele frequencies among locally adapted populations (Hancock et al. 2008; Coop et al. 2009; Novembre and Di Rienzo 2009). The scenarios involving local adaptation will typically provide the best opportunities for integrating experimental studies of natural variation because alternative functional variants are still segregating.
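The skew in the site-frequency distribution described above is commonly summarized with statistics such as Tajima's D (Tajima 1989) and Fay and Wu's H (Fay and Wu 2000). As a minimal sketch, assuming that the ancestral state of every segregating site is known and that the data are supplied as derived-allele counts, both statistics can be computed as follows; the function name and input layout are assumptions made for illustration, and H is left unnormalized.

```python
from math import sqrt

def frequency_spectrum_stats(derived_counts, n):
    """Summary statistics from derived-allele counts at segregating sites.

    derived_counts : derived-allele count (1..n-1) at each segregating site
    n              : number of sampled chromosomes (at least 4)
    """
    s = len(derived_counts)
    if n < 4 or s == 0:
        raise ValueError("need n >= 4 and at least one segregating site")
    pairs = n * (n - 1)
    # Mean pairwise differences (pi) and the high-frequency-weighted estimator theta_H
    pi = sum(2 * k * (n - k) for k in derived_counts) / pairs
    theta_h = sum(2 * k * k for k in derived_counts) / pairs
    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    theta_w = s / a1  # Watterson's estimator
    tajima_d = (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return {"pi": pi, "theta_w": theta_w, "tajima_d": tajima_d,
            "fay_wu_h": pi - theta_h}

if __name__ == "__main__":
    # Hypothetical samples of 10 chromosomes with 10 segregating sites each
    rare = frequency_spectrum_stats([1] * 10, 10)    # excess of singletons
    common = frequency_spectrum_stats([5] * 10, 10)  # intermediate frequencies
    print(rare["tajima_d"], common["tajima_d"])
```

An excess of rare variants gives a negative D and an excess of high-frequency derived variants gives a negative H, but, as discussed in the following section, demographic history can produce similar values, so significance is usually judged against simulations or genome-wide empirical distributions.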

INTERPRETATIVE CHALLENGES ASSOCIATED WITH POLYMORPHISM-BASED TESTS OF SELECTION

The original genetic hitchhiking model that provides the basis for standard polymorphism-based tests of selection assumes that variation at linked sites is affected by uniform directional selection on newly arisen, co-dominant mutations that arise in a panmictic population of constant size (Maynard-Smith and Haigh 1974; Kaplan et al. 1989; Stephan et al. 1992; Braverman et al. 1995; Barton 1998). Because certain demographic histories may produce patterns of DNA sequence polymorphism that are indistinguishable from the predicted effects of positive selection (Andolfatto and Przeworski 2000; Ramos-Onsins and Rozas 2002; Wall et al. 2002; Jensen et al. 2005; Williamson et al. 2005; Teshima et al. 2006; Nielsen et al. 2007; Ramirez-Soriano et al. 2008), departures from equilibrium assumptions can have a significant impact on the false-discovery rate of polymorphism-based neutrality tests (i.e., tests are significant even though the observed effects are attributable to nonselective factors). Population structure can also confound inferences about selection in complex ways that are highly dependent on the sampling scheme (Slatkin and Wiehe 1998; Wakeley and Aliacar 2001; Przeworski 2002; Santiago and Caballero 2005).

It is also important to consider the possibility that neutrality tests may only succeed in detecting signatures of positive selection at a nonrandom subset of loci that have actually contributed to adaptation (Teshima et al. 2006; Teshima and Przeworski 2006; Chevin and Hospital 2008). The concern is that the subset of traits that are implicated by such tests may have an unrepresentative genetic architecture. In cases in which adaptation involves selection on standing variation at many different genes, recent theoretical studies have shown that the ability to detect the signature of positive selection at any given locus depends on the dominance coefficient of the causative allele (Teshima and Przeworski 2006), the initial frequency of the causative allele at the onset of selection (Orr and Betancourt 2001; Innan and Kim 2004; Hermisson and Pennings 2005; Przeworski et al. 2005; Pennings and Hermisson 2006; Teshima et al. 2006; Barrett and Schluter 2008), and the background genetic variance of the selected trait (Kelly 2006; Chevin and Hospital 2008). Below we briefly summarize these theoretical results and discuss their implications for using population-genetic tests of selection to infer adaptation at specific loci.

SELECTION ON STANDING VARIATION

Adaptation from standing variation occurs when pre-existing mutations (present at a frequency >1/2N) that were initially neutral or mildly deleterious suddenly become advantageous due to a change in the environment or the genetic background. Because the newly adaptive mutation may have been segregating for some time prior to the onset of the new selection regime, the mutation would have had the opportunity to recombine onto multiple haplotype backgrounds. Haplotypes that harbor copies of the newly adaptive mutation will increase in frequency with the onset of selection, and the nucleotide variants that distinguish these different haplotypes will ultimately attain intermediate frequencies in the population. Thus, the fixation of a pre-existing mutation may often drag along more ancestral variation at closely linked sites than will the fixation of a new mutation. Relative to a standard selective sweep, directional selection on standing variation will typically produce a less-pronounced reduction in levels of polymorphism at linked sites (Innan and Kim 2004; Hermisson and Pennings 2005; Przeworski et al. 2005). Importantly, selection on standing variation does not simply produce a weaker, less distinct version of the signature predicted by the standard sweep model. Results of Przeworski et al. (2005) demonstrate that, for intermediate frequencies that may be characteristic of alleles that were previously neutral or mildly deleterious, positive directional selection produces an increased variance in the frequency distribution of polymorphic sites and an increased variance in measures of LD relative to the expectations of the standard selective sweep model. For this reason, loci that have contributed to adaptation from standing variation may exhibit patterns of variation that are indistinguishable from neutral patterns. 
Consequently, such loci will not necessarily appear as outliers in empirical, genome-wide distributions of summary statistics based on allele frequency distributions or levels of intralocus LD (Teshima et al. 2006).

SELECTION ON POLYGENIC VARIATION

Marker-based mapping approaches such as QTL mapping and association mapping are now providing increasingly detailed information about the genetic basis of ecologically important traits. Once a causative gene for some putatively adaptive trait has been positionally cloned, the analysis of DNA sequence polymorphism in and around the gene can potentially shed light on the evolutionary forces that have shaped the observed patterns of trait variation. In the case of polygenic traits that are involved in adaptation, theoretical results suggest that it may be unrealistic to expect to observe the canonical signature of positive selection at the underlying QTL. This is true in cases involving directional selection at the total population level (Chevin and Hospital 2008) as well as cases involving divergent selection between spatially separated subpopulations that inhabit contrasting environments (Latta 1998; Le Corre and Kremer 2003; Storz 2005; Kelly 2006).

If mutations at many different loci contribute to the response to selection on a given trait, then the mean trait value of the population can be shifted closer to the phenotypic optimum without causing significant allele frequency changes at any of the underlying QTL (Kimura and Crow 1978; Lande 1983). Despite the unavoidable dilution of effect among multiple QTL, it might seem reasonable to hold out hope that adaptive changes in polygenic traits may be reflected by detectable signatures of positive selection at genes of major effect, or “leading QTL.” However, recently developed theory demonstrates that the selection coefficient that determines the trajectory of an adaptive mutation cannot be related in a simple manner to its own QTL effect (the fraction of genetic variance in trait values that it explains). This is because the trajectory of an adaptive mutation is strongly influenced by the mean and variance of the genetic background for the selected trait, and by the intensity and mode of selection on the trait (Chevin and Hospital 2008). In the case of a Gaussian fitness function (e.g., stabilizing selection on a trait that has been displaced from the phenotypic optimum by a change in the environment), the background genetic variance of the selected trait has a strong negative impact on the probability of a complete selective sweep at any given QTL because it allows the population to reach the current phenotypic optimum without fixing new mutations. In the case of both Gaussian and linear fitness functions, background genetic variance of the selected trait also causes the selection coefficient of individual mutations to decrease as a function of time. This has important implications for making inferences about the strength and timing of positive selection because a selective sweep involving an adaptive mutation with a decreasing selection coefficient will generally appear to be older than is actually the case (Chevin and Hospital 2008). 
Thus, in comparison with population genomic methods that are designed to detect complete sweeps, methods that are designed to detect partial or ongoing sweeps (e.g., Voight et al. 2006) may be better suited to detecting the effects of selection at individual loci that contribute to polygenic adaptation.
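The decline of the selection coefficient over time can be illustrated with a deliberately simplified toy model. It is not the model of Chevin and Hospital (2008); it is a sketch that assumes a Gaussian fitness function, normally distributed phenotypes, a constant background genetic variance, and a focal allele whose own effect on the trait mean is ignored. All names and parameter values are hypothetical.

```python
def selection_coefficient_trajectory(a, z0, optimum, omega2, g_var, e_var,
                                     generations):
    """Approximate selection coefficient of an allele of additive effect `a`
    while the polygenic background moves the trait mean toward the optimum.

    omega2 : squared width of the Gaussian fitness function
    g_var  : background additive genetic variance (held constant, an assumption)
    e_var  : environmental variance
    """
    v_s = omega2 + g_var + e_var   # effective width of selection on the mean
    z_bar = z0
    trajectory = []
    for _ in range(generations):
        # Log ratio of mean fitness of carriers (mean z_bar + a) to non-carriers
        s = (a * (optimum - z_bar) - a * a / 2) / v_s
        trajectory.append(s)
        # Response of the background mean to directional selection
        z_bar += g_var * (optimum - z_bar) / v_s
    return trajectory

if __name__ == "__main__":
    common = dict(a=0.1, z0=0.0, optimum=2.0, omega2=10.0, e_var=1.0,
                  generations=50)
    with_background = selection_coefficient_trajectory(g_var=1.0, **common)
    no_background = selection_coefficient_trajectory(g_var=0.0, **common)
    print(with_background[0], with_background[-1], no_background[-1])
```

In this sketch the selection coefficient stays constant when there is no background variance, whereas with background variance it shrinks as the population mean approaches the optimum and can even become slightly negative once the allele would overshoot it.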

The effects of spatially varying selection on polygenic traits also pose challenges for identifying loci involved in local adaptation. Consider a polygenic trait that is characterized by different phenotypic optima in spatially separated environments, such that divergent selection pulls the population trait means toward different adaptive peaks. In comparisons between locally adapted subpopulations, the intuitive expectation is that loci that underlie the divergently selected trait will exhibit levels of differentiation that far exceed the genome-wide average. However, even if the underlying genes have purely additive effects, adaptive differentiation in trait values is caused by the between-population variance in allelic effects at individual genes as well as the covariances in allelic effects among genes. With strong divergent selection and high gene flow, adaptive differentiation in trait values may occur primarily as a result of the covariance in allele frequencies among contributing loci (the between-population component of LD) even in the absence of appreciable shifts in allele frequency at individual loci (Latta 1998; Le Corre and Kremer 2003). Positive covariances in allele frequencies among loci will develop whenever the spatial variance in phenotypic optima exceeds the expected variance in trait values under migration–drift equilibrium (Latta 1998).

The take-home message is that adaptive differentiation of polygenic traits does not necessarily produce elevated FST values at the underlying QTL. For example, a survey of geographic variation in phenological traits of European aspen (Populus tremula) revealed high levels of latitudinal differentiation in traits related to growth cessation and dormancy induction (Hall et al. 2007). Interestingly, this adaptive differentiation in trait values across the latitudinal gradient was not associated with elevated levels of nucleotide differentiation at phenology candidate genes, including phytochrome B2, which is associated with variation in the timing of growth cessation (Ingvarsson et al. 2008; Ma et al. 2010). FST values for nucleotide polymorphisms in these candidate genes fell within the range of values for unlinked neutral markers and would not have been detected as outliers in a genome scan of DNA polymorphism between locally adapted populations (Ma et al. 2010).
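The outlier logic behind such genome scans can be sketched in a few lines. The example below assumes biallelic loci, equally weighted subpopulations, and known population allele frequencies, so it omits the sampling corrections used by standard estimators; the function names and the numbers are hypothetical and are not data from the aspen studies.

```python
def fst_biallelic(freqs):
    """FST for one biallelic locus from subpopulation allele frequencies,
    computed as (HT - HS) / HT with equal weights for all subpopulations."""
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)   # expected heterozygosity, pooled population
    if h_t == 0:
        return 0.0                  # locus is monomorphic overall
    h_s = sum(2 * p * (1 - p) for p in freqs) / k   # mean within-population value
    return (h_t - h_s) / h_t

def empirical_tail_fraction(candidate_fst, neutral_fsts):
    """Fraction of putatively neutral loci with FST at least as large as the
    candidate; small values would flag the candidate as an outlier."""
    return sum(f >= candidate_fst for f in neutral_fsts) / len(neutral_fsts)

if __name__ == "__main__":
    # Hypothetical allele frequencies in two populations
    neutral = [fst_biallelic(p) for p in ([0.30, 0.40], [0.55, 0.50],
                                          [0.20, 0.35], [0.60, 0.45])]
    candidate = fst_biallelic([0.45, 0.55])   # modest shift at a trait locus
    print(candidate, empirical_tail_fraction(candidate, neutral))
```

A locus underlying a strongly differentiated polygenic trait can show only a modest frequency shift like the hypothetical candidate here, in which case it falls inside the neutral distribution and is not flagged.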

In cases where spatially varying selection maintains alternative alleles as a balanced polymorphism, each of the alternative allele classes will accumulate its own set of neutral mutations at linked sites. Thus, at the total population level, the general expectation is that divergently selected loci will be characterized by an excess of intermediate-frequency polymorphisms (Tajima 1989; Nordborg and Innan 2003) and an excess of intralocus LD (Kelly 1997). In the case of polygenic traits, however, genetic redundancy may permit selection to maintain constant multilocus genotypic values (and hence, constant trait values) despite a continual turnover of functionally interchangeable alleles at the underlying genes. This turnover will generate a much younger age distribution of neutral mutations at linked sites relative to the case in which individual alleles are maintained by selection (i.e., when alleles of the same functional type are identical-by-descent). Consequently, in cases where a polygenic trait is subject to spatially varying selection or other forms of diversity-enhancing selection, the underlying genes will not necessarily exhibit the patterns of nucleotide variation that neutrality test statistics are designed to detect (Kelly 2006).

Considering the potential difficulties associated with the detection of QTL that contribute to adaptive changes in polygenic traits, it is probably no accident that the genes exhibiting the most clear-cut signatures of positive selection in human populations affect traits that are characterized by relatively low levels of background genetic variance. These include the LCT gene that is associated with adult lactase persistence (Bersaglieri et al. 2004; Enattah et al. 2007; Tishkoff et al. 2007), as well as a number of genes that confer partial resistance to malaria, including G6pd (Tishkoff et al. 2001; Sabeti et al. 2002; Saunders et al. 2002; Verrelli et al. 2002), the adult β-globin gene, HBB (Currat et al. 2002; Ohashi et al. 2004), and the Duffy blood group gene, DARC (Hamblin and Di Rienzo 2000; Hamblin et al. 2002).

Case Studies that Integrate Evolutionary and Functional Analyses of Natural Variation

Dean and Thornton (2007) reviewed a number of case studies that employed manipulative biochemical experiments of specific gene products to gain insight into the mechanistic basis of adaptive evolution. In some cases, allelic differences in protein function or protein expression may have measurable phenotypic effects at the level of whole-organism performance. This opens up the possibility of measuring fitness-related trait variation among individuals with known genotypes under natural conditions. Although such studies face the challenge of controlling for genetic background to isolate the effects of specific genes, the potential pay-off is that it may be possible to establish direct, mechanistic connections between whole-organism performance and fitness in an ecologically relevant context.

Below we highlight several case studies that have integrated population genetic analyses of DNA sequence polymorphism with functional studies of specific gene products and/or whole-organism phenotypes to gain insight into the mechanistic basis of adaptation. There are a number of relevant case studies that could be reviewed in this context (for additional examples, see Eanes 1999; Watt and Dean 2000; Dean and Thornton 2007; Dalziel et al. 2009). We have chosen to highlight a few select examples in which evidence for selection on naturally occurring variation can be interpreted within a clear ecological context. In each of these case studies, the specific genes under investigation were chosen for study based on their known or predicted effects on fitness-related measures of organismal performance. Each of these studies succeeds in illuminating different pairwise links in the adaptive recursion, and each provides reasonably detailed (but still incomplete) descriptions of potential mechanisms of adaptation.

PGI POLYMORPHISM, FLIGHT PERFORMANCE, AND DISPERSAL ABILITY IN BUTTERFLIES

Metabolic performance can affect organismal fitness through the allocation of metabolic currencies to competing bioenergetic demands such as growth and reproduction (Watt 1986). Measures of metabolic performance that may have fitness consequences include the rate of pathway flux and the efficiency of flux (i.e., productivity of the pathway relative to the energy demands of maintenance metabolism; Clark and Koehn 1992; Watt and Dean 2000). The study of phosphoglucose isomerase (PGI) polymorphism in alfalfa butterflies (Colias eurytheme) provides a classic example of adaptive genetic variation in metabolic performance, and provides a mechanistic description of how differences in metabolic performance translate into fitness differences under natural conditions (Watt 1977, 1983; Watt et al. 1983).

In Colias butterflies, genetic variation at PGI affects the transient-state resupply of ATP to flight muscle via glycolysis (Watt 1977, 1983; Watt and Dean 2000). Natural populations of C. eurytheme segregate multiple electrophoretically distinguishable PGI alleles, and the various diploid genotypes vary several-fold in measures of enzyme function such as Vmax/Km, where Vmax is the maximum velocity of the enzyme-catalyzed reaction under substrate-saturating conditions, and Km is the Michaelis constant for the reaction. The three most common PGI genotypes (3/3, 3/4, and 4/4) exhibit significant differences in enzyme kinetics at temperatures that fall below the thermal optimum for butterfly flight (35–39°C; Watt 1983). At suboptimal temperatures, 3/4 heterozygotes exhibit elevated Vmax/Km relative to the 3/3 and 4/4 homozygotes. Importantly, PGI enzymes of the two alternative homozygotes are not equal due to a trade-off between catalytic activity and thermal stability. As predicted by the observed genotypic differences in biochemical performance, PGI 3/4 heterozygotes are capable of maintaining active flight during a larger fraction of the daily thermal cycle, followed closely by 3/3 and then 4/4 (Watt et al. 1983, 1996). These effects of PGI genotype on flight capacity are expected to have important consequences for both female fecundity (because females lay eggs one at a time and are actively flying between each successive egg-laying) and male mating success (by determining opportunities for mating with flight-active females). Consistent with predictions based on metabolic performance, measures of female fecundity in the field revealed a rank-order of PGI genotypes that was concordant with the rank-order of Vmax/Km values: 3/4 > 3/3 ≫ 4/4 (Watt 1992). Field studies of male mating success exploited the phenomenon of sperm precedence, where the last male to mate with a given female fertilizes all the eggs (Boggs and Watt 1981; Carter and Watt 1988). Similar to the documented rank-order of PGI genotypes with respect to female fecundity, the distribution of PGI genotypes in progeny arrays revealed an overrepresentation of kinetically effective genotypes among contributing sires relative to all males in the general population (Watt et al. 1985, 1986). In summary, there is a striking concordance between the rank-order of biochemical performance among PGI genotypes and field-based measurements of survival and reproductive success in both males and females (Watt 2003).
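
The ratio Vmax/Km used above has a simple interpretation under Michaelis-Menten kinetics: when substrate concentration [S] is well below Km, reaction velocity is approximately (Vmax/Km) × [S]. The Python sketch below illustrates this relationship. The function names are ours, and the parameter values are arbitrary placeholders chosen for illustration; they are not measurements from Colias PGI.

```python
def mm_velocity(substrate, vmax, km):
    """Michaelis-Menten reaction velocity at a given substrate concentration."""
    return vmax * substrate / (km + substrate)


def low_substrate_slope(vmax, km):
    """Vmax/Km: the slope of velocity versus [S] when [S] << Km."""
    return vmax / km


# Hypothetical kinetic parameters in arbitrary units (illustration only)
variants = {
    "variant_a": {"vmax": 100.0, "km": 0.5},
    "variant_b": {"vmax": 100.0, "km": 1.0},
}

substrate = 0.01  # well below both Km values
for name, params in variants.items():
    exact = mm_velocity(substrate, **params)
    approx = low_substrate_slope(**params) * substrate
    print(name, exact, approx)
```

Under these assumed values the two variants share the same Vmax but differ twofold in Vmax/Km, so they differ roughly twofold in velocity at low substrate concentration even though they would be indistinguishable under saturating conditions.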

Surveys of DNA sequence variation at the Pgi gene revealed the specific amino acid substitutions that distinguish the kinetically distinct electromorphs, and homology-based modeling of the PGI protein structure provided insights into the structural mechanism that may be responsible for the overdominance of enzyme kinetics (Wheat et al. 2006). The primary candidate sites for molecular adaptation are a pair of charge-changing amino acid substitutions that are located in an exterior, interhelical loop that is positioned at the interface between subunits of the dimeric enzyme (Fig. 2A). These amino acid changes may alter subunit interactions and/or stereochemistry of the catalytic center (Wheat et al. 2006). Consistent with a causal role for these two substitutions, a sliding window analysis revealed a significant excess of intermediate-frequency silent-site polymorphisms in the specific region of exon 9 that harbors both nonsynonymous changes (Fig. 2B), a pattern suggestive of long-term balancing selection.

Figure 2. Polymorphic amino acid residues in the PGI enzyme of Colias eurytheme, and variation in site-frequency spectra across the coding region of the underlying gene. (A) Homology-based model of a single PGI monomer (green) showing segregating amino acid sites (369 and 375) located in an exterior interhelical loop that is positioned at the intersubunit interface of the dimeric enzyme. This region is part of a peptide chain (yellow) that connects active site residues Glu 361 and His 392. (B) A sliding window analysis of Tajima's D based on synonymous site polymorphism across exons of the Pgi gene in C. eurytheme. Codons 369 and 375 are located in exon 9 (step size = 25 bp and window length = 70 bp, which is half the average exon length). Modified from Wheat et al. (2006).
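
The mechanics of a sliding window analysis of this kind can be sketched as follows in Python. The defaults mirror the window length (70 bp) and step size (25 bp) reported in the figure legend; everything else, including the function names, the toy site positions, and the use of a simple count of variable sites as the per-window statistic, is an illustrative assumption and not the procedure of Wheat et al. (2006). Any per-window statistic, such as Tajima's D, could be passed in place of the count.

```python
def sliding_windows(seq_length, window=70, step=25):
    """Yield (start, end) half-open coordinates of full-length windows."""
    start = 0
    while start + window <= seq_length:
        yield start, start + window
        start += step


def windowed_stat(positions, seq_length, stat, window=70, step=25):
    """Apply `stat` to the variable-site positions that fall in each window."""
    results = []
    for start, end in sliding_windows(seq_length, window, step):
        in_window = [p for p in positions if start <= p < end]
        results.append((start, end, stat(in_window)))
    return results


# Hypothetical positions of variable sites along a 140-bp region
variable_sites = [10, 30, 60, 100]
for start, end, value in windowed_stat(variable_sites, 140, len):
    print(start, end, value)
```

Because successive windows overlap, adjacent values are not independent, which is one reason such profiles are best read as descriptive summaries of where a signal is concentrated.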

In addition to evolutionary and functional evidence for the adaptive maintenance of PGI polymorphism in C. eurytheme and other Colias species (Watt et al. 1996; Watt 2003), recent field-based studies of the Glanville fritillary butterfly, Melitaea cinxia, have also revealed evidence for balancing selection on the Pgi gene. Melitaea cinxia is common throughout Eurasia, and it has been intensively studied in the Åland archipelago in Finland where it persists as a classic metapopulation. Dispersal among habitat patches plays a key role in maintaining the viability of the metapopulation as a whole, as extinction–recolonization dynamics result in a rapid turnover of local demes (Hanski 1999; Hanski and Ovaskainen 2000; Nieminen et al. 2004; Ovaskainen and Hanski 2004).

The study of PGI polymorphism in the M. cinxia metapopulation has provided a number of important insights into the genetic basis of fitness variation under natural conditions. First, PGI variation is associated with flight capacity, female fecundity, and survivorship (Haag et al. 2005; Saastamoinen 2007; Klemme and Hanski 2009; Niitepõld et al. 2009; Orsini et al. 2009; Saastamoinen et al. 2009). Second, PGI genotypes associated with increased flight capacity and increased fecundity are also found at a higher frequency in isolated, newly established demes relative to older demes. Third, variation in PGI allele frequencies among demes is also correlated with variation in deme growth rates, which is a measure of realized fitness (Hanski and Saccheri 2006). Thus, variation in flight performance and fitness variation among PGI genotypes has ramifying effects on metapopulation dynamics, illustrating how studies of evolutionary mechanism can enrich our understanding of ecological process (Saccheri and Hanski 2006).

Consistent with evidence for overdominant selection on enzyme function in contemporary populations of M. cinxia (Haag et al. 2005; Hanski and Saccheri 2006; Niitepõld et al. 2009; Orsini et al. 2009), population genetic analysis of DNA sequence polymorphism at the Pgi gene revealed strong evidence for long-term balancing selection. The inferred age of the most recent common ancestor of the two main Pgi alleles within the Åland population predates the divergence times of five extant Melitaea species (Wheat et al. 2010). In both Colias and Melitaea, overdominant selection on PGI enzyme function appears to stem from the ability of heterozygotes to maintain flight activity over a broader range of ambient air temperatures (Watt et al. 1983, 1985, 1996; Watt 1992; Saastamoinen and Hanski 2008; Niitepõld et al. 2009). The documented genotype × temperature interaction on flight capacity of M. cinxia suggests that the PGI polymorphism of this species may be characterized by the same trade-off between catalytic activity and thermal stability that has been documented for the PGI polymorphism in Colias (Saastamoinen and Hanski 2008; Niitepõld et al. 2009). In addition, comparative analysis of Pgi structures in M. cinxia and C. eurytheme suggests that a similar structural mechanism may underlie overdominance of enzyme kinetics and metabolic performance in both species (Wheat et al. 2010). The survey of nucleotide variation in the Pgi gene of M. cinxia revealed that the two most common allele classes were distinguished by a pair of charge-changing amino acid substitutions in the same interhelical loop domain identified in C. eurytheme butterflies, although the specific sequence changes were different in each species. A detailed investigation of PGI enzyme kinetics of M. cinxia will be necessary to determine whether a similar mechanism underlies the apparent overdominance of metabolic performance in both Colias and Melitaea.

HEMOGLOBIN POLYMORPHISM AND AEROBIC METABOLISM OF HIGH-ALTITUDE DEER MICE

The deer mouse (Peromyscus maniculatus) has the broadest altitudinal distribution of any North American mammal, and therefore represents an ideal study organism for investigating mechanisms of physiological adaptation to different elevational zones. Deer mice are remarkably abundant in alpine environments at elevations up to 4300 m, where the partial pressure of O2 (PO2) is ∼60% of the sea level value. At such altitudes, the reduced PO2 of inspired air results in a reduced O2 saturation of arterial blood (hypoxemia), which in turn leads to a reduced supply of O2 to the cells of aerobically metabolizing tissues. In the absence of other compensatory physiological adjustments, this hypoxia-induced hypoxemia can impose severe constraints on aerobic metabolism and may therefore influence an animal's food and water requirements, the capacity for sustained locomotor activity, and the capacity for internal heat production.

Evidence from a number of vertebrate species indicates that physiological adaptation to high-altitude hypoxia often involves fine-tuned adjustments in blood-O2 affinity (Storz and Moriyama 2008). Under conditions of extreme hypoxia when pulmonary O2 loading is at a premium, an increased blood-O2 affinity helps maximize the level of tissue oxygenation for a given difference in PO2 between the sites of O2 loading in the pulmonary capillaries and the sites of O2 unloading in the tissue capillaries. Studies of deer mice have demonstrated that the divergent fine-tuning of blood-O2 affinity plays an important role in adaptation to different elevational zones (Storz 2007). Surveys of natural variation revealed that blood-O2 affinity is positively correlated with the native altitude of different deer mouse subspecies (Snyder 1981, 1985; Snyder et al. 1982, 1988), and physiological studies of wild-derived strains of deer mice revealed that this variation in blood biochemistry is strongly associated with allelic variation at two tandemly duplicated genes that encode the α-chain subunits of adult hemoglobin (Hb; Chappell and Snyder 1984; Chappell et al. 1988).

Remarkably, phenotypic effects of this two-locus α-globin polymorphism are also manifest at the level of whole-animal physiological performance, as measured by maximal rates of O2 consumption (VO2max) elicited by aerobic exercise or cold exposure. Both measures of aerobic power output exhibited consistent variation among strains of mice with different α-globin genotypes: mice with the high-affinity α-globin genotype exhibited a higher VO2max when tested under hypoxic conditions, whereas mice with the low-affinity genotype exhibited a higher VO2max when tested under normoxic conditions at sea level, and double heterozygotes were characterized by intermediate measures of aerobic performance under both treatments (Chappell and Snyder 1984; Chappell et al. 1988). This genetically based variation in VO2max can be expected to have important fitness consequences in alpine environments because mice that are capable of attaining a higher VO2max under hypoxia can maintain a constant body temperature by means of aerobic thermogenesis at lower ambient temperatures. Physiological studies have revealed that high-altitude deer mice are often operating close to their aerobic performance limits (Hayes 1989a, b), and even during summer months, thermogenic demands associated with low nighttime temperatures are sufficient to outstrip VO2max (Hayes and O’Connor 1999). Because VO2max is impaired under hypoxic conditions (Rosenmann and Morrison 1975; Chappell et al. 2007), small endotherms such as deer mice face a double bind as their thermogenic capacity is compromised under conditions in which thermoregulatory demands are especially severe.

Consistent with these expected effects on fitness, a survivorship study of high-altitude deer mice in the White Mountains of eastern California revealed strong directional selection on thermogenic capacity (Hayes and O’Connor 1999). In one season that was characterized by especially high spring snow-melt, the magnitude of the standardized, directional selection gradient on thermogenic capacity (0.523) represents one of the largest linear selection gradients ever measured in a free-ranging vertebrate species (Kingsolver et al. 2001). The results of Hayes and O’Connor (1999) indicate that, during periods of extreme cold, the average survivor would have been able to stave off hypothermia at air temperatures 1–2°C lower than the average nonsurvivor.
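
For readers unfamiliar with selection gradients, the quantity can be illustrated with a minimal single-trait sketch in Python. Here the standardized directional gradient is estimated as the least-squares slope of relative fitness (survival divided by mean survival) on the trait after scaling it to zero mean and unit variance. The function name and the toy data are hypothetical, and this is not a reconstruction of the analysis performed by Hayes and O'Connor (1999), which may have differed in its details.

```python
from statistics import mean, pstdev


def standardized_selection_gradient(trait, survived):
    """Slope of relative fitness on a standardized trait (single-trait case).

    trait    -- list of trait values, one per individual
    survived -- list of 0/1 survival outcomes in the same order
    """
    trait_mean, trait_sd = mean(trait), pstdev(trait)
    z = [(x - trait_mean) / trait_sd for x in trait]
    mean_fitness = mean(survived)
    w = [s / mean_fitness for s in survived]
    # With mean(z) = 0 and var(z) = 1, the regression slope cov(w, z) / var(z)
    # reduces to the mean of the products w * z.
    return mean([wi * zi for wi, zi in zip(w, z)])


# Hypothetical data: individuals with higher trait values survive
toy_trait = [1.0, 2.0, 3.0, 4.0]
toy_survival = [0, 0, 1, 1]
print(standardized_selection_gradient(toy_trait, toy_survival))
```

A positive value indicates that survivors had higher trait values than nonsurvivors on average; with several correlated traits, a multiple regression would be needed to separate direct from indirect selection.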

This system represents a rare case in which it has been possible to establish a mechanistic connection between allelic variation in protein function and fitness-related variation in whole-animal physiological performance. More recently, physiological studies of highland and lowland deer mice documented that variation in Hb-O2 affinity is also strongly associated with allelic variation at two tandemly duplicated genes that encode the β-chain subunits of the Hb tetramer (Storz et al. 2009, 2010). These causal connections between genotype, phenotype, and fitness have helped to illuminate the role of natural selection in shaping altitudinal variation in allele frequencies. Patterns of nucleotide diversity and LD at the two duplicated α-globin genes (HBA-T1 and HBA-T2) and the two duplicated β-globin genes (HBB-T1 and HBB-T2) are clearly indicative of local adaptation to different elevational zones (Storz et al. 2007, 2009; Storz and Kelly 2008). The divergent fine-tuning of Hb-O2 affinity appears to be attributable to the combined effects of eight amino acid mutations in the α-chains and four amino acid mutations in the β-chains, yielding a total of 12 candidate sites for molecular adaptation (Fig. 3; Storz et al. 2010).

Figure 3. Homology-based structural model of deer mouse hemoglobin (Hb) showing the location of 12 polymorphic amino acid sites that exhibit allele frequency differences between high- and low-altitude populations. Mutations in the α- and β-chain subunits are shown in panels A and B, respectively. These represent candidate sites for the adaptive fine-tuning of Hb-O2 affinity between highland and lowland populations. Based on data reported in Storz et al. (2009, 2010).

Measurements of O2-equilibrium curves of purified Hbs revealed that allelic differences in Hb-O2 affinity were attributable to a suppressed sensitivity to the inhibitory effects of 2,3-diphosphoglycerate (DPG) and Cl− ions, allosteric cofactors that preferentially bind to sites in deoxyHb (Storz et al. 2009, 2010). Because the binding of DPG and Cl− ions helps stabilize the low-affinity deoxyHb quaternary structure, a suppressed sensitivity to both cofactors results in an increased O2 affinity by shifting the allosteric equilibrium in favor of the high-affinity oxyHb conformation. Comparisons between matched pairs of high- and low-altitude mice with Hbs containing the same α-chains but different β-chains revealed that the suppressed DPG sensitivity was associated with the two-locus β-globin haplotype that predominates in the high altitude population (Fig. 4). This important physiological property is therefore attributable to the independent or joint effects of four amino acid mutations that distinguish the alternative β-globin alleles.
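
The relationship between O2 affinity and saturation can be illustrated with the Hill equation, a widely used descriptive approximation of the O2-equilibrium curve in which affinity is summarized by P50, the PO2 at which Hb is half-saturated. The Python sketch below is a simplified illustration only: the function name, the Hill coefficient, and the P50 and PO2 values are hypothetical placeholders and are not the measurements reported by Storz et al. (2009, 2010).

```python
def hb_saturation(po2, p50, hill_n=2.5):
    """Fractional O2 saturation of Hb from the Hill equation.

    po2    -- partial pressure of O2 (same units as p50)
    p50    -- PO2 at which Hb is 50% saturated (lower = higher affinity)
    hill_n -- Hill coefficient describing cooperativity (assumed value)
    """
    return po2 ** hill_n / (p50 ** hill_n + po2 ** hill_n)


# Hypothetical values, for illustration only
low_po2 = 40.0          # reduced PO2 at the site of O2 loading
high_affinity_p50 = 25.0
low_affinity_p50 = 35.0

print(hb_saturation(low_po2, high_affinity_p50))
print(hb_saturation(low_po2, low_affinity_p50))
```

At any fixed PO2, the curve with the lower P50 gives the higher saturation, which is why a suppressed sensitivity to cofactors that raise P50 translates into improved O2 loading under hypoxia.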

Figure 4. O2-equilibrium curves of deer mouse hemoglobins showing allelic differences in Hb-O2 affinity. (A) Curves for high-altitude mice that express the βII Hb isoform (product of the d1β-globin allele) in the presence and absence of allosteric cofactors (2,3-DPG and Cl− ions); (B) Curves for low-altitude mice that express the βI Hb isoform (product of the d0β-globin allele) under the same experimental treatments; (C) Summary of allelic differences in O2 affinity and cofactor sensitivity between the β-chain Hb isoforms of high- and low-altitude mice (P50 is the PO2 at which Hb is 50% saturated). Based on data reported in Storz et al. (2009).

The discovery that allelic differences in anion sensitivity contribute to the adaptive fine-tuning of Hb-O2 affinity illustrates the value of integrating evolutionary analyses of sequence variation with mechanistic appraisals of protein function. The population genetic analysis revealed evidence that the observed patterns of β-globin polymorphism have been shaped by a history of divergent selection between elevational zones, and this result motivated experimental investigations into the functional significance of the allelic variation. Experimental measures of O2-binding properties corroborated the tests of selection by demonstrating a functional difference between the products of alternative alleles.

EDA POLYMORPHISM AND ARMOR PLATING IN FRESHWATER STICKLEBACKS

The parallel loss of armor plating in multiple, independently derived freshwater populations of threespine stickleback fish (Gasterosteus aculeatus) has been the focus of a highly integrative, multiteam research enterprise that has yielded important insights into the genetic basis of morphological evolution (Colosimo et al. 2004, 2005; Cresko et al. 2004; Hohenlohe et al. 2010), as well as the complex chain of causal relationships between genotype, phenotype, and fitness (Barrett et al. 2008). Since the end of the last glacial maximum, threespine sticklebacks have independently colonized multiple postglacial, freshwater lake and stream systems in the Northern Hemisphere. These freshwater populations have consistently evolved a reduction in the number of external bony lateral plates relative to their marine counterparts. Comparative mapping experiments implicated the same major-effect QTL in the parallel loss of armor plating in multiple populations (Colosimo et al. 2004; Cresko et al. 2004) and subsequent fine-mapping and transgenic experiments suggested a causal role for the derived “low” allele of the Ectodysplasin-A (Eda) gene (Colosimo et al. 2005). Population surveys of nucleotide polymorphism in the Eda gene revealed that the low allele is present at low frequency in the ancestral marine population of sticklebacks, phylogenetic analysis suggested that the age of the low allele vastly predates the postglacial colonization of freshwater habitats, and a high-density SNP-based genome scan revealed that the Eda gene region has been subject to positive directional selection in multiple, independently derived freshwater populations (Colosimo et al. 2005; Hohenlohe et al. 2010). These lines of evidence indicate that the parallel evolution of reduced armor plating in different freshwater populations has been driven by repeated selection on standing genetic variation.

Armed with information about the phenotypic effects of alternative Eda alleles, Barrett et al. (2008) conducted a transplantation experiment to obtain direct measurements of fitness variation among alternative Eda genotypes under natural conditions. The experiment was designed to test the hypothesis that reduced armor plating confers a fitness advantage in freshwater sticklebacks by permitting a reallocation of energy toward juvenile growth. An increased rate of juvenile growth appears to enhance a stickleback's prospects for overwinter survival and early reproduction in the following spring. Barrett et al. (2008) phenotyped thousands of marine sticklebacks from coastal British Columbia to identify partially plated individuals that were likely to be heterozygous for the low allele and the wild-type “complete” allele. After using DNA markers to determine the Eda genotypes of wild-caught fish, a total of 182 low/complete heterozygotes were then transplanted to replicate ponds and Eda genotype frequencies were monitored over the course of a full annual cycle (=1 stickleback generation). As predicted, the low Eda allele was associated with higher rates of juvenile growth and overwinter survival, and over the course of the annual cycle the low allele underwent a parallel net increase in frequency across each of the replicate ponds. Surprisingly, however, the increase in net frequency of the low allele was primarily attributable to overdominance of fitness at the Eda gene after the fish had attained the final adult number of lateral plates, roughly midway through the annual cycle (Fig. 5). During the earlier stage of ontogenetic development when plate number was not yet finalized, low/complete Eda heterozygotes actually had lower fitness than either of the alternative homozygotes.
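
The logic of tracking genotype frequencies within a cohort can be made explicit with a small arithmetic sketch in Python. Given genotype frequencies before an episode of viability selection and the relative survival of each genotype, the code below computes the genotype frequencies afterward and the resulting frequency of the low allele. The genotype labels follow Figure 5 (CC, CL, LL), but the function names and the relative survival values are hypothetical and are not estimates from Barrett et al. (2008).

```python
def after_selection(genotype_freqs, relative_survival):
    """Genotype frequencies following one episode of viability selection."""
    mean_survival = sum(
        genotype_freqs[g] * relative_survival[g] for g in genotype_freqs
    )
    return {
        g: genotype_freqs[g] * relative_survival[g] / mean_survival
        for g in genotype_freqs
    }


def low_allele_frequency(genotype_freqs):
    """Frequency of the low (L) allele: all of LL plus half of CL."""
    return genotype_freqs["LL"] + 0.5 * genotype_freqs["CL"]


# Offspring of heterozygous parents at Mendelian proportions
before = {"CC": 0.25, "CL": 0.50, "LL": 0.25}
# Hypothetical relative survival with heterozygote advantage
survival = {"CC": 0.8, "CL": 1.0, "LL": 0.9}

after = after_selection(before, survival)
print(low_allele_frequency(before), low_allele_frequency(after))
```

Reversing the ranking of the survival values for a second episode would move the allele frequency back in the opposite direction, which is the kind of stage-specific reversal that can produce oscillations within a single generation.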

Figure 5. Temporal changes in allele and genotype frequencies at the Eda gene in four replicate freshwater populations of threespine stickleback. (A) Changes in frequency of the “low plated” Eda allele in four replicate ponds (different colored lines). All samples are from the first (F1) cohort of offspring, except the June and July 2007 samples, which are from the second (F2) pond generation. (B) Approximate life-history stages through the course of the experiment. Fish were stained with Alizarin red to highlight bone. (C) Genotype frequencies averaged across all four ponds. Purple, homozygous complete genotype (CC); orange, heterozygote genotype (CL); green, homozygote low genotype (LL). Vertical bars show standard errors on the basis of n = 4 ponds. From Barrett et al. (2008).

This ontogenetic shift in the dominance of fitness was not expected under the “burden of plates” hypothesis, because Eda heterozygotes are characterized by an intermediate level of armor plating relative to the low/low and complete/complete homozygotes. This shift in the dominance of fitness during ontogeny—coupled with the reversal of fitness ranks between the two alternative homozygotes—produced parallel oscillations in allele frequency across all four replicate ponds. These parallel oscillations suggest the possibility of antagonistic pleiotropy, where Eda alleles have opposing fitness effects on different traits at different stages of development. Alternatively, the seasonal changes in frequency of the low allele may reflect a correlated response to selection on other loci that are in LD with the Eda gene. It seems clear that the overall fitness effects of Eda are not solely determined by differences in the level of armor plating. This study demonstrates the power of ecological experiments to reveal causes and mechanisms of fitness variation under natural conditions.

The study also provides a rare example of a case in which it was actually possible to measure fitness variation among alternative genotypes in real time. However, in this particular case the striking patterns of parallel evolution that had been documented in stream systems throughout the Northern Hemisphere left little doubt that the armor-plating phenotype was subject to strong directional selection in freshwater environments. It is not clear whether this same experimental approach can generally be expected to reveal detectable levels of fitness variation in cases where the adaptive significance of a trait is not obvious from the start.

Integrating Evolutionary and Functional Approaches

Integrative studies of adaptive genetic variation may sometimes be motivated by experimental insights into molecular function, which then provide the impetus to perform population genetic tests to evaluate whether the functional variation is of adaptive significance. In other cases, studies may be initiated by genome scans of DNA polymorphism to identify candidate loci for recent adaptation. Results of such analyses can then motivate experimental efforts to test whether the identified candidate loci do in fact contribute to functional variation in some fitness-related phenotype.

WHAT IS THE ADAPTIVE SIGNIFICANCE OF FUNCTIONAL VARIATION?

In some cases, functional properties of alternative alleles will have been previously characterized and the question is whether the observed variation is adaptive. Many studies of enzyme polymorphism have provided mechanistic descriptions of how biochemical differences between alternative alleles affect metabolic performance, but it is not always clear whether this genetically based variation in enzyme activity is adaptive (Koehn et al. 1983; Zera et al. 1985; Eanes 1987, 1999; Lewontin 1991). Tests of selection on the underlying sequence variation can play an important role in this context, as it may be possible to rule out the possibility that the observed functional variation is either neutral or mildly deleterious (maintained at mutation-selection-drift equilibrium). The partitioning of nucleotide variation and LD within and between functionally defined allele classes can provide corroborative evidence for a history of positive selection at a given locus, and can also provide insights into the intensity and mode of selection and the timescale over which selection has been acting. For example, in studies of nucleotide polymorphism at the human G6pd gene, measures of long-range LD within the “reduced activity” allele class (which confers partial resistance to malaria) revealed that the deficiency mutant was recently driven up to high frequency by directional selection in malarial regions of Africa (Tishkoff et al. 2001; Sabeti et al. 2002; Saunders et al. 2002, 2005; Verrelli et al. 2002). Similarly, in studies of nucleotide polymorphism at the human LCT gene, measures of LD within the allele class associated with adult lactase persistence revealed evidence for a very strong and recent selective sweep in African and Eurasian populations with a pastoralist heritage (Enattah et al. 2007; Tishkoff et al. 2007; Fig. 6).
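
The idea of measuring LD within a functionally defined allele class can be illustrated with a haplotype homozygosity statistic. The Python sketch below computes, for chromosomes carrying a given core allele, the probability that two randomly chosen carriers are identical across all sites within a given distance of the core site. This is a simplified, symmetric version of extended haplotype homozygosity; the function name and the toy haplotypes are hypothetical, and the sketch is not the exact procedure used in the studies cited above.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes, core_index, core_allele, distance):
    """Probability that two carriers of `core_allele` are identical within
    `distance` sites on either side of the core site.

    Returns None if fewer than two haplotypes carry the core allele.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        return None
    lo = max(0, core_index - distance)
    hi = min(len(carriers[0]), core_index + distance + 1)
    counts = Counter(h[lo:hi] for h in carriers)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


# Toy haplotypes (hypothetical); the core site is the middle position
sample = ["00100", "00100", "00100",   # carriers of allele "1": identical
          "10011", "01000", "11001"]   # carriers of allele "0": all distinct

for allele in ("1", "0"):
    print(allele, haplotype_homozygosity(sample, 2, allele, 2))
```

In the toy sample, carriers of allele "1" remain identical out to the ends of the region whereas carriers of allele "0" do not, which is the qualitative contrast expected when one allele class has risen in frequency too recently for recombination and mutation to have broken down its ancestral haplotype.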

Figure 6. Graphical representation of haplotype homozygosity in the region surrounding independently derived cis-regulatory mutations that are associated with adult persistence of lactase activity in different human populations. (A) Tracts of homozygous SNP genotypes flanking the causative SNP (G/C-14010) are shown for a sample of human subjects from Kenya and Tanzania. Homozygosity tracts in red are associated with the derived SNP allele (C-14010) that contributes to the lactase persistence phenotype, and homozygosity tracts in blue are associated with the wild-type SNP allele (G-14010). (B) Tracts of homozygous SNP genotypes flanking the causative SNP (C/T-13910) are shown for a sample of human subjects from Europe and Asia. Homozygosity tracts in green are associated with the derived SNP allele (T-13910) that contributes to the lactase persistence phenotype, and homozygosity tracts in orange are associated with the wild-type SNP allele (C-13910). In both the African and Eurasian populations, the extended lengths of homozygosity tracts associated with the derived SNP alleles appear to reflect the fact that these variants were recently driven up to high frequency by positive directional selection, so recombination has not yet whittled down the size of the ancestral haplotypes in which the adaptive mutations originated. Nucleotide positions along the x-axis are relative to the start codon of the lactase gene. From Tishkoff et al. (2007).

In a study of cryptic coloration in deer mice from the Nebraska Sand Hills (Linnen et al. 2009), a combination of laboratory crosses, association studies, and gene expression experiments revealed that light-colored dorsal pelage (an adaptive, background-matching phenotype) is associated with increased expression of the Agouti gene in hair melanocytes during the first week of postnatal development. The increase in Agouti expression—and the resultant ‘wideband’ phenotype—appears to be caused by a single amino acid deletion and/or a closely linked cis-acting mutation (or mutations). Linnen et al. (2009) surveyed nucleotide polymorphism in and around the Agouti gene of Sand Hills mice, and results of a composite likelihood ratio test suggested a past history of positive directional selection. Relative levels of polymorphism in the wideband and wild-type haplotypes and estimates of allele age suggested that the adaptive allele originated de novo and was subsequently driven to high frequency sometime after the formation of the Sand Hills (Fig. 7). If the adaptive allele originated after the mice colonized the novel dune field habitat, then the implication is that adaptation to changing environmental conditions is not always exclusively dependent on standing genetic variation in the ancestral source population.

Figure 7. An adaptive Agouti allele that contributes to cryptic coloration of deer mice from the Nebraska Sand Hills appears to have arisen after the formation of the dune field habitat (a minimum of 8000 years ago). (A) The Agouti allele class that carries the causative mutation(s) (the “wideband” haplotype) harbors far less polymorphism than the wild-type allele class. Rows are observed haplotypes for exon 2 and flanking regions; columns represent polymorphic nucleotide positions (black = ancestral, white = derived). Arrows indicate the position of a derived amino acid deletion that is associated with the wideband phenotype. Numbers of segregating sites and nucleotide diversities (S and π, respectively) are given for both allele classes. (B) Posterior probability distribution for the estimated age of the adaptive wideband allele in Sand Hills deer mice (assuming Ne = 10,000 and 2 generations per year). From Linnen et al. (2009).
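The two summary statistics reported in Figure 7A, the number of segregating sites (S) and nucleotide diversity (π), can be computed from a haplotype alignment as in the following sketch. The haplotypes and the surveyed sequence length (`n_sites`) are invented for illustration and do not reproduce the data of Linnen et al. (2009); π is taken here as the mean number of pairwise differences per site.

```python
from itertools import combinations
from typing import Sequence


def segregating_sites(haplotypes: Sequence[str]) -> int:
    """Count alignment columns that show more than one state."""
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)


def nucleotide_diversity(haplotypes: Sequence[str], n_sites: int) -> float:
    """Mean pairwise differences between haplotypes, per surveyed site."""
    pairs = list(combinations(haplotypes, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs
    )
    return total_differences / len(pairs) / n_sites


# Hypothetical haplotypes: only polymorphic columns are listed
# ("0" = ancestral, "1" = derived); n_sites is the assumed total
# number of sites surveyed, monomorphic ones included.
wideband = ["0000", "0000", "0100"]
wild_type = ["0101", "1001", "0011", "1100"]
n_sites = 1000

for label, sample in (("wideband", wideband), ("wild-type", wild_type)):
    s = segregating_sites(sample)
    pi = nucleotide_diversity(sample, n_sites)
    print(f"{label}: S = {s}, pi = {pi:.5f}")
```

A recently swept allele class is expected to show lower S and π than the alternative class, which is the contrast the figure draws.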

Investigations into the effects of a putative cation exchanger protein, SLC24A5, on pigmentation variation in zebra fish and humans provide an excellent example of how population genetic analyses can provide insights into the adaptive significance of functional variation (Lamason et al. 2005). In zebra fish, the golden phenotype is characterized by delayed and reduced development of melanin pigmentation, and similarities between the melanosomes of zebra fish golden mutants and light-skinned humans suggested the possibility that the human ortholog of the golden gene could play a role in controlling skin pigmentation. Lamason et al. (2005) used a combination of positional cloning, morpholino knockdown, DNA and RNA rescue, and expression analysis to identify SLC24A5 as the golden gene in zebra fish, and conservation of gene function was demonstrated by the ability of human SLC24A5 mRNA to rescue melanin pigmentation when injected into golden zebra fish embryos. To assess the role of SLC24A5 in the evolution of skin pigmentation differences among human ethnic groups, Lamason et al. (2005) examined patterns of nucleotide polymorphism in the SLC24A5 gene region in the International Haplotype Map (HapMap) database. The analysis of sequence data revealed a nonsynonymous single nucleotide polymorphism (SNP), rs1426654, that exhibited a pronounced allele frequency difference among human population groups. The frequency of the derived amino acid variant ranged from 0.93 to 1.00 among different European–American population samples, and it ranged from 0.00 to 0.07 in African, indigenous American, and East Asian population samples. In fact, the allele frequency difference between European and African population samples fell in the upper 0.01 percentile of SNP markers in the HapMap database, suggesting a history of population-specific selection. 
The high level of differentiation appears to be attributable to a selective sweep in the ancestral population of contemporary Europeans, as the SLC24A5 gene and the flanking chromosomal regions exhibited a striking haplotype structure and a sharply reduced level of nucleotide diversity (Lamason et al. 2005; Sabeti et al. 2007; Norton et al. 2007; Williamson et al. 2007; Coop et al. 2009; Pickrell et al. 2009; Grossman et al. 2010). Moreover, the SLC24A5 gene region has been repeatedly identified as a highly significant outlier locus in multiple genome scans for positively selected loci (Akey 2009). Corroborating the role of the SLC24A5 gene in the adaptive differentiation of skin pigmentation between European and African populations, Lamason et al. (2005) also documented that the rs1426654 SNP was strongly associated with levels of skin pigmentation in samples from two recently admixed groups, African–Americans and African–Caribbeans (Fig. 8). Results of this association study indicated that the SLC24A5 polymorphism explains 25–38% of the difference in skin melanin index between Europeans and Africans. In this study, the functional analysis was sufficient to reveal the mechanistic basis of pigmentation differences, but the population genetic analysis of sequence polymorphism provided additional insights into the role of this gene in the adaptive evolution of human skin color.
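The statement that the rs1426654 frequency difference falls in the extreme upper tail of HapMap SNPs is an empirical outlier comparison, which can be sketched as follows. The function name and the toy genome-wide values are assumptions made for illustration; the actual HapMap data are not reproduced here.

```python
from typing import Iterable


def empirical_tail_fraction(focal_delta: float, genome_deltas: Iterable[float]) -> float:
    """Fraction of genome-wide SNPs whose absolute between-population
    allele frequency difference is at least as large as the focal SNP's."""
    deltas = list(genome_deltas)
    focal = abs(focal_delta)
    return sum(abs(d) >= focal for d in deltas) / len(deltas)


# Toy background distribution: 100 SNPs with differences 0.00, 0.01, ..., 0.99.
background = [i / 100 for i in range(100)]
print(empirical_tail_fraction(0.95, background))  # 0.05
```

A small tail fraction marks the locus as unusual relative to the rest of the genome. It is not a formal P-value, because the genome-wide distribution is itself shaped by demography and may include other selected loci.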

Figure 8. Effect of SLC24A5 genotype on skin pigmentation in two admixed human populations, African–Americans and African–Caribbeans. (A) Variation in skin pigmentation index with estimated African ancestry and SLC24A5 genotype. Each point represents a single individual. Lines show regressions constrained to have equal slopes, for each of the three SLC24A5 genotypes (GG, AG, and AA). (B) Histograms showing the distribution of pigmentation scores associated with each genotype after adjusting for the estimated percentage of African ancestry. Values shown are the differences between the measured melanin index and the calculated regression line for the GG genotype. From Lamason et al. (2005).

WHAT IS THE FUNCTIONAL SIGNIFICANCE OF (PUTATIVELY) ADAPTIVE VARIATION?

Recent years have witnessed a growing interest in conducting genome scans of DNA polymorphism to identify genes or gene regions that have contributed to adaptive evolution (Nielsen et al. 2007; Thornton et al. 2007). Unlike marker-based mapping approaches that use phenotypic variation as a starting point, genome scans for selected loci are not constrained by preconceived ideas about the specific traits that may be involved in adaptation to changing environmental conditions. When coupled with functional studies of the candidate genes for adaptation that are ultimately identified, this genome scan approach holds the promise of identifying fitness-related traits whose function and adaptive significance were previously unanticipated. In cases in which it is possible to obtain relatively precise estimates of the chromosomal location of selected loci, it should be possible to conduct follow-up studies to fine-map functionally significant variation that may have contributed to adaptive phenotypic change. For example, Turner et al. (2008) used an Arabidopsis thaliana DNA microarray to measure genome-wide patterns of differentiation between populations of A. lyrata that are adapted to different soil types. One of the candidate genes for local adaptation to serpentine soil, calcium-exchanger 7, may confer increased tolerance to low calcium:magnesium ratios that are characteristic of this soil type. This hypothesis about the genetic basis of local adaptation could be tested by performing reciprocal transplant experiments with lines that are isogenic for alternative alleles at the calcium-exchanger 7 gene.

This type of genome-scan approach will become commonplace with the advent of next-generation sequencing technologies (Pool et al. 2010), as it is becoming increasingly feasible to generate dense genomic polymorphism data in nonmodel organisms (Miller et al. 2007; Baird et al. 2008; Vera et al. 2008; Rokas and Abbot 2009; Wheat 2010; Hohenlohe et al. 2010). However, it is important to point out that, despite this newfound ability to generate dense genomic polymorphism data, studies of nonmodel organisms will not typically be able to harness the full power of population genomic approaches for identifying chromosomal regions that have contributed to a past response to selection. For example, composite likelihood methods for localizing the chromosomal position of positively selected sites (e.g., Kim and Stephan 2002; Nielsen et al. 2005b; Zhu and Bustamante 2005; Williamson et al. 2007) require information about fine-scale variation in recombination rates and other basic genetic parameters that can only be obtained by integrating genetic and physical maps.

PROSPECTS FOR COMBINING POPULATION GENOMICS AND MECHANISTIC BIOLOGY

When properly designed and executed, we know that bottom-up functional approaches and top-down population genomic approaches can both succeed in identifying selected loci that have contributed to a past history of adaptive phenotypic change (Dean and Thornton 2007; Jensen et al. 2007a; Akey 2009; Dalziel et al. 2009; Grossman et al. 2010). We also know that both approaches have different limitations and are associated with different forms of ascertainment bias. One limitation of the purely functional approach is that a focus on well-characterized candidate genes or candidate pathways may rarely lead to the discovery of novel adaptive mechanisms. This suggests an obvious benefit of combining “hypothesis-driven” functional studies of candidate genes with “discovery-driven” population genomic approaches that are unconstrained by a priori expectations regarding the identity of potentially adaptive phenotypes. One important limitation of the population genomics approach is that, for reasons discussed above, genome scans for signatures of selection may identify a biased subset of all loci that have actually contributed to past adaptation, and may therefore implicate selected traits that have an unrepresentative genetic architecture. Specifically, genome scans for positively selected loci may implicate a disproportionate number of traits with a simple genetic architecture (conforming to the scenarios diagrammed in Figs. 1A and B) at the expense of traits with a more complex genetic architecture (conforming to the scenario diagrammed in Fig. 1C). Loci that contribute to polygenic adaptation may be systematically underrepresented in genomic maps of positive selection, and this could also lead to biased inferences about the types of loci that are most likely to contribute to adaptive change.

In cases where the ecological context suggests specific hypotheses regarding the identity of adaptive phenotypes, genome scans for positive selection could be combined with QTL-mapping or association mapping approaches to assess connections between genotype and phenotype, and (indirectly) between genotype and fitness. In this way, the genetic architecture of the trait in question could be characterized in conjunction with population genetic tests of positive selection at causative loci. A nice example of how marker-based mapping methods can be integrated with genome scans for selection is provided by studies of phenotypic differentiation between benthic and limnetic ecomorphs of lake whitefish (Coregonus clupeaformis; Rogers and Bernatchez 2005, 2007). Ecological character displacement has promoted the parallel divergence of dwarf (limnetic) and normal (benthic) whitefish ecomorphs in several different postglacial lakes in Maine, USA, and Quebec, Canada. Rogers and Bernatchez (2007) conducted a linkage mapping analysis involving two separate hybrid dwarf × normal backcross families to identify QTL for several behavioral, physiological, morphological, and life-history traits that differed between the two ecomorphs. Using a panel of DNA markers with known map positions, the authors surveyed genome-wide patterns of differentiation between sympatric pairs of dwarf/normal ecomorphs from four separate lakes and identified outlier loci characterized by FST values that exceeded simulation-based neutral expectations. A disproportionate number of outlier loci co-localized with QTL for traits related to swimming behavior, growth, and gill-raker number, implicating a role for divergent selection in maintaining genetically based trait differences between the two forms. 
More recently, a genome scan of nucleotide variation between marine and freshwater populations of threespine stickleback revealed several candidate regions for divergent selection that co-localized with QTL for morphological traits that are thought to contribute to freshwater adaptation (Hohenlohe et al. 2010).
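The logic of combining an FST outlier scan with QTL positions can be sketched as below. This is a simplified illustration under stated assumptions: FST is computed for biallelic markers from two population allele frequencies as (HT − HS)/HT, the neutral threshold is supplied by the user (in practice it would come from simulations), and the marker names, map positions, and QTL intervals are hypothetical. It does not reproduce the analyses of Rogers and Bernatchez (2007) or Hohenlohe et al. (2010).

```python
from typing import Dict, List, Sequence, Tuple

Locus = Tuple[str, str, float, float, float]  # name, linkage group, cM, p1, p2


def fst_two_pops(p1: float, p2: float) -> float:
    """FST for one biallelic locus from allele frequencies in two
    equally weighted populations: (HT - HS) / HT."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # locus is monomorphic overall
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def classify_loci(
    loci: Sequence[Locus],
    threshold: float,
    qtl_intervals: Dict[str, List[Tuple[float, float]]],
) -> Tuple[List[str], List[str]]:
    """Return (outlier names, outlier names that fall inside a QTL interval)."""
    outliers, colocalized = [], []
    for name, group, position, p1, p2 in loci:
        if fst_two_pops(p1, p2) > threshold:
            outliers.append(name)
            intervals = qtl_intervals.get(group, [])
            if any(low <= position <= high for low, high in intervals):
                colocalized.append(name)
    return outliers, colocalized


# Hypothetical markers and one hypothetical QTL interval on LG1.
markers = [
    ("m1", "LG1", 10.0, 0.90, 0.10),
    ("m2", "LG1", 40.0, 0.50, 0.45),
    ("m3", "LG2", 5.0, 0.80, 0.15),
]
qtl = {"LG1": [(0.0, 20.0)]}
print(classify_loci(markers, threshold=0.3, qtl_intervals=qtl))
# (['m1', 'm3'], ['m1'])
```

Whether outliers co-localize with QTL more often than expected by chance would still need a separate test (for example, a permutation of marker positions), which is not shown.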

Independent evidence for phenotype–environment associations may often provide a necessary starting point for functional studies of candidate loci that are identified in genome scans. With a measurable phenotype in view, it is then possible to design experiments to identify genetic mechanisms of adaptation (i.e., the genes → design link in the adaptive recursion). If functional tests reveal no discernible differences between alternative alleles at the candidate locus (e.g., Runck et al. 2010), then there are two possible explanations: either the observed polymorphism is of no adaptive significance or the experimental conditions were not appropriate for exposing the salient differences. In the latter case, assay conditions may not adequately replicate biologically relevant features of the organism's natural environment, or it may be that the acuity of natural selection simply exceeds the resolving power of our experimental methods. If functional tests do reveal biochemical differences between the products of alternative alleles, it is then possible to work up through higher levels of biological organization to establish mechanistic connections between design, performance, and fitness. This requires a mechanistic understanding of gene function in physiological context. As stated by Feder and Watt (1992:382): “Notable successes in identifying strong and straightforward causal relationships among design, performance, and fitness … examine traits that are biochemical variants of single gene products, whole-organism performances whose rate or duration are amenable to measurement in the field (e.g., flight in insects, growth in molluscs), and fitness components that are simple, unambiguous functions of performance …” In principle, each of the intermediate connections between genotype and fitness can be measured experimentally.

The only causal link in the adaptive recursion that is not generally open to direct, experimental scrutiny is the “fitness → genes” connection, whereby differences in the net reproductive rates of alternative genotypes (based on lifetime integrals of survivorship and fecundity) alter the composition of the gene pool in the following generation. Population genetic tests of selection can provide valuable insights into this connection by integrating the cumulative effects of fitness variation over thousands of past generations. However, indirect, retrospective inferences about adaptation ultimately need to be buttressed by functional data. For the purpose of identifying the mechanistic basis of fitness variation in nature, there is no substitute for low-throughput experimental biology.
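The fitness-to-genes link can be made concrete with the standard textbook recursion for one locus with two alleles under viability selection in a large, randomly mating diploid population. This is a generic deterministic model offered as an illustration, not an analysis from any study cited here; the fitness values and starting frequency are arbitrary assumptions, and genetic drift is ignored.

```python
from typing import List


def next_allele_frequency(p: float, w_AA: float, w_Aa: float, w_aa: float) -> float:
    """Frequency of allele A after one generation of selection,
    assuming Hardy-Weinberg genotype proportions before selection."""
    q = 1 - p
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return p * (p * w_AA + q * w_Aa) / mean_fitness


def trajectory(p0: float, w_AA: float, w_Aa: float, w_aa: float, generations: int) -> List[float]:
    """Allele frequency in each generation, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_allele_frequency(freqs[-1], w_AA, w_Aa, w_aa))
    return freqs


# Arbitrary example: a rare allele with a 1% additive fitness advantage.
freqs = trajectory(p0=0.01, w_AA=1.02, w_Aa=1.01, w_aa=1.00, generations=2000)
for generation in (0, 500, 1000, 2000):
    print(generation, round(freqs[generation], 4))
```

Even a small fitness difference changes the composition of the gene pool substantially when compounded over many generations, which is the cumulative signal that population genetic tests of selection attempt to detect.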

Associate Editor: M. Rausher

ACKNOWLEDGMENTS

We thank R. Barrett, R. K. Butlin, L.-M. Chevin, Z. A. Cheviron, G. Coop, W. A. Cresko, D. A. Hahn, J. D. Jensen, J. K. Kelly, R. G. Latta, P. Pennings, M. D. Rausher, W. B. Watt, A. J. Zera, and two anonymous reviewers for helpful comments. JFS acknowledges grant support from the National Science Foundation and the National Institutes of Health, and CWW acknowledges support from the National Science Foundation and D. Heckel at the Max Planck Institute for Chemical Ecology.

LITERATURE CITED
