Many research programs have made tremendous progress in characterizing genes of large effect associated with ecologically important traits [e.g. flowering time in Arabidopsis thaliana (reviewed by Ehrenreich & Purugganan 2006; Roux et al. 2006; Shindo et al. 2007); body armour and colouration in threespine stickleback (e.g. Shapiro et al. 2004; Colosimo et al. 2005; Miller et al. 2007); for additional examples, see supporting information Table S1]. In addition, several groups have already identified, or are on the cusp of identifying, candidate genes for ecologically important traits in a variety of plant and animal species [e.g. whitefish (Rogers & Bernatchez 2007; Whiteley et al. 2008; Jeukens et al. 2009; Nolte et al. 2009); sunflowers (e.g. Kane & Rieseberg 2007; Sapir et al. 2007; Lai et al. 2008); Boechera stricta, a close relative of Arabidopsis (e.g. Schranz et al. 2009); wild tomatoes (e.g. Moyle 2008); marine snails (e.g. Wood et al. 2008; Galindo et al. 2009); additional examples reviewed in Karrenberg & Widmer (2008)]. Given this, we expect that many more candidate genes associated with ecologically important traits will be characterized in the near future, although this will remain difficult for genes of small effect. Below, we discuss in more detail three classic pre-genomics research programs that have integrated across levels of biological organization to study the effects of genetic variation at a selected locus (Fig. 2). Each of these research programs used knowledge of mechanism to provide clear, testable hypotheses about the links between genotype, phenotype and fitness.
Lactate dehydrogenase and thermal adaptation of metabolism in killifish
The common killifish (Fundulus heteroclitus) is a small teleost fish that lives in marshes and estuaries along the Atlantic coast of North America. There is a steep latitudinal thermal cline over this species’ range such that northern populations experience temperatures that are, on average, 13 °C colder than those experienced by southern populations at the same time of year (reviewed by Powers & Schulte 1998). Dennis Powers and colleagues initiated the search for genes that are differentially selected between northern and southern populations of this species over 30 years ago, using allozyme screening, the ‘genome scans’ of the pre-genomic era (Place & Powers 1978; Powers & Place 1978). Analyses of allozyme frequencies detected clines at a number of loci, including LDH-B, an enzyme that catalyzes the interconversion of pyruvate (a fuel for aerobic respiration) and lactate (the end product of anaerobic glycolysis). When Place & Powers (1979, 1984a, b) examined the kinetics of purified LDH-B enzymes in vitro, they found that the northern LDH-B enzyme (LDH-BNN) had a higher catalytic efficiency at low temperatures, as would be predicted if local adaptation to low temperatures had occurred in northern populations, but did not find evidence for local adaptation of the southern LDH-B (LDH-BSS) genotype to warmer temperatures. Sequence analyses of LDH-B alleles suggested that a particular amino acid variant at site 311 (Ala → Asp) was responsible for functional differences among LDH-B alleles (reviewed by Powers & Schulte 1998).
Powers et al. (1979) also discovered that ATP concentrations ([ATP]) in red blood cells are correlated with LDH-B genotype. Since ATP decreases haemoglobin-oxygen binding affinity, DiMichele & Powers (1982b) predicted that LDH-BNN fish, which have higher [ATP], would have a lower haemoglobin-oxygen binding affinity, allowing for more efficient unloading of oxygen at the working muscles and an improvement in endurance swimming performance (a trait that is highly dependent upon oxygen availability, transport and use). As expected, LDH-BNN fish had higher [ATP], lower haemoglobin-oxygen affinity and superior endurance swimming performance at 10 °C, a temperature rarely experienced by southern fish. However, there were no differences in haemoglobin-oxygen affinity, [ATP] or swimming performance at 25 °C, which is consistent with in vitro studies of LDH-B kinetics that find no differences among genotypes at warm temperatures (DiMichele & Powers 1982b).
Powers and colleagues next examined a suite of other performance traits that are influenced by environmental temperatures, including embryonic metabolic rate, growth rate and hatching time. Since killifish lay eggs at the peak of the highest spring tide, their eggs must develop in air and hatch at the next high tide 2 weeks later to survive (reviewed by DiMichele et al. 1986). Thus, these traits were expected to be under strong divergent selection between northern and southern Fundulus populations that are developing at different temperatures in the wild. They found that LDH-BNN embryos had lower metabolic rates, slower development, later hatching times, decreased lactate metabolism and decreased glucose production (reviewed by DiMichele & Powers 1982a, 1984a,b; DiMichele et al. 1986; Paynter et al. 1991). Note that these differences in metabolism, hatching and growth are opposite to what would be expected for local adaptation to a colder environment (where faster development would be expected to evolve to counter a slowing of metabolism due to colder temperatures). These correlations between genotype and cellular and organismal phenotype were then directly tested by exchanging the native LDH-B enzymes of an egg (e.g. LDH-BNN) with the alternate LDH-B enzyme (e.g. LDH-BSS). DiMichele et al. (1991) found that the injected LDH-B enzyme determined the metabolic rate and glucose use of the egg, showing that it was LDH-B, and not a linked locus, that caused the observed differences in cellular metabolism and embryonic development.
All of the in vivo experiments described above were performed on Fundulus from the hybrid zone between northern and southern genotypes, near the middle of the LDH-B cline [in Delaware, which is also near the centre of other allozyme clines (Powers & Place 1978)]. Therefore, LDH-BSS and LDH-BNN genotypes were tested in individuals with a mixture of northern and southern alleles at other loci, and observed phenotypic variation between these two genotypes could be attributed to their LDH-B genotype (or to closely linked loci) rather than to correlated variation at other (unknown) loci. In addition, collecting fish from a single locality controlled for prior thermal history. However, more recent work on Fundulus from extreme northern and southern populations found either no differences in performance associated with LDH-B genotype [swimming performance (Fangue et al. 2008)], or differences in the opposite direction to the effects of LDH-B genotype alone [development rate to hatching (DiMichele & Westerman 1997), growth rate following hatching (Schultz et al. 1996) and adult metabolic rate (Podrabsky et al. 2000; Fangue et al. 2009)]. These observations are consistent with experiments on fish from the centre of the LDH-B cline in Delaware that examined multilocus genotypes at several allozymes, as opposed to LDH-B in isolation. DiMichele & Powers (1991) found that fish bearing the most common northern multilocus genotype had faster development, despite the fact that at the single locus level fish bearing the LDH-BNN genotype developed more slowly (DiMichele & Powers 1991). Thus, for at least growth rate post-hatch, examining LDH-B alone does not give a true picture of the differences among populations.
The search for the other loci influencing metabolism, hatching and growth is now underway (e.g. Whitehead & Crawford 2006). Interestingly, there are also differences in the amounts of LDH-B enzyme among northern and southern genotypes that are mediated by differences in transcriptional regulation (Crawford & Powers 1992). A combination of comparative sequence analyses, in vitro experiments and in vivo tests of promoter action found that differences in transcription are largely because of sequence variation in a cis-regulatory region upstream of the Ldh-B gene (Schulte et al. 1997, 2000) and SP1 sites in the proximal promoter (Segal et al. 1999). Analyses of molecular signatures of selection (Schulte et al. 1997) and phylogenetic comparative studies (Pierce & Crawford 1997) suggest that natural selection shaped these transcriptional differences (reviewed by Schulte 2001). In addition, Whitehead & Crawford (2006) have used comparative phylogenetic methods to identify 13 other metabolic genes that show evidence of selection for changes in expression in response to habitat temperature (or environmental factors correlated with temperature).
Phosphoglucose isomerase, flight and thermal adaptation in Colias butterflies
Ward Watt and colleagues began their research on phosphoglucose isomerase (PGI) polymorphisms in Colias butterflies by selecting this gene as a candidate underlying local adaptation to environmental temperature (Watt 1968; Sherman & Watt 1973). Colias butterflies eat nectar, a mixture of simple sugars, to fuel their flight. Based on this observation, Watt (1977) hypothesized that selection for optimal flight performance could act to fine tune glycolysis, a pathway involved in sugar metabolism, to environmental temperature. More specifically, he hypothesized that PGI would be the target of selection in response to environmental temperature as this homodimeric enzyme catalyzes the reversible conversion of fructose-6-phosphate to glucose-6-phosphate, and sits at a key branch point in glycolysis that links substrates into other pathways such as gluconeogenesis.
Watt (1977) surveyed populations of four species of Colias butterflies (Colias meadii, Colias alexandra, Colias philodice eriphyle and Colias eurytheme) for allozyme variation at PGI, and found a number of allelic variants or electromorphs (EM). Interestingly, there was an excess of heterozygotes in older butterflies when compared with younger butterflies, suggesting differential survival of genotypes (Watt 1977). Identical-by-descent laboratory-raised populations for each of the four most common allelic variants for C. eurytheme were produced, so that genotypes could be tested for differences in in vitro biochemical functioning. There were a number of biochemical differences among the alleles, including thermal stability and substrate-binding affinity (Km) (Watt 1977, 1983). The most striking result from these biochemical measurements was that homodimeric enzymes showed a trade-off between thermal stability and enzyme kinetics, whereas heterodimeric enzymes did not (Watt 1977, 1983). Thus, heterodimeric enzymes, with one allele optimized for stability and the other for catalytic efficiency, functioned better than homozygotes over a wide range of temperatures. Watt et al. (1996) found that PGI enzymes in C. meadii, although unique in origin and sequence, show trade-offs between thermal stability and kinetics similar to those in C. eurytheme enzymes.
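The comparison of heterozygote frequencies between age classes can be illustrated with a short calculation. The following is a minimal sketch, not the analysis used by Watt (1977): it computes the observed heterozygote fraction and the Hardy-Weinberg expectation (one minus the sum of squared allele frequencies) for two age classes. The genotype counts, the allele labels and the function names are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def heterozygote_fraction(genotype_counts):
    """Observed fraction of heterozygous individuals.

    genotype_counts maps (allele_a, allele_b) tuples to individual counts.
    """
    total = sum(genotype_counts.values())
    het = sum(n for (a, b), n in genotype_counts.items() if a != b)
    return het / total


def expected_heterozygosity(genotype_counts):
    """Hardy-Weinberg expected heterozygote fraction: 1 - sum(p_i ** 2)."""
    allele_counts = {}
    for (a, b), n in genotype_counts.items():
        # Each individual contributes one copy of each of its two alleles.
        allele_counts[a] = allele_counts.get(a, 0) + n
        allele_counts[b] = allele_counts.get(b, 0) + n
    total_alleles = sum(allele_counts.values())
    return 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())


def heterozygote_excess(genotype_counts):
    """Observed minus expected heterozygote fraction (positive = excess)."""
    return heterozygote_fraction(genotype_counts) - expected_heterozygosity(genotype_counts)


# Hypothetical PGI genotype counts for two age classes (illustration only).
young = {("3", "3"): 30, ("3", "4"): 40, ("4", "4"): 30}
old = {("3", "3"): 20, ("3", "4"): 60, ("4", "4"): 20}

if __name__ == "__main__":
    for label, counts in (("young", young), ("old", old)):
        print(label, round(heterozygote_excess(counts), 3))
```

With these invented counts, allele frequencies are the same in both age classes, but the older class carries more heterozygotes than expected, which is the kind of pattern that would point to differential survival of genotypes. A real analysis would also need a statistical test and sample sizes drawn from field data.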
These remarkable differences at the biochemical level generated clear predictions about how these alleles might affect whole animal physiology and performance in the wild. Watt predicted that heterozygotes should be able to fly at a wider range of environmental temperatures because their metabolic pathways would be able to function well across a range of temperatures. Indeed, C. p. eriphyle, C. eurytheme and C. meadii heterozygotes were able to fly earlier in the day (when it is cold), and fly for a longer overall time each day (Watt 1983; Watt et al. 1983). These differences in performance were also hypothesized to affect fitness components that depend on capacity for flight or thermal tolerance, such as mating success and/or fecundity. As predicted, survival during heat stress in C. p. eriphyle, male mating success in C. p. eriphyle, C. eurytheme and C. meadii, and female fecundity in C. p. eriphyle were all highest for heterozygous butterflies; thus, heterozygotes had a greater net fitness (Watt 1983, 1992; Watt et al. 1983, 1985, 1996, 2003; Carter & Watt 1988).
Phosphoglucose isomerase alleles have now been sequenced from C. eurytheme and C. meadii, and there are multiple amino acid changes among and within EM classes and among species that display evidence of evolution via natural selection (Wheat et al. 2006). While the exact mutation(s) underlying differences in thermal adaptation remain unknown for Colias, the most promising candidates lie in the region of PGI’s tertiary protein structure that links the two monomers to form a functional enzyme. Interestingly, the exact sites mutated vary from species to species, but occur in the same protein region (Wheat et al. 2006; Wang et al. 2009).
Similar evidence for the effects of PGI genotype on phenotype and performance has been found in the Glanville fritillary butterfly (Melitaea cinxia). M. cinxia heterozygote individuals have a higher body temperature, flight metabolic rate and dispersal distance at colder temperatures, which results in higher fecundity (e.g. Haag et al. 2005; Niitepõld et al. 2009; Saastamoinen & Hanski 2008) and population growth (Hanski & Saccheri 2006). Heterozygotes also have increased survival (Orsini et al. 2009) and a longer lifespan (Saastamoinen et al. 2009). The impacts of PGI genotype on thermal adaptation are not limited to butterflies. For example, there is evidence for local adaptation of PGI alleles to temperature in the willow beetle (Chrysomela aeneicollis) (Dahlhoff & Rank 2000) and the sea anemone Metridium senile (Zamer & Hoffmann 1989). These studies, in combination with strong empirical evidence (i.e. measuring genotype frequencies across life history stages, measuring fitness components, and testing for genetic signatures of selection) from Colias butterflies, support the hypothesis that PGI evolves by natural selection in Colias spp., and is a gene with major effects.
Voltage-gated sodium channel (Nav1.4), poison resistance and locomotion in garter snakes
Garter snakes (Thamnophis sirtalis) feed on rough-skinned newts (Taricha granulosa) in the regions of western North America where these two species overlap. To defend themselves from predators the newts contain a toxin, tetrodotoxin (TTX), in their skin (Wakely et al. 1966; Brodie et al. 1974). TTX is a very potent neurotoxin, which binds to, and blocks, the outer pore of voltage-gated sodium channels (Nav) in neurons and muscles. At the cellular level, blocking these channels inhibits the initiation of action potentials, which are necessary for nerve and muscle function. When even minute amounts of TTX are ingested, muscles become paralyzed and poisoned animals usually die by suffocation (reviewed by Soong & Venkatesh 2006). These devastating consequences of ingesting TTX are expected to strongly select for the evolution of TTX resistance, and as predicted, garter snakes from newt-eating populations have been shown to have greater resistance to TTX (Brodie & Brodie 1990; Brodie et al. 2002). As well, variation in TTX levels in newts is geographically correlated with levels of resistance in snake populations (Brodie & Brodie 1991; Hanifin et al. 1999), making this system a classic example of a co-evolutionary ‘arms-race’ (Brodie & Brodie 1999; Brodie et al. 2002).
Resistance to TTX was originally measured by injecting snakes with TTX and then testing muscle contraction ability via crawling, a performance trait important for escape from predators and prey capture (Brodie & Brodie 1990). TTX resistance, measured using this performance trait, did not vary when young snakes were repeatedly injected with TTX (Ridenhour et al. 1999) or among laboratory-reared and field-caught snakes (Ridenhour et al. 2004), which suggested that TTX resistance had a genetic basis and was not dependent on environmental factors. Crawling performance, a whole-animal performance measure of TTX resistance, was strongly correlated with a cellular measure of resistance: the ability of action potentials to propagate when an animal was exposed to TTX (Geffeney et al. 2002). A priori knowledge of the mechanism of action of TTX (i.e. that it binds to sodium channels) suggested that this resistance might be mediated at the biochemical level by the presence of TTX-resistant sodium channels (reviewed by Soong & Venkatesh 2006). Geffeney et al. (2005) tested this hypothesis by looking for sequence variation in the voltage-gated sodium channel gene, Nav1.4 (the isoform expressed in muscle), between resistant and susceptible garter snakes and assessing the impacts of variants on protein function in vitro. Knowledge of the structural interaction between TTX and the outer pore of the Nav1.4 channel allowed for clear predictions about the location of these mutations in the protein sequence. In vitro assays of TTX binding to the sodium channel (Nav1.4) demonstrated that a single mutation, found in all resistant populations, could decrease TTX binding to Nav1.4, and that the 1–3 additional non-synonymous changes found in the most resistant populations further decreased TTX binding and increased resistance (Geffeney et al. 2005).
These data suggest that a great deal of the variation in TTX resistance in garter snakes can be explained by the four amino acid changes in the outer pore of the Nav1.4 channel, and are consistent with biochemical knowledge of this sodium channel (Fig. 2).
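The link between reduced TTX binding and retained channel function can be made concrete with a simple binding calculation. The sketch below assumes a one-site equilibrium binding model, in which the fraction of channels left unblocked at a toxin concentration [TTX] is Kd / (Kd + [TTX]). This model is an illustrative assumption rather than the assay analysis of Geffeney et al. (2005), and the dissociation constants and labels are hypothetical values chosen only to show the direction of the effect.

```python
def fraction_unblocked(ttx_conc_nm, kd_nm):
    """Fraction of sodium channels not bound by toxin.

    Assumes one-site equilibrium binding: unblocked = Kd / (Kd + [TTX]).
    Both arguments are in nanomolar.
    """
    if ttx_conc_nm < 0:
        raise ValueError("toxin concentration must be non-negative")
    if kd_nm <= 0:
        raise ValueError("dissociation constant must be positive")
    return kd_nm / (kd_nm + ttx_conc_nm)


# Hypothetical dissociation constants (nM); a weaker-binding channel has a larger Kd.
HYPOTHETICAL_KD_NM = {
    "susceptible": 10.0,
    "one pore mutation": 100.0,
    "several pore mutations": 10000.0,
}

if __name__ == "__main__":
    ttx_nm = 100.0  # hypothetical toxin concentration
    for label, kd in HYPOTHETICAL_KD_NM.items():
        print(f"{label}: {fraction_unblocked(ttx_nm, kd):.3f} of channels unblocked")
```

Under this assumed model, each mutation that weakens binding (raises Kd) leaves a larger share of channels available to carry current at the same toxin dose, which is one way to picture how pore mutations could translate into resistance at the level of muscle function.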
Feldman et al. (2009) have recently expanded this work to examine two congeners of T. sirtalis, Thamnophis atratus and Thamnophis couchii, that also contain TTX-resistant populations. They found that SNPs in the protein regions which form the outer pore of the Nav1.4 channel also correlate with TTX resistance in these species. However, the specific mutations that confer resistance varied among species, suggesting that resistant alleles have evolved independently (Feldman et al. 2009), and thus represent a case of convergent evolution at the nucleotide level with parallel evolution at higher levels of organization. Polymorphisms in the sodium channel gene also underlie resistance to a structurally and functionally similar neurotoxin, saxitoxin, in a wild clam population (Bricelj et al. 2005).
Lessons learned from these examples
The examples discussed above, and listed in Table S1, provide a number of valuable lessons. The first and over-riding lesson is that mechanistic knowledge can be used to generate testable hypotheses about the effects of a particular genetic polymorphism at higher levels of biological organization. Two of the examples (i.e. LDH and PGI) discussed in detail above started at the biochemical level and worked ‘up’ to cellular phenotypes, organismal phenotypes and fitness, but in principle, such hypotheses could be generated beginning at any level in the biological hierarchy. The second important lesson is that a purely mechanistic approach has its limits. In particular, this approach has the potential to bias the search for genes underlying ecologically important traits towards well-understood biochemical pathways and miss other genes affecting fitness. The incorporation of a top-down approach, first elucidating phenotype-environment associations followed by complementary marker-based approaches (e.g. quantitative trait locus (QTL) mapping, linkage disequilibrium mapping and/or genome scans), is likely to reduce the impacts of this ascertainment bias (e.g. Rogers & Bernatchez 2007; Whiteley et al. 2008), and detect other loci underlying a trait of interest. Once these loci are detected, available mechanistic knowledge can give insight into the molecular basis of genetic interactions, if present. For example, knowledge of the genes that underlie ecologically important differences in flowering time in A. thaliana (e.g. Ehrenreich et al. 2009; Flowers et al. 2009) coupled with extensive knowledge about the biochemical pathways underlying flowering time has guided studies on the epistatic interactions among loci, such as the interactions between flowering locus C (FLC) and FRIGIDA (FRI) genotype (Caicedo et al. 2004; Michaels & Amasino 2001; reviewed by Ehrenreich & Purugganan 2006; Mitchell-Olds & Schmitt 2006; see Table S1 for further references).
The third important lesson is that it is critical to ensure that experimental conditions are as ecologically relevant as possible when assessing the effects of genetic variation on organismal phenotypes and fitness (discussed by Ungerer et al. 2008). For example, differences in PGI and LDH function were only seen when these enzymes, and animals, were studied at certain temperatures (e.g. Watt 1977, 1983; DiMichele & Powers 1982b).
The final and perhaps most critical lesson from these examples is that isolating the impacts of a single gene in natural populations with high background genetic variation (i.e. variation at other loci throughout the genome) is necessary to firmly establish the causal link between gene, phenotype and fitness. This is clearly seen in the case of LDH-B in Fundulus, in which the effects of LDH-B genotype vary widely depending on the genetic background in which the alleles are tested. Performing experiments without controlling for genetic background can reduce the power to infer causal relationships between genotype and phenotype, or even result in false conclusions (discussed by Dean & Thornton 2007). Thus, in the section below, we explore some of the available methods for controlling for background genetic variation in studies attempting to link genetic variation at a candidate gene to effects on phenotypes, performance and fitness.