Keywords:

  • coalescent;
  • FST;
  • gene–environment association;
  • local adaptation;
  • metapopulation;
  • population differentiation

Abstract

  1. Top of page
  2. Abstract
  3. I. Introduction
  4. II. Effective population sizes, genetic drift and migration
  5. III. Population differentiation, and how best to measure it
  6. IV. FST as a basis of inferring local adaptation: neutral genes vs phenotypes
  7. V. Inferring local adaptation: neutral vs selected genes
  8. VI. Effects of subdivision on inbreeding and inbreeding depression
  9. VII. Current technologies – from genome sequencing to RAD-tags
  10. VIII. Whither now – new wine in old skins?
  11. IX. Conclusions
  12. Acknowledgements
  13. References

Summary

Research into the evolution of subdivided plant populations has long involved the study of phenotypic variation across plant geographic ranges and the genetic details underlying that variation. Genetic polymorphism at different marker loci has also allowed us to infer the long- and short-term histories of gene flow within and among populations, including range expansions and colonization–extinction dynamics. However, the advent of affordable genome-wide sequences for large numbers of individuals is opening up new possibilities for the study of subdivided populations. In this review, we consider what the new tools and technologies may allow us to do. In particular, we encourage researchers to look beyond the description of variation and to use genomic tools to address new hypotheses, or old ones afresh. Because subdivided plant populations are complex structures, we caution researchers against adopting simplistic interpretations of their data, and urge them to consider the patterns they observe in terms of the population genetic processes that have given rise to them; here, the genealogical framework of the coalescent will continue to be conceptually and analytically useful.

I. Introduction

Plant populations are often separated from one another by areas of unsuitable habitat over which migration and gene flow are limited. Even populations that occupy apparently homogeneous habitat over large areas can be structured because of limited dispersal and local mating (Neigel, 1997). Groups of individuals occupying different parts of a species range can thus end up evolving relatively independently of one another under the influence of drift and local selection. It is becoming increasingly clear that, even within continuous populations, environmental heterogeneity can bring about fine-scaled genetic structure with the evolution of local adaptation (Audigeos et al., 2013).

The extent to which plant populations are genetically divergent depends on the balance between processes that drive them apart and those that homogenize them. For neutral loci, surprisingly small amounts of gene flow can prevent much genetic divergence between demographically stable populations (Wright, 1931; Slatkin, 1985, 1987; Hartl & Clark, 1997). For loci under selection, genetic divergence can be restricted or enhanced, relative to the neutral case, depending on whether different phenotypes are selected in different populations or whether the same phenotype is selected globally. An important corollary is that, because the action of selection is expected to vary among loci, the magnitude of population divergence will be locus specific. This complicates how we describe population structure, but the differences in population structure among loci provide us with a powerful means to tease apart the effects of natural selection from drift and the effects of demographic processes or history.

Variation in divergence among loci reminds us not only that plant populations are fragmented geographically, but also that the genome is fundamentally fragmented. As Darlington (quoted in Lewontin, 1980) put it, ‘the really important small populations are the little bits of chromosomes that are populations within which recombination cannot occur’. Here, Darlington was referring to inverted chromosomal regions in which recombination is completely suppressed, so that there are effectively two different populations of genes at the same locus that do not mix by recombination. However, even loci in genomic regions that continue to recombine may be in gametic disequilibrium, that is, may be associated with each other non-randomly. Whether locus-specific divergence evolves in response to selection for local adaptation will thus often depend on genetic correlations among traits (Etterson & Shaw, 2001).

With rapid advances in sequencing technologies and high-throughput data analysis, subdivided plant populations are providing new opportunities to study the evolutionary forces influencing genetic divergence across the genome. Population genetic studies have benefited from the use of an increasing number of individuals and loci (Lascoux & Petit, 2010). Guichoux et al. (2011) recently identified c. 8000 published population genetic analyses utilizing simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) in 2009 alone, many assessing the presence and consequences of population subdivision. Continued advances in second-generation sequencing technologies and analytical methods promise to accelerate these trends (Wang & Hey, 2010; Cao et al., 2011). The time is ripe to consider what these studies can tell us about the evolution of plants across their fragmented landscapes, and what sorts of questions we might now address.

In this article, we review advances in our understanding of the evolution of subdivided plant populations from a conceptual point of view, beginning with a non-technical discussion of effective population sizes, migration and the characterization and interpretation of genetic structure, which has often been measured in terms of the statistic FST. We discuss the utility of FST as a basis for inferring the demographic and selective history of populations, highlight new studies that are moving beyond the use of FST to infer a population's evolutionary history, and ask how new data, and the new ways of dealing with it, allow us to understand the distribution of genetic variation across subdivided plant species.

II. Effective population sizes, genetic drift and migration

The effective population size Ne is a parameter that enters into many expressions in population genetics, often as a product with other parameters, such as the absolute mutation rate (Neu), the absolute migration rate (Nem), the selection coefficient (Nes) and the recombination rate (Ner). It is thus an important scaling parameter for the evolutionary forces that affect a population's evolution. Ne is sometimes loosely defined in terms of the number of individuals contributing genetically to future generations. But, how far into the future, and contributing what? Although the number of breeding individuals will influence Ne, it is more useful to think of the effective population size as a parameter that determines the extent to which the population is subject to genetic drift: the smaller Ne, the greater the extent to which genetic measures that interest us will be affected by drift. With this in mind, it should be no surprise that there are several effective sizes, each addressing a different effect of drift. The inbreeding effective size determines how quickly populations become inbred through simple random mating (relatives are more likely to mate with one another in populations with small Ne); the variance effective size determines the extent to which allele frequencies fluctuate from generation to generation (allele frequencies fluctuate more in populations with small Ne); the mutation effective size determines how quickly the genetic diversity at a particular locus should equilibrate to a new drift–mutation equilibrium (populations with small Ne maintain less diversity). With the focus on drift, it should also be clear why Ne is locus specific: loci under selection (or those linked to them) will be subject to different fluctuations in frequency from those unaffected by selection.
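
The link between Ne and the variance effective size can be illustrated with a minimal simulation sketch. It assumes an idealized diploid Wright–Fisher population with no selection, mutation or migration; the function names and default values are illustrative choices, not taken from any published analysis.

```python
import random


def drift_final_frequency(n_e, p0, generations, rng):
    """Allele frequency after `generations` of pure drift (diploid Wright-Fisher)."""
    copies = 2 * n_e  # number of gene copies in a diploid population of size n_e
    p = p0
    for _ in range(generations):
        # Each gene copy in the next generation is drawn at random from the current pool.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
    return p


def variance_among_replicates(n_e, p0=0.5, generations=5, replicates=300, seed=1):
    """Sample variance of the final allele frequency across independent replicates."""
    rng = random.Random(seed)
    finals = [drift_final_frequency(n_e, p0, generations, rng) for _ in range(replicates)]
    mean = sum(finals) / replicates
    return sum((x - mean) ** 2 for x in finals) / (replicates - 1)


def expected_variance(n_e, p0=0.5, generations=5):
    """Standard expectation under pure drift: p0(1 - p0)[1 - (1 - 1/(2Ne))^t]."""
    return p0 * (1 - p0) * (1 - (1 - 1 / (2 * n_e)) ** generations)


if __name__ == "__main__":
    for n_e in (10, 200):
        print(n_e, variance_among_replicates(n_e), expected_variance(n_e))
```

Running the sketch for a small and a large population shows allele frequencies scattering far more widely among replicates when Ne is small, which is the sense in which Ne scales the strength of drift.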

The effect of drift on genetic diversity is complicated by population subdivision. To see this, it is helpful to consider drift from the perspective of the coalescent, that is, by tracing lineages backwards in time through their genealogy (Hudson, 1990; Hein et al., 2005; Wakeley, 2009; Fig. 1). In a population subdivided into small populations (or demes) linked by little migration, individuals in each deme will be more closely related to one another, on average, than to individuals in other demes. Thus, the initial coalescence of lineages sampled from the same deme will tend to be rapid, occurring at a rate determined by the local inbreeding effective size of the deme (e.g. all the coalescence events among lineages sampled in deme 1 in Fig. 1). Nevertheless, as long as the migration rate into the sampled deme is not zero, there is a chance that one or more of the lineages in our sample are recent migrants (e.g. lineages sampled in deme 2 of Fig. 1); in this case, tracing back to the common ancestor of our sample requires us to follow lineages until they find themselves in the same deme again, at which point they will coalesce at a rate given by the inbreeding effective size of that deme (Slatkin, 1991). There are thus two phases to the coalescent in a subdivided population, each pointing to a different effective population size: the local inbreeding effective size (the short-term rate of coalescence) and the eigenvalue effective size of the whole metapopulation (the long-term rate of coalescence, which will be inversely related to the migration rate). These two phases have been termed the ‘scattering phase’ (referring to the migration of lineages out of sampled demes as one traces their ancestry into the past) and the ‘collecting phase’ (the migration of lineages back to the same demes before their ultimate coalescence) by Wakeley (2000, 2001). 
The two phases can, in principle, be discerned by ‘skyline’ plots (Strimmer & Pybus, 2001) of the consecutive waiting times until coalescent events for the whole sample (scaled appropriately; Fig. 1), or by assessing the frequency spectrum of nucleotide differences for sequences sampled from the individuals concerned (cf. Fig. 2).
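
The two phases can be made concrete with a small simulation sketch of the time to coalescence for a pair of lineages. It assumes a symmetric island model of `d` demes, each of `n_deme` diploid individuals, with per-generation migration rate `m`, and uses a continuous-time approximation that is reasonable only for small `m` and large demes. All names and parameter values are hypothetical.

```python
import random


def pair_coalescence_time(n_deme, m, d, same_deme, rng):
    """Time (in generations) until two lineages coalesce in a symmetric island model."""
    t = 0.0
    together = same_deme
    while True:
        if together:
            coal_rate = 1.0 / (2 * n_deme)  # coalescence among 2N gene copies in a deme
            leave_rate = 2.0 * m            # either lineage traces back to another deme
            total = coal_rate + leave_rate
            t += rng.expovariate(total)
            if rng.random() < coal_rate / total:
                return t                    # local coalescence (scattering phase)
            together = False
        else:
            # Collecting phase: wait until one lineage enters the other's deme.
            t += rng.expovariate(2.0 * m / (d - 1))
            together = True


def mean_pair_time(n_deme, m, d, same_deme, replicates=20000, seed=42):
    """Average coalescence time over many simulated pairs."""
    rng = random.Random(seed)
    total = sum(pair_coalescence_time(n_deme, m, d, same_deme, rng)
                for _ in range(replicates))
    return total / replicates


if __name__ == "__main__":
    print("same deme:     ", mean_pair_time(50, 0.01, 10, True))
    print("different deme:", mean_pair_time(50, 0.01, 10, False))
```

Under these assumptions the standard expectations are 2Nd generations for a pair sampled from the same deme and 2Nd + (d − 1)/(2m) for a pair sampled from different demes. Pairs from the same deme either coalesce quickly or escape into the much slower collecting phase, which produces the two-phase pattern described above.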

Figure 1. The coalescent for a subdivided population. (a) The diagram depicts the genealogy of lineages sampled from three different demes. In demes 1 and 3, all locally sampled lineages coalesce with each other at a rate given by the local inbreeding effective size of the deme. The simultaneous coalescence of multiple lineages in deme 3 would suggest a local population bottleneck caused, for example, by a colonization event. In deme 2, three of the four lineages coalesce locally, but the fourth lineage migrated into the deme from elsewhere. The waiting time to the final coalescent event with this lineage is determined by the eigenvalue effective size of the whole metapopulation, in the collecting phase of the structured coalescent. (b) A ‘skyline plot’ of simulated coalescent events for a subdivided population, showing: first, the scattering phase, where coalescence occurs at a rate governed by the local inbreeding effective sizes of the demes from which more than one lineage was sampled; and second, the collecting phase, where coalescent events occur at a rate determined by the metapopulation effective size, which is strongly influenced by the migration rate. Under strong migration, the two phases become one. (Graph modified from Pannell, 2003.)

Figure 2. Coalescence in single populations that have gone through (a) a severe bottleneck and (b) a milder bottleneck. In a severe bottleneck, coalescence will be rapid during the bottleneck, with all lineages coalescing in rapid succession (or even together, as shown). Here, we are likely to find a single peak in the site frequency spectrum for sequences from a sample (right). In a mild population bottleneck, some of the lineages will coalesce during the bottleneck, but those that do not may take much longer to coalesce. In this case, we expect to see two peaks in the site frequency spectrum, in a pattern resembling the coalescence events in a structured coalescent.

Although patterns in DNA sequence diversity provide a potentially powerful means of reading the history of a sample and the population from which it was drawn, quite different processes can give rise to very similar patterns. For example, moderate genetic bottlenecks of single populations can distort the shape of the coalescent in ways similar to the effect of population subdivision (Fig. 2): the rate of coalescence is increased during a bottleneck, but lineages that fail to coalesce during the bottleneck may require extended periods to coalesce before it, similar to the long coalescent times observed for lineages sampled from different populations. Some of the complexities of the structured coalescent can be avoided by sampling just a single individual from each deme, although this will limit the conclusions that might be drawn about population subdivision itself (Wright & Gaut, 2005). (In principle, it is possible to detect a genetic signature of population subdivision by sampling individuals from only a single deme if the sample includes genes that share no local common ancestor, because their ancestors migrated into the sampled deme.)

In species in which deme sizes vary over time, especially in the extreme case in which demes become extinct and are later recolonized (a ‘metapopulation’), the effective population size can be dramatically reduced below the total metapopulation census size, depending on the migration rate (Maruyama & Kimura, 1980; Gilpin, 1991; Whitlock & Barton, 1997; Pannell & Charlesworth, 1999; Wakeley, 2001; Wakeley & Aliacar, 2001). In a metapopulation in which the extinction–recolonization rate exceeds the migration rate, a lower effective population size is expected, and we should see reduced diversity both within demes and in the species as a whole (Slatkin, 1977; Whitlock & Barton, 1997; Pannell & Charlesworth, 2000). These predictions are nicely illustrated by comparisons of the genetic structure for maternally vs bi-parentally inherited genes: the former may retain the signature of colonization if seed dispersal is limited, whereas genetic structure for the latter is eroded by pollen dispersal. Such contrasting patterns have been found in both herbs (e.g. McCauley, 1994, 1997, 1998; De Cauwer et al., 2010) and trees (e.g. Petit et al., 1997).

The above rules of thumb are useful, but they belie the potential demographic complexities of a metapopulation. Even the simplest models that assume demes of similar size and extinction probability include several parameters that all have an important effect on the effective size and genetic diversity, including the migration and extinction rates, the number of demes and their sizes, the number of founding individuals and the extent to which they come from the same source deme or a mix of different demes (Wade & McCauley, 1988; Whitlock & McCauley, 1990; Pannell & Charlesworth, 2000; Pannell, 2003). Because different processes can affect particular summary statistics in similar ways, our challenge is to devise sampling and analysis that allow us to distinguish them, for example, to move beyond the use of single statistics that only summarize part of the pattern.

III. Population differentiation, and how best to measure it

Because evolutionary dynamics will depend upon past, current and future population genetic structure of a species (Meirmans, 2012), identification of the magnitude of population genetic differentiation has been a central component of biological research. Population differentiation across the genome as a whole may indicate low historical levels of gene flow, whereas differences in population differentiation among traits or loci may point to a history of divergent selection and local adaptation (see next section). Measures of population differentiation are thus arguably more interesting than are estimates of the effective population size of the species, and are more straightforward to estimate.

FST is the most familiar and widely employed measure of population genetic differentiation, not least because it is embedded so deeply in the theoretical population genetics literature (Guo et al., 2009; Whitlock, 2011; Rousset, 2013). Nevertheless, its use and interpretation need care, not least because FST has been defined and derived in ways that are equivalent only under certain assumptions. FST was first introduced by Sewall Wright as an index of inbreeding, to capture the notion that individuals in the same deme tend to be more closely related to one another than to individuals of other demes, and that dispersal or migration among demes will tend to break up these relationships. Because all individual gene copies trace back to a common ancestor (i.e. all individuals are ultimately related), it is natural that an index of inbreeding imposed by population structure should incorporate estimates of co-ancestry within populations and in the population as a whole, as FST does. We should also expect FST, as an inbreeding coefficient, to be expressible in terms of within-deme and species-wide coalescence times. Indeed, Slatkin (1991) showed that

$$F_{ST} = \frac{t_T - t_S}{t_T}$$

where tS and tT are the expected times to coalescence of two genes sampled from the same population and from the global population, respectively; an estimate of this parameter will be accurate when the mutation rate is low.
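
As a hedged illustration of this coalescent view, the sketch below evaluates FST = (tT − tS)/tT from expected pairwise coalescence times. It assumes a symmetric island model with `d` demes of `n_deme` diploid individuals and migration rate `m`, and uses the standard expectations for that model. The function names are illustrative.

```python
def island_coalescence_times(n_deme, m, d):
    """Expected pairwise coalescence times (generations) in a symmetric island model."""
    t_within = 2.0 * n_deme * d                  # two genes from the same deme
    t_between = t_within + (d - 1) / (2.0 * m)   # two genes from different demes
    return t_within, t_between


def fst_from_coalescence_times(n_deme, m, d):
    """F_ST = (t_T - t_S) / t_T, with t_T for two genes drawn from the whole population."""
    t_s, t_between = island_coalescence_times(n_deme, m, d)
    # Two genes drawn at random from the whole population share a deme with probability 1/d.
    t_t = t_s / d + (d - 1) / d * t_between
    return (t_t - t_s) / t_t


if __name__ == "__main__":
    for d in (2, 10, 1000):
        print(d, fst_from_coalescence_times(n_deme=50, m=0.01, d=d))
```

No mutation rate enters the calculation; only deme size, deme number and migration shape the result. As the number of demes grows, the value approaches 1/(1 + 4Nm).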

Whitlock (2011) has usefully referred to this inbreeding, coalescent, perspective of FST as FST,coal to emphasize its distinction from other definitions; Rousset (2013) has labeled it CST. Importantly, FST,coal is defined without reference to gene diversity and is independent of the mutation rate; it depends only on processes that affect the shape of the underlying genealogy, such as dispersal and population size (including fluctuations caused by bottlenecks and expansions). For demographic and evolutionary inference, it thus tends, ultimately, to be FST,coal that we wish to estimate, although it is less useful if we wish to identify demes of especially distinctive allele composition at loci of particular interest for conservation purposes, in which case Jost's D might be more useful (Jost, 2008).

Of course, we can rarely know the history of co-ancestry for a population and thus have to turn to genetic markers for help. This is where difficulties with FST arise, because the distribution of variation at genetic marker loci depends not only on the demographic processes of drift and migration, but also on mutation. For loci with two alleles, FST was defined by Wright (1943) in terms of the variation among populations in allele frequencies. Later, Nei (1973) derived an expression for FST, which he termed GST, that is applicable to loci with multiple alleles. A proper multi-allelic estimate of FST based on allelic variance was introduced by Weir & Cockerham (1984). GST is roughly equivalent to FST as an expression of the variance of allele frequencies; it also approximates FST,coal (and thus the genealogical structure of the population at the sampled loci, as determined by drift and migration) as long as the mutation rate is low.

Because FST is a relative measure of diversity, it presents a number of difficulties for interpretation. First, processes that reduce Ne locally, including background selection in non-recombining regions of the genome or inbreeding, will necessarily lead to high FST at these loci (Charlesworth, 1998). As discussed below, this is particularly relevant for the interpretation of patterns of variation in FST across the genome. In cases in which genetic diversity has been reduced by drift locally, Charlesworth (1998) has thus warned against relative measures of differentiation, such as FST, for genomic regions with different levels of recombination or different mating systems (and thus potentially different Ne) and has argued in favor of the use of absolute measures of differentiation, such as the absolute difference between within- and between-population diversity. A second difficulty occurs when FST is calculated for loci with high allelic diversity, where the upper bound for FST is substantially below one, such that populations that are strongly differentiated (e.g. they share few alleles) can have low FST values (Jost, 2008). In general, FST can thus be unhelpful as a measure of genetic differentiation, particularly when one is keen to compare different species or different loci that show markedly different levels of genetic diversity (and for which the upper bound for the differentiation statistic differs as a result; Charlesworth, 1998; Hedrick, 2005; Jost, 2008). A related difficulty arises when FST is based on loci with high mutation rates, which potentially obscure the genealogical relationships among individuals because of homoplasy, so that FST fails to reflect the coalescent history of the sample and the demographic processes influencing it. Microsatellites are much more prone to this problem than, for example, SNPs.

New measures of population differentiation have been proposed to address the problem of high allelic diversity, the relative merits of which have been discussed at length (Ryman & Leimar, 2009; Whitlock, 2011; Wang, 2012; Rousset, 2013). Hedrick (2005) introduced a standardized measure, G′ST, obtained by dividing GST by its maximum possible value for the observed allele frequencies globally, and Jost (2008) proposed a new measure of population differentiation, D, that has the intuitively appealing property of reaching a maximum when each allele is private to a single population (Jakobsson et al., 2013); new software has been developed to estimate these alternatives to FST (Crawford, 2010; Meirmans & Hedrick, 2011; Winter, 2012). Nevertheless, FST has the advantage over indices such as D that it is a well-defined parameter that is connected to the theoretical literature, allowing demographic and evolutionary inference (Whitlock, 2011; Rousset, 2013).
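
The contrast between these measures at a highly diverse locus can be sketched numerically. The code below computes GST as (HT − HS)/HT, Hedrick's standardized value as GST divided by its maximum (n − 1)(1 − HS)/(n − 1 + HS), and Jost's D as [(HT − HS)/(1 − HS)]·[n/(n − 1)], for n equally weighted demes. It works on allele frequencies treated as known parameters, with no sampling corrections, and the function and variable names are illustrative rather than those of any published package.

```python
def differentiation_stats(freqs_by_deme):
    """Return (G_ST, G'_ST, D) for one locus.

    freqs_by_deme: list of dicts mapping allele -> frequency, one dict per deme.
    Demes are weighted equally; frequencies are treated as known, not estimated.
    """
    n = len(freqs_by_deme)
    alleles = set().union(*freqs_by_deme)
    # Mean expected heterozygosity within demes.
    h_s = sum(1.0 - sum(p * p for p in deme.values()) for deme in freqs_by_deme) / n
    # Expected heterozygosity of the pooled population.
    mean_freq = [sum(deme.get(a, 0.0) for deme in freqs_by_deme) / n for a in alleles]
    h_t = 1.0 - sum(p * p for p in mean_freq)
    g_st = (h_t - h_s) / h_t
    g_st_max = (n - 1) * (1.0 - h_s) / (n - 1 + h_s)
    jost_d = (h_t - h_s) / (1.0 - h_s) * n / (n - 1)
    return g_st, g_st / g_st_max, jost_d


if __name__ == "__main__":
    # Two demes that share no alleles, each with ten equally common private alleles.
    deme_a = {f"a{i}": 0.1 for i in range(10)}
    deme_b = {f"b{i}": 0.1 for i in range(10)}
    print(differentiation_stats([deme_a, deme_b]))
```

In this hypothetical case the two demes share no alleles at all, yet GST is only about 0.05 because within-deme diversity is high, whereas both the standardized measure and D equal one.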

It is well known that FST can be related to the number of individuals migrating between subpopulations (Nm) according to the equation $F_{ST} \approx 1/(4Nm + 1)$ (Wright, 1931). As re-emphasized by Whitlock & McCauley (1999), the utility of this expression, however, requires not only that populations have reached drift–migration equilibrium, but also that all demes have the same constant size and equal migration rates. Such conditions are probably rarely met by real species. In a metapopulation, for example, rapid population turnover is predicted to increase genetic differentiation among demes (Wade & McCauley, 1988). Population turnover can be incorporated into models, but one quickly faces the problem of over-parameterization of models that are biologically plausible, even though adequate sampling can overcome some of the difficulties (see Städler et al., 2009).
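
For orientation only, Wright's island-model relationship FST ≈ 1/(4Nm + 1) can be inverted to give Nm ≈ (1/FST − 1)/4. The short sketch below does this; the function name is hypothetical, and the result is meaningful only under the equilibrium and symmetry assumptions listed above.

```python
def nm_from_fst(fst):
    """Island-model number of migrants per generation implied by a given F_ST."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    for fst in (0.01, 0.05, 0.1, 0.2):
        print(fst, nm_from_fst(fst))
```

Because Nm changes steeply at low FST, small errors in an FST estimate translate into large changes in the inferred number of migrants, which is a further reason to treat such conversions with caution.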

Estimators formalized around the so-called F-model may provide insights beyond those revealed by FST. The F-model, which is a likelihood-based approach that defines FST as a parameter of the full distribution of allele frequencies (Balding & Nichols, 1995; Nicholson et al., 2002; Gaggiotti & Foll, 2010; Karhunen & Ovaskainen, 2012; Bhatia et al., 2013), accommodates differences in population size and migration rates across a species range (Gaggiotti & Foll, 2010), and thus has advantages over FST, which estimates a single ‘global’ value of differentiation. Foll & Gaggiotti (2006) introduced a hierarchical formulation that uses population-specific measurements to obtain priors for FST, and then estimates the proportional contribution of population-specific drift and migration to characterize population genetic structure. For example, to estimate the contribution of extinction and recolonization to population structure, the F-model can be applied to a long-term dataset recording the size and spatial distribution of demes, together with a knowledge of their demographic history and age structure. Jay et al. (2012) recently applied an F-model-based approach to assess how the ecological characteristics of 20 alpine plant species determined each population's shared co-ancestry, allowing a prediction of how future climate change might alter the magnitude and distribution of population divergence.
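
The idea of treating FST as a parameter of the allele frequency distribution can be sketched with the beta formulation usually associated with Balding & Nichols (1995). In this formulation, the frequency of an allele in a deme is drawn from a beta distribution with mean equal to the ancestral frequency p and variance F·p(1 − p), where F is specific to the deme. This is a minimal illustration for a biallelic locus, not the hierarchical implementations cited above, and all names are hypothetical.

```python
import random


def sample_deme_frequency(p_ancestral, f_deme, rng):
    """Draw one deme's allele frequency under a beta F-model for a biallelic locus."""
    scale = (1.0 - f_deme) / f_deme
    # Beta(p*scale, (1-p)*scale) has mean p and variance f_deme * p * (1 - p).
    return rng.betavariate(p_ancestral * scale, (1.0 - p_ancestral) * scale)


def simulate_frequencies(p_ancestral, f_deme, n_loci=20000, seed=1):
    """Frequencies at many independent loci that share the same ancestral frequency."""
    rng = random.Random(seed)
    return [sample_deme_frequency(p_ancestral, f_deme, rng) for _ in range(n_loci)]


def mean_and_variance(values):
    """Sample mean and (unbiased) sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
    return mean, var


if __name__ == "__main__":
    for f in (0.02, 0.1, 0.3):  # e.g. a large, well-connected deme vs a small, isolated one
        print(f, mean_and_variance(simulate_frequencies(0.3, f)))
```

Giving each deme its own F is what lets this family of models represent demes that differ in size or isolation, rather than summarizing the whole species with a single global value.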

IV. FST as a basis of inferring local adaptation: neutral genes vs phenotypes

Individuals from different populations of a species often differ phenotypically. To discern whether such phenotypic variation is the result of drift vs natural selection, Spitze (1993) introduced the FST–QST comparison. QST is a phenotypic analog of FST: it estimates the proportion of additive genetic variance in the trait in question that lies among populations, calculated as the among-population genetic variance divided by the sum of the among-population variance and twice the additive genetic variance within populations. If trait divergence is solely a result of random processes, QST and FST should be similar; by contrast, QST > FST or QST < FST should reflect divergent or globally stabilizing selection, respectively. Yu et al. (2011) recently adopted this approach to understand the basis of variation in flower size and inflorescence variation in the dioecious herb Silene latifolia.
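
Written out, the verbal definition of QST corresponds to the expression below. The symbols are introduced here for illustration: σ²_B for the among-population genetic variance and σ²_W for the additive genetic variance within populations. The numerical values in the second line are hypothetical.

```latex
Q_{ST} = \frac{\sigma^2_{B}}{\sigma^2_{B} + 2\,\sigma^2_{W}}

% Hypothetical worked example: \sigma^2_{B} = 1 and \sigma^2_{W} = 2 give
Q_{ST} = \frac{1}{1 + 2 \times 2} = 0.2
```

A trait with this value would be judged against the neutral FST for the same populations: a QST of 0.2 suggests divergent selection if FST is about 0.05, but not if FST is also about 0.2.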

Silene latifolia is sexually dimorphic for a number of traits, the extent of which varies among populations (e.g. males produce smaller, more numerous, flowers than females; Steven et al., 2007; Delph et al., 2011). Yu et al. (2011) measured among- and within-population quantitative genetic variation and covariation in calyx width (the most sexually dimorphic trait), and compared this variation with global FST derived from microsatellite markers. Interestingly, the ratio of phenotypic to neutral genetic differentiation was 4.2 for males and only 0.4 for females, suggesting that selection on only one of the sexes may be largely responsible for the degree of among-population divergence in calyx width – or that selection has taken place on other traits that are genetically correlated with calyx width (Fig. 3). A multivariate form of this classic test, derived by Martin et al. (2008) and Chapuis et al. (2008), allows for the inclusion of such potential among-trait covariances.

Figure 3. Among-population sexual dimorphism in calyx width in dioecious Silene latifolia, probably resulting from differential selection predominantly on males. The figure shows the mean (± SE) calyx width of (a) female and (c) male flowers; (b) and (d) illustrate their differences. Additive genetic values were derived from controlled within- and among-population crosses in a glasshouse for genotypes from three populations (VIR, Giles County VA, USA; CRC, Cabo de Roca, Portugal; ZAG, Zagreb, Croatia). (a, c) Significant differences among means in calyx width are indicated with different letters (dams) or numbers (sires) above the means. Differences in FST–QST ratios between males and females suggest that males have been under stronger divergent selection for calyx width than females (ratio of 4.2, as opposed to 0.4 for females). In this study, the comparison between males and females helps to rule out the possibility that recent mutations might have influenced FST and QST differently, because the two sexes provide a control for each other. (Graphs from Yu et al., 2011, with permission; images courtesy of L. Delph.)

The direct FST–QST comparison is useful only when QST estimates the additive genetic component of phenotypic divergence between populations, and is not influenced by phenotypic plasticity. The measurement of traits in the field is thus problematic, because individuals from different populations may express different phenotypes in response to environmental cues (e.g. see Pujol et al., 2008; Whitlock & Guillaume, 2009). Ideally, plants need to be measured growing in a common garden or glasshouse, as in the study of S. latifolia by Yu et al. (2011). However, Antoniazza et al. (2010) proposed an approach by which the distribution of among-population variation in phenotypes measured in the field, or PST, might be substituted for QST. Conclusions from such an analysis must still remain somewhat limited, but the use of independently derived estimates of trait heritability may alleviate much of the concern presented by the PST approach (Antoniazza et al., 2010).

Because FST–QST comparisons clearly require estimates of FST, its limitations are also relevant here. For example, because FST is biased downward for loci with higher mutation rates, we ought to expect an enrichment of studies inferring divergent selection among populations based on microsatellites compared with isozymes (Edelaar et al., 2011). To deal with this issue, Edelaar et al. (2011) suggested the use of an estimator of neutral genetic divergence that corrects for molecular marker heterozygosity, such as G′ST or DEST (although theory that relates these statistics to QST still needs to be developed), or of estimators that are not affected by the mutation rate. Another solution is simply to avoid the problem by estimating FST using markers with lower mutation rates, such as SNPs or allozymes (Edelaar et al., 2011).

Ovaskainen et al. (2011) and Karhunen & Ovaskainen (2012) have recently taken a new perspective on the standard FST–QST comparison (Fig. 4). Their approach uses an extended F-model-based estimator, the admixture F-model (AFM), to construct a matrix of population co-ancestry, simultaneously disentangling the role of local drift and gene flow on the basis of population-specific deviations in allele frequencies. The parameters of the co-ancestry matrix can be used to estimate population means for quantitative traits under neutral expectations. Because the full matrix of population associations is accounted for, the effects of drift and selection can also be identified even if $Q_{ST} = F_{ST}$. This method, which bolsters the continued relevance of FST, will be ideal for understanding the role of neutral and selective processes in species range expansions and biological invasions (Fig. 4).

Figure 4. An application of the method of Ovaskainen et al. (2011) and Karhunen & Ovaskainen (2012) to a simulated dataset of three populations (colored symbols) measured for two traits undergoing directional selection. The inferred ancestral population mean for the simulated traits is located at ‘A’. Neutral genetic data were simulated for 18 microsatellite markers, with a global FST set to 0.10. Simulations assumed that the red population was under strong directional selection, the green population under weak selection and the black population under no selection. The ellipses depict the 50% probability sets for a given population under the effects of drift only. The results indicate that population-specific selection histories can be revealed by this approach, even for traits that are partially correlated. Simulations, as well as the determination of evidence for divergent selection, were conducted using R v. 2.15.2 (R Development Core Team, 2011).

V. Inferring local adaptation: neutral vs selected genes

Genetic differentiation between populations can be the result of differential selection among habitats or of genetic drift, potentially enhanced by demographic processes (Luikart et al., 2003). Because demographic processes affect all loci, whereas selection should affect only loci responsible for fitness or closely linked loci, comparisons among loci provide a potentially revealing way to distinguish between the two types of process. One useful approach involves comparison among loci for outliers in their FST against the full, observed distribution of FST across the genome (Lewontin & Krakauer, 1973; Beaumont & Nichols, 1996; Prunier et al., 2012). Outlier loci with unusually high FST are then candidates for genomic regions involved in local adaptation. Another idea is to generate a null distribution for genetic differentiation across the genome by simulating differentiation statistics for independent loci conditioned on the heterozygosity actually observed, for example, using a coalescent framework (Thornton & Jensen, 2007). This approach is particularly useful for cases in which the number of loci in a dataset is modest (i.e. where the data are insufficient to provide a robust distribution), but it has the drawback that incorrect demographic models can lead to the identification of incorrect loci (Oetjen & Reusch, 2007).
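
The outlier logic can be illustrated with a minimal sketch. It assumes biallelic loci, equal weighting of populations and a Nei-style FST = (HT - HS) / HT, and it simply flags loci above an empirical quantile of the observed distribution. It does not reproduce the simulation-based tests cited above, and the locus names and allele frequencies are hypothetical.

```python
def locus_fst(freqs):
    """Nei-style FST = (HT - HS) / HT for one biallelic locus.

    freqs: frequency of one allele in each population (equal weights assumed).
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    if h_total == 0.0:  # locus fixed everywhere: no differentiation to measure
        return 0.0
    return (h_total - h_within) / h_total


def fst_outliers(loci, quantile=0.95):
    """Names of loci whose FST is at or above the empirical quantile."""
    values = {name: locus_fst(freqs) for name, freqs in loci.items()}
    ranked = sorted(values.values())
    cutoff = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
    return [name for name, value in values.items() if value >= cutoff]


# Hypothetical allele frequencies in three populations; a real scan
# would use thousands of loci to define the genome-wide distribution.
example_loci = {
    "snp1": [0.50, 0.50, 0.50],
    "snp2": [0.40, 0.50, 0.60],
    "snp3": [0.10, 0.50, 0.90],
    "snp4": [0.30, 0.35, 0.25],
    "snp5": [0.70, 0.60, 0.65],
}
print(fst_outliers(example_loci))
```

An empirical cutoff of this kind always returns some loci, whether or not selection has acted, which is one reason why a demographically informed null distribution is preferable.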

It is important to recall that processes other than local adaptation can give rise to variation in FST across the genome. For instance, Charlesworth (1998) suggested that high FST observed in regions of low recombination in Drosophila could be accounted for by processes such as background selection and may have nothing to do with local adaptation. Similarly, in a recent study of patterns of genomic divergence among four species of sunflower, Renaut et al. (2013) found that genomic islands of high divergence did not correspond to lower effective gene flow (as might have been expected for loci under selection for local adaptation); rather, because such regions tended to also have low recombination rates, forces reducing Ne were more likely to be responsible.

Another general approach that contrasts with FST scans is to identify candidate loci by seeking associations between allele frequencies and particular habitats (or across environmental gradients), with comparisons made against the distribution of allele–habitat associations over the whole genome (Hedrick et al., 1976; Bierne et al., 2011). Genetic–environment correlations have been particularly useful in determining the co-variation of particular allelic variants and climatic variables, thereby pointing to possible genes responsible for adaptation to variation in temperature, moisture availability or variables that co-vary with latitude and longitude. A particularly revealing example is provided by Eckert et al.'s (2010) study of genetic–environment correlations across the species range of loblolly pine (Pinus taeda) in North America.

In their study of loblolly pine, Eckert et al. (2010) sought associations between genetic loci and environmental variation across the entire species range. By statistically removing correlations caused by co-ancestry and range expansions, the authors identified five loci that were correlated with aridity gradients. All five loci are known to have stress-related functions in Arabidopsis thaliana, but, revealingly, none of these loci were among the 24 loci identified from the same dataset through FST outlier analysis. More recently, Frichot et al. (2013) applied a similar approach to a subset of the same dataset, and identified genes associated with wound repair and immunity, photosynthetic activity, carotenoid biosynthesis, cellular respiration, carbohydrate metabolism, and responses to heat, salt and oxidative stress. The approach taken in these studies is powerful, because it is capable of revealing even small environmental correlations for loci, specified a priori, that are likely to be targets of selection along gradients. Selection on quantitative traits can bring about large changes in phenotypes as a result of only small changes in allele frequencies at many loci (Barton & Turelli, 1989), and it is satisfying that these sorts of associations can still be found with appropriate sampling.
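
The core of a genetic–environment association can be sketched as a correlation between population allele frequencies and an environmental variable, with loci then ranked against one another. This is a deliberately naive illustration: unlike the methods used in the studies above, it makes no correction for co-ancestry or range expansion, and the locus names, frequencies and aridity values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:  # no variation, so no association to report
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_by_association(allele_freqs, env):
    """Loci sorted by |r| between allele frequency and environment, strongest first."""
    scores = {locus: pearson(freqs, env) for locus, freqs in allele_freqs.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical aridity index for five populations and allele
# frequencies at two loci in the same populations.
aridity = [1.0, 2.0, 3.0, 4.0, 5.0]
freqs = {
    "locus_a": [0.1, 0.2, 0.3, 0.4, 0.5],
    "locus_b": [0.5, 0.4, 0.5, 0.6, 0.5],
}
for locus, r in rank_by_association(freqs, aridity):
    print(locus, round(r, 3))
```

Because shared ancestry alone can generate such correlations, raw coefficients of this kind are best read as a starting point for models that account for population structure.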

Analysis of genetic–environmental associations can point to particular environmental factors that might have been involved in the selective process. For example, in a study of local adaptation in pines, this time with the Mediterranean conifers Pinus pinaster and P. halepensis, Grivet et al. (2011) identified different sets of genes in each species as likely targets of selection (with only one locus in common between the two). Surprisingly, they identified temperature as having been the most probable driver of selection as opposed to, for example, precipitation, which one might suspect of being important in a Mediterranean climate. Another revealing example is provided by data from balsam poplar (Populus balsamifera). Keller et al. (2012) combined tests that identified FST outliers (BayeScan, Foll & Gaggiotti, 2008; Arlequin, Excoffier et al., 2009) with tests for significant correlations between allele frequencies and environmental factors (Coop et al., 2010; Günther & Coop, 2012). Although their study identified 14 genes that showed signatures of local adaptation, only two showed statistical significance for both FST outliers and an association with one or more environmental variables (Fig. 5). This sort of inconsistency can be revealing, because it brings into sharper focus model assumptions and points to the most effective methodology for the determination of the loci responsible for adaptation to heterogeneous environments. Only a few such studies have been performed to date, but it is becoming clear that methods correlating allele frequencies with environmental variables are likely to be revealing (Schoville et al., 2012; De Mita et al., 2013). Such methods can be made more robust to the residual effects of demography and population structure through the use of Latent Factor Mixed Models, which estimate and remove the effects of unknown hidden factors (Frichot et al., 2013).

Figure 5. A comparison of three popular methods for the detection of loci under divergent selection in Populus balsamifera. Two of the methods, BayeScan (Foll & Gaggiotti, 2008) and the Hierarchical Model in Arlequin (Excoffier et al., 2009), attempt to detect signatures of local adaptation with FST-based outlier analysis, whereas the third method, Bayenv (Coop et al., 2010), tests for significant associations between particular alleles and environmental variables. Here, 443 individuals were sampled from 31 populations across the species' North American range and were genotyped (a) at 412 reference single nucleotide polymorphisms (SNPs) known to be neutral, used to generate a null distribution for comparison, and (b) at 339 candidate loci from 27 homologs of the Arabidopsis flowering time network, tested for geographically variable selection. There were a total of 43 SNPs, from 14 candidate genes, showing signatures of local adaptation, but only 10 were consistently identified by all three programs. The methods varied in their propensity to generate false positives, with the method implemented in the program Arlequin showing proportionally larger rates. (Modified line drawings courtesy of S. Keller.)

In principle, genetic markers associated with environmental variables or FST outliers may simply be in linkage disequilibrium (LD) with a selected locus and not be the locus itself. Encouragingly, it has become clear that LD around selected loci decays relatively rapidly with distance along the chromosome (Barton, 1979; Barton & Bengtsson, 1986), even for selfing species in which gametic disequilibrium decays more slowly (Nordborg et al., 1996). Markers identified by the approaches reviewed here may thus often be very close to a selected locus, particularly if divergent selection among different habitats has been strong over long periods of time, and as long as an appropriate model of subdivision or demographic history has been assumed (because subdivision directly increases gametic disequilibrium; Bierne et al., 2011). This explains why it was possible to identify plausible genes under selection for local adaptation in the Pinus species cited above, as well as loci implicated in local adaptation to serpentine soils in Arabidopsis lyrata (Turner et al., 2008, 2010).

VI. Effects of subdivision on inbreeding and inbreeding depression

Because plants tend to mate locally, population subdivision will tend to increase the rate of inbreeding. As a consequence of the corresponding increase in homozygosity, the increased expression of recessive deleterious alleles can lead to the short-term purging of inbreeding depression within populations (Thrall et al., 1998; Whitlock, 2002). However, over the longer term, drift can overcome selection in small and isolated demes, fixing deleterious alleles locally. This reduces not only the mean fitness of the population, but also the level of inbreeding depression, because inbred and outbred individuals all express the same fixed deleterious recessive mutations. Variations in inbreeding depression in the plant Mercurialis annua are probably explained by such effects of drift in small populations, in this case following species range expansion (Fig. 6). Because the Iberian Peninsula was recolonized by M. annua from the south, northern populations are less genetically diverse (Obbard et al., 2006) and show substantially lower inbreeding depression than southern ones (Fig. 6). It is not known whether these populations have also fixed their genetic load of deleterious mutations, but genome-wide molecular data might allow such effects of drift to be detected, for example, through the detection of fixed differences at non-synonymous sites.

Figure 6. (a) Inbreeding depression in hexaploid Iberian and Moroccan populations of the European annual plant Mercurialis annua is highest in the south (Morocco), the putative refugium of the species, and much reduced in northern populations towards the range boundary in the Iberian Peninsula. (b) An experiment to assay inbreeding depression in diploid populations of M. annua in northern Spain. Diploid M. annua, which expanded its range into the Iberian Peninsula from the north and east, expressed low but variable inbreeding depression at its range boundary (Eppley & Pannell, 2009), as did its hexaploid counterparts (Pujol et al., 2009). (Graph modified from Pujol et al. (2009), with permission. Image in (b) courtesy of S. Eppley.)

The effects of drift on inbreeding depression and the fixation of genetic load in small isolated populations, predicted by a negative association between inbreeding depression and FST (Fig. 7), can ultimately only be reversed by migration among demes. This process of ‘genetic rescue’, which has been documented for both plants (e.g. Willi & Fischer, 2005; Willi et al., 2005) and animals (e.g. Saccheri et al., 1998; Ebert et al., 2002; Escobar et al., 2009), has important implications both for our understanding of the genetic architecture of population subdivision and for conservation. For instance, managers of threatened species have tended to advocate the sourcing of seeds for species introductions from populations as nearby and genetically similar as possible. A recent study by Pickup et al. (2013) indicates that this may often be misguided. These authors analyzed the fitness of individuals produced by artificial crosses within and between populations of the Australian perennial herb Rutidosis leptorrhynchoides, and found that fitness was increased by crossing with individuals from large, genetically diverse populations that were not necessarily local (Fig. 8). Importantly, the consequences of crosses will often vary among populations, and conservation efforts should often be framed in a metapopulation context, bearing in mind differences in population age, size and history of migration.

Figure 7. Inbreeding depression expected in subdivided populations expressed as a ratio of that expected for an undivided species with the same parameters. Here, inbreeding depression is expressed as the fitness of inbred individuals relative to other members of the same local population. The relative role of population subdivision in lowering inbreeding depression depends on whether selection is soft (broken line) or hard (solid line). When most deleterious alleles are assumed to be recessive, population subdivision will significantly reduce the total genetic load. (After Whitlock (2002), with permission.)

Figure 8. The change in fitness between control and F2 individuals of the perennial herb Rutidosis leptorrhynchoides from crosses with pollen donor populations that varied in their effective number of alleles at microsatellite loci. Crossing target individuals with those from large populations with high genetic diversity produced greater increases in fitness than crossing them with individuals from small, genetically depauperate populations. Fitness was measured as the mean number of inflorescences, or flower heads. R2 = 0.43, P = 0.012. (Figure modified from Pickup et al. (2013), with permission; image courtesy of A. Young.)

In any finite population, the relatedness between pairs of mating individuals will vary: the higher the relatedness, measured by the pairwise inbreeding coefficient F, the more likely it is that progeny will be homozygous at a given locus (Wright, 1932). The associated increased expression of deleterious recessive alleles, or (probably less commonly) the expression of overdominant loci, will then cause lower fitness in the progeny of parents with higher F (Charlesworth & Willis, 2009), that is, we expect a positive relationship between the expression of inbreeding depression and F among individuals in a population. In the most classical sense, inbreeding depression is thus characterized by an inverse relationship between F and an individual's fitness, or that of a pair's offspring. Inferences derived from F can be made directly (through molecular genotyping) or indirectly (using controlled crosses).
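
The expected inverse relationship between F and fitness can be made concrete with a small sketch. It assumes the classical model in which deleterious effects combine multiplicatively across loci, so that the logarithm of fitness declines linearly with F and the negative of the slope (often called the inbreeding load, B) can be estimated by least squares. The function name and the data are hypothetical.

```python
import math


def inbreeding_load(f_values, fitness):
    """Estimate B as minus the least-squares slope of ln(fitness) on F.

    f_values: inbreeding coefficients of the individuals (or families).
    fitness:  their fitness values, which must all be positive.
    """
    log_w = [math.log(w) for w in fitness]
    n = len(f_values)
    mean_f = sum(f_values) / n
    mean_log_w = sum(log_w) / n
    sxy = sum((f - mean_f) * (lw - mean_log_w) for f, lw in zip(f_values, log_w))
    sxx = sum((f - mean_f) ** 2 for f in f_values)
    return -sxy / sxx


# Hypothetical progeny classes: outcrossed, cousin, half-sib and selfed.
f_classes = [0.0, 0.0625, 0.125, 0.5]
fitness = [1.00, 0.90, 0.80, 0.40]
print(round(inbreeding_load(f_classes, fitness), 2))
```

Fitting the regression separately for each population or family gives one way to quantify the among-population variation in the F–fitness relationship discussed below.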

Although there is much evidence for inbreeding depression in natural populations, the expectation of an inverse relationship between fitness and F for a given parental pair has not been supported by a number of recent empirical experiments. The discrepancy might be attributed to processes associated with population structure. For example, in the plant Ranunculus reptans, inbred offspring may be equally fit or fitter relative to other individuals, and the relationship between F and fitness varies between populations (Willi et al., 2005). Similarly, a five-generation serial inbreeding experiment with Mimulus guttatus showed that the relationship between total flower production and the degree of inbreeding varied significantly among populations and families (Dudash et al., 1997). In this experiment, the higher extinction probability of inbred lines meant that the purging of genetic load could be accomplished more readily by selection among lines, rather than selection among individuals within lines.

The causes of variation in the F–fitness relationship among populations or families must include some variance in the distribution of recessive, or nearly recessive, mutations, brought about by variance in population age, demographic history, genetic drift, founder effect, historical gene flow, bi-parental inbreeding and other past opportunities for the purging or fixation of deleterious recessive alleles. These and related processes can occur at very local scales, even within continuous populations, and may feed back to affect dynamics associated with population structure, such as gene flow, population growth and persistence. Such variance in inbreeding effects therefore reflects the effects of population structure in the inbreeding process.

One study of the perennial herb Hypericum cumulicola, a species of Florida rosemary scrub, has suggested that population size, age and isolation significantly affect the fitness consequences of particular crosses (Fig. 9). To determine the population characteristics known to affect the consequences of inbreeding, Oakley & Winn (2012) combined molecular marker-based estimates of migration with estimates of relative effective size for several natural populations of H. cumulicola. In a study that resonates with that of Pickup et al. (2013) of Rutidosis leptorrhynchoides, cited above, fitness assays conducted on the products of hand pollinations of within- and among-population crosses (including crosses with self-pollen, outcross pollen from a different individual within the same population and outcross pollen from each of two different subpopulations) revealed that heterosis was significantly greater for small populations relative to large ones, and inbreeding depression tended to be smaller (Fig. 9).
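
The two quantities compared in crossing designs of this kind can be written down in a few lines. The sketch below assumes the widely used definitions of inbreeding depression as 1 - w_self / w_within and of heterosis as (w_between - w_within) / w_within; these are not necessarily the exact estimators used by Oakley & Winn (2012), and the fitness values are hypothetical.

```python
def inbreeding_depression(w_self, w_within):
    """Proportional fitness loss of selfed relative to within-population outcrossed progeny."""
    return 1.0 - w_self / w_within


def heterosis(w_between, w_within):
    """Proportional fitness gain of between-population relative to within-population crosses."""
    return (w_between - w_within) / w_within


# Hypothetical mean fitness by cross type for a small and a large population.
crosses = {
    "small": {"self": 0.90, "within": 1.00, "between": 1.50},
    "large": {"self": 0.60, "within": 1.00, "between": 1.05},
}
for name, w in crosses.items():
    delta = inbreeding_depression(w["self"], w["within"])
    gain = heterosis(w["between"], w["within"])
    print(name, round(delta, 2), round(gain, 2))
```

In these made-up numbers the small population shows little inbreeding depression but strong heterosis, the pattern expected when drift has fixed deleterious recessive alleles locally.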

Figure 9. Magnitude of inbreeding and heterotic effects in natural populations of Hypericum cumulicola. (a) Spatial arrangement of 16 natural populations located in southern Highlands County, FL, USA (distance in m), which were censused in 2007 and 2011. Populations containing 11–25 individuals were characterized as small, whereas populations with 124 to > 1000 individuals were considered as large. Population size is indicated by column height, and column labels are population identifiers. Arrows between populations indicate estimates of gene flow, wherein the weights of the arrows are indicative of the quantity of gene flow, estimated using microsatellite markers and the software MIGRATE-n v.3.2.6 (Beerli, 2009). (b) Family mean cumulative fitness (± SEM) over 2 yr resulting from self-pollination, pollination by a different individual within the same population and pollination by an individual from one large and one small population.

VII. Current technologies – from genome sequencing to RAD-tags

Advances in sequencing technologies are creating new opportunities for the testing of hypotheses using genomic information in both well-established models, such as A. thaliana, and uncharacterized systems, and at new scales of biological organization (Horton et al., 2012; Sloan et al., 2012). Arabidopsis thaliana started out as a model for the study of developmental and molecular genetics, but has now been accepted as a model by a broader research community, including ecologists and evolutionary biologists (Fig. 10). This contrasts with the sneers one used to hear in bars at ecology and evolution conferences that ‘Arabidopsis thaliana is not a real plant’! Indeed, A. thaliana and its relatives offer us a broad set of genetic tools for the genomic analysis of population subdivision (Gaut, 2012), as illustrated by the increasingly large multi-population, partial or whole-genome scans for polymorphism (Aranzana et al., 2005; Nordborg et al., 2005a; Bakker et al., 2006; Clark et al., 2007; Nordborg & Weigel, 2008; Atwell et al., 2010; Platt et al., 2010; Fournier-Level et al., 2011; Horton et al., 2012). The application of these tools in A. thaliana has revealed signatures of population demographic history, subdivision and selection that were hitherto undetectable.

Figure 10. Phenotypic variation among populations (‘accessions’) of the annual herb Arabidopsis thaliana, grown under uniform conditions. The array shows individuals photographed at the rosette stage. (Image courtesy of D. Weigel.)

Although early studies of population structure across the range of A. thaliana failed to find much evidence for isolation by distance, such patterns have now been revealed by the analysis of whole-genome resequencing (Nordborg et al., 2005b; Platt et al., 2010). High-throughput analyses have also revealed a characteristic signature of reduced diversity at high latitudes indicative of bottlenecks associated with range expansion (Lewandowska-Sabat et al., 2010), as has been found in other species (Hewitt, 2000). The reduction in genetic diversity during range expansion has implications for both the expression of inbreeding depression and the responsiveness of populations to selection (Pujol & Pannell, 2008). It is thus possible to interpret patterns of genetic diversity in A. thaliana in the context of what we can infer about the demographic and phylogeographic history of the species.

Increasingly detailed analysis of the structure of genomic variation over the geographic distribution of A. thaliana has revealed evidence for selection on a wide range of traits. Horton et al. (2012) recently genotyped a global sample of over 1000 individuals of A. thaliana using a 250 000 SNP chip. They used statistics that measured the haplotype structure across the genome and the allele frequency spectrum, allowing distinction to be made between past and ongoing selection and between selection on new mutations and standing genetic variation (Maynard Smith & Haigh, 1974; Nielsen et al., 2005; Toomajian et al., 2006). They found that FST was the only statistic that pointed to selection on defense-related regions of the genome, a pattern inconsistent with a model of repeated selective sweeps on defense genes and more consistent with long-term balancing selection at these loci (Bakker et al., 2008). FST scans also revealed population differentiation in genomic regions associated with flowering time, a trait expected to be under differential selection in different environments (Horton et al., 2012). A recent selection study of A. thaliana using reciprocal transplant experiments across Europe (Lowry, 2012) has revealed direct evidence for local adaptation, particularly implicating freezing tolerance, and it can be anticipated that the application of genomic tools to the genetic material produced by this study will point to where in the genome, and how, selection has acted over short periods of time.

Much of the progress made with A. thaliana as a model has been facilitated by the availability of a reference genome (Kaul et al., 2000), but, for many plants that have particularly large and repetitive genomes, full genome sequencing is still not economically feasible. Nevertheless, many questions can now be addressed through the application of genotyping-by-sequencing (GBS) approaches, which allow for a targeted fraction of the genome (or a reduced representation library) to be sequenced; these include the use of targeted restriction enzymes in order to reduce genome complexity, capture probes or transcriptome-based analysis (Davey et al., 2011; Narum et al., 2013). GBS allows the genetic analysis of species with little or no genomic information and with a full range of genome sizes (Narum et al., 2013), and is rapidly being adopted to address questions across a range of taxa (see Molecular Ecology's special issue, Genotyping by Sequencing in Ecological and Conservation Genomics).

Currently, the most rapidly advancing GBS approach is perhaps restriction-site-associated DNA sequencing (RAD-tag or RADseq, Baird et al., 2008; although also see Elshire et al., 2011). The RAD-tag approach involves a genome-wide survey of nucleotide diversity of regions flanking restriction sites, and allows the simultaneous detection and genotyping of thousands of genome-wide SNPs (Wagner et al., 2013). The high costs of multiplexing prevented the genotyping of population or pooled samples for initial iterations of the method, but emerging pipelines, such as double-digest RADseq (ddRADseq), now allow cheaper polymorphism discovery and genotyping for large samples by multiplexing digested samples (Peterson et al., 2012). Currently, ddRADseq offers the most feasible approach for the generation of the genomic data necessary for inferences about population structure, especially when its consequences (such as local adaptation) are not extreme, although the possibility of allele dropout can compromise its potential (Gautier et al., 2013). Pipelines for the analysis of data derived from RADseq are beginning to be published (e.g. STACKS, Catchen et al., 2011; UNEAK, Lu et al., 2013). With these sequencing and analytic tools, a combination of ddRADseq and genetic–environment correlations offers an effective framework for future studies of the consequences of population subdivision in plants.
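
The principle behind the double-digest approach can be sketched as an in silico digest: cut a sequence with two enzymes and keep only the fragments that are flanked by one cut site of each enzyme and that fall within a chosen size window. The enzyme pair used here (EcoRI, G^AATTC, and MspI, C^CGG) is chosen only for illustration, the toy sequence and function names are hypothetical, and the sketch ignores practical issues such as incomplete digestion or allele dropout.

```python
import re


def cut_positions(seq, site, offset):
    """Cut coordinates for an enzyme that recognises `site` and cuts `offset` bases into it."""
    # A lookahead is used so that overlapping recognition sites are all found.
    return [m.start() + offset for m in re.finditer("(?=" + site + ")", seq)]


def double_digest(seq, site_a="GAATTC", site_b="CCGG", offset_a=1, offset_b=1,
                  min_len=0, max_len=10**9):
    """Fragments (start, end) bounded by one cut of each enzyme, within the size window.

    Both default sites are palindromic, so scanning one strand is sufficient.
    """
    cuts = [(pos, "A") for pos in cut_positions(seq, site_a, offset_a)]
    cuts += [(pos, "B") for pos in cut_positions(seq, site_b, offset_b)]
    cuts.sort()
    fragments = []
    for (start, enzyme_1), (end, enzyme_2) in zip(cuts, cuts[1:]):
        if enzyme_1 != enzyme_2 and min_len <= end - start <= max_len:
            fragments.append((start, end))
    return fragments


# Toy sequence containing two EcoRI sites and two MspI sites.
toy = "AAAAGAATTCTTTTTTCCGGAAAACCGGTTTTGAATTCAA"
print(double_digest(toy))              # all fragments with mixed ends
print(double_digest(toy, min_len=10))  # after size selection
```

Run on a draft assembly or on a related species' genome, a count of this kind gives a first, rough expectation of how many loci a given enzyme pair and size window might recover.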

VIII. Whither now – new wine in old skins?

The study of the evolution of local adaptation in plants has a long history, going back to investigations that have become classics at a wide range of spatial scales (e.g. reviewed in Kawecki & Ebert, 2004). Most of this work focused on phenotypes, but current technologies are opening up new possibilities for working on the genetics of local adaptation for species with larger and more difficult genomes than that of A. thaliana, including polyploids. These studies will include systems that have not been investigated previously, but much might also be gained from the use of new approaches to rake over the coals of earlier work, for which fundamentals have already been worked out. The evolution of heavy metal tolerance on contaminated mine tailings in the grass Anthoxanthum odoratum (e.g. Antonovics et al., 1971; Brandon, 1990) provides a good example of a system that might now benefit from revisiting, but there are of course many others. The research program on A. odoratum is exemplary, because it represents a case study in which several key milestones have been achieved in understanding the genetic basis of adaptation and speciation. These are loosely enumerated in Table 1, following Brandon (1990). Table 1 also summarizes the details of how these milestones were reached in the study of A. odoratum specifically.

Table 1. The key milestones that ideally need to be reached to understand the evolution of local adaptation by selection in a heterogeneous environment, as listed by Brandon (1990), and their realization in the classic model system for evolution of heavy metal tolerance in Anthoxanthum odoratum
Milestones in general | Milestone in A. odoratum | References
1. Measurement of differential fitness of variable traits under the environmental conditions of interest | Evidence gathered by conducting reciprocal transplants of individuals sampled in locations known to either be contaminated by heavy metals or not. Selection coefficients between different soil types were as high as 0.7 | Antonovics (1966), Antonovics et al. (1971)
2. Providing an ecological/physiological explanation of the selection and verifying its operation in nature | Measurement and manipulation of the metal content of soils and determining its effect on root growth | Antonovics et al. (1971), Brandon (1990)
3. Documenting the inheritance of the selected trait | Targeted crosses among individuals sampled from different soils suggested that metal tolerance was inherited with partial dominance, without maternal effects, and that the trait is likely to be polygenic | Gartside & McNeilly (1974)
4. Characterization of patterns of gene flow among populations and spatiotemporal variation in selection | The description of clinal variation in a number of morphological characters along a fine-scale transect between soil types that limit gene flow (thus isolating populations on different soils) | Antonovics & Bradshaw (1970)
5. Inferring trait polarity on the basis of phylogenetic information to show that it is indeed derived in the habitat of interest | Discovery of tolerant individuals in multiple species growing on contaminated soils that were probably derived from non-tolerant populations. Selection experiments revealed rapid evolution of tolerance, but the origin of tolerance genes was not clear | Antonovics et al. (1971), Brandon (1990)

Early reciprocal transplant experiments with A. odoratum revealed strong selection acting between metal-tolerant and metal-sensitive genotypes under field conditions, and manipulation of the soil environment demonstrated that this was caused specifically by the effects of the heavy metals, that is, points 1 and 2 in Table 1 were well covered by ecological experimentation. It is in points 3–5 that classical methods have only provided a partial understanding of the details of the evolution of heavy metal tolerance in this species. To some extent, variation in tolerance among individuals and populations has been attributed to gene flow among populations on different soils, but much remains to be learnt about which genes and genome segments are involved in the adaptation, and about their fate in contaminated and uncontaminated environments. Although the widespread distribution of non-tolerant phenotypes suggests that tolerance is a derived trait (Antonovics, 1966; Antonovics et al., 1971; Brandon, 1990), direct phylogenetic evidence for the polarity of adaptation is still lacking, and it is not known whether tolerance evolved independently in some populations or whether it evolved once and was then successfully exported to other populations.

As with many plant systems of ecological and evolutionary importance, there are still only limited molecular marker resources for A. odoratum, which include organellar gene sequences and genome-wide amplified fragment length polymorphisms (AFLPs) (Freeland et al., 2010, 2012). The application of ddRADseq could now help to advance our understanding of this classic example of local adaptation. For instance, signals of selection could be exposed by genotyping individuals across environmental gradients of soil types using ddRADseq or other de novo genotyping approaches, combined with outlier and gene–environment association analyses and the genotyping of mapping populations produced by crosses. One could envisage follow-up reciprocal transplant experiments among soil types that targeted specific SNPs. Candidate gene analysis might then lead to a deeper knowledge of the functional, metabolic and regulatory roles of genes responsible for metal tolerance. Because full tolerance to heavy metals is not observed in all populations of A. odoratum at contaminated sites, there has probably been a complex history of selection and migration acting on the species; such complexities could be better understood by combining molecular population genetic methods that estimate long-term (MIGRATE, Beerli, 2009; TreeMix, Pickrell & Pritchard, 2012) and short-term (BayesAss, Wilson & Rannala, 2003; BIMr, Faubet & Gaggiotti, 2008) gene flow.
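The combination of outlier and gene–environment association analyses mentioned above can be illustrated with a minimal sketch. It assumes biallelic SNPs summarised as one allele frequency per population, and it uses a simple heterozygosity-based FST with equal population weights and no sample-size correction. The locus names, allele frequencies and soil metal index are hypothetical values invented for illustration, and dedicated software would in practice account for sampling error and neutral population structure.

```python
from statistics import mean


def fst(freqs):
    """Heterozygosity-based F_ST for one biallelic locus.

    `freqs` holds one allele frequency per population. Populations are
    weighted equally and no sample-size correction is applied (an
    assumption of this sketch).
    """
    p_bar = mean(freqs)
    h_t = 2 * p_bar * (1 - p_bar)               # expected heterozygosity, pooled
    h_s = mean(2 * p * (1 - p) for p in freqs)  # mean within-population value
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def scan(allele_freqs, env):
    """Return {locus: (F_ST, correlation with the environmental variable)}."""
    return {locus: (fst(f), pearson(f, env)) for locus, f in allele_freqs.items()}


# Hypothetical data: four populations along a soil metal gradient.
soil_metal = [0.1, 0.3, 0.6, 0.9]
allele_freqs = {
    "locus_A": [0.48, 0.52, 0.50, 0.49],
    "locus_B": [0.10, 0.30, 0.65, 0.90],
    "locus_C": [0.70, 0.68, 0.72, 0.71],
}

results = scan(allele_freqs, soil_metal)
for locus, (f, r) in results.items():
    print(f"{locus}: FST = {f:.3f}, r = {r:.3f}")
```

In this toy data set only locus_B combines high differentiation with a strong association with the soil variable. A raw correlation of this kind would be confounded by shared population history in real data, which is why such candidates would still need to be evaluated against a neutral genomic background.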

The example of A. odoratum illustrates the sort of ecological and evolutionary questions likely to benefit from the application of GBS methods. Some time ago, David (1998) outlined an ambitious plan to better understand the presence/absence of heterozygosity–fitness correlations (HFCs) in natural populations. As described above, HFCs are probably caused by the masking of recessive deleterious alleles in heterozygotes (Lynch & Walsh, 1998). However, direct determination of the genomic regions responsible for HFCs, and inbreeding depression more generally, remains elusive. Application of GBS to experimental populations will probably allow us to detect the loci responsible for inbreeding and outbreeding depression, and to determine the distribution of the relevant alleles among populations. Ward et al. (2013) recently applied GBS methodologies, together with a novel genome-independent imputation pipeline to deal with missing or erroneous data, to generate a linkage map of red raspberry (Rubus idaeus). High-density genotyping of the progeny of controlled crosses allowed the identification of genomic regions exhibiting segregation distortion that might be responsible for inbreeding depression. Importantly, these insights were gained without a reference genome, in a fraction of the time and at much lower cost than analyses relying on more traditional sequencing approaches.
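Segregation distortion of the kind described above is commonly assessed marker by marker with a chi-square goodness-of-fit test. The following is a minimal sketch under the assumption of an F2-type cross with an expected 1:2:1 genotype ratio. The cross design, marker names, counts and threshold are hypothetical and are not taken from Ward et al. (2013).

```python
from math import exp


def segregation_chi2(n_aa, n_ab, n_bb, ratio=(1, 2, 1)):
    """Chi-square test of genotype counts against an expected ratio.

    Returns (chi-square statistic, p-value). With three genotype classes
    there are 2 degrees of freedom, for which the chi-square survival
    function is exactly exp(-x / 2). Assumes at least one observation.
    """
    observed = (n_aa, n_ab, n_bb)
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, exp(-chi2 / 2)


# Hypothetical genotype counts (AA, AB, BB) at two markers.
markers = {
    "marker_1": (25, 50, 25),
    "marker_2": (10, 50, 40),
}

# The threshold is arbitrary here; a genome-wide scan would need a
# correction for the number of markers tested.
distorted = [
    name for name, counts in markers.items()
    if segregation_chi2(*counts)[1] < 0.001
]
print(distorted)
```

Runs of adjacent markers that deviate in the same direction would then point to a genomic region, rather than a single site, as the candidate for follow-up.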

IX. Conclusions

The exhilarating progress that is currently being made in the generation and analysis of genomic data poses both opportunities and serious challenges for our understanding of the evolution of plants across heterogeneous habitats. It is worth stepping back to consider what can really be gained from these new tools and data beyond descriptions of diversity and its distribution, and what might be lost. Casting our eyes back over the past several decades of ecological genetic work based on earlier genetic markers is inspiring, but it also provides salutary lessons.

The advent of molecular polymorphic markers, such as isozymes/allozymes, microsatellites, AFLPs, etc., opened up ways to understand genetic diversity and to use it to test ecological and evolutionary hypotheses. There are many examples of what could be done. However, the availability of these markers also prompted a great deal of relatively uninspiring work that simply described differentiation among populations in terms of FST and other statistics, often interpreting patterns in terms of inappropriate evolutionary models. Much of this work is still valuable for large-scale comparative analyses and meta-analysis, but one is left with the feeling that so much more might have been achieved than just describing patterns of differentiation and misinterpreting them in terms of simple models of gene flow.

The opportunities provided by the new tools and data are alluring, but there are also substantial technical challenges. As algorithms and pipelines become available to deal with the data, it will be important to remember that the processes that gave rise to the observed patterns were potentially complex and may be inadequately interpreted using simple models. Importantly, it can be hoped that we will be able to go beyond mere description of patterns and to use the new tools in creative ways to test ecological and evolutionary hypotheses, in both comparative and experimental settings. Here, a knowledge of evolutionary theory and population genetics will be needed to form the basis of sampling strategies. A sound understanding of the genealogical structure of populations, the locus dependence of gene flow and effective population sizes, and the time scales over which statistical associations between loci break down through migration and recombination, will be as important as ever.

In this review, we have briefly discussed the controversy over the utility of FST as a measure of differentiation, but have also highlighted its advantages as a statistic with a sound grounding in theoretical population genetics, notwithstanding several caveats. It is nonetheless important to bear in mind that FST at any one locus has a large evolutionary variance, so that loci under weak selection will tend to escape detection by genome scans. Given the likely widespread importance of polygenic traits in local adaptation, this is a potentially serious limitation to the utility of FST. It is therefore exciting to note recent theoretical developments that reveal high power to detect selection on polygenic traits in terms of co-ordinated shifts in allele frequencies associated with habitat variation, which would be completely undetectable at individual loci (Turchin et al., 2012; Berg & Coop, 2013). We can now look forward to the application of these and perhaps other approaches for understanding the evolution of subdivided plant populations.
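The large locus-to-locus variance of FST noted above can be made concrete with a small simulation. This sketch assumes that neutral allele frequencies in each population are drawn from a beta distribution around an ancestral frequency (the Balding–Nichols approximation, chosen here for convenience and not a method used in this review), with a single true FST shared by all loci. The parameter values, function names and seed are illustrative assumptions.

```python
import random
from statistics import mean


def fst(freqs):
    """Heterozygosity-based F_ST for one biallelic locus (equal weights)."""
    p_bar = mean(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = mean(2 * p * (1 - p) for p in freqs)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def simulate_neutral_fst(true_f=0.05, n_pops=10, n_loci=2000, seed=1):
    """Per-locus F_ST values for neutral loci sharing one true F_ST.

    Population allele frequencies are drawn from
    Beta(p(1 - F)/F, (1 - p)(1 - F)/F), where p is the ancestral frequency.
    Sampling of individuals is ignored, so the spread reflects only the
    evolutionary (among-locus) variance.
    """
    rng = random.Random(seed)
    scale = (1 - true_f) / true_f
    estimates = []
    for _ in range(n_loci):
        p_anc = rng.uniform(0.1, 0.9)  # hypothetical ancestral frequency
        a, b = p_anc * scale, (1 - p_anc) * scale
        freqs = [rng.betavariate(a, b) for _ in range(n_pops)]
        estimates.append(fst(freqs))
    return estimates


estimates = simulate_neutral_fst()
print(f"min = {min(estimates):.3f}, mean = {mean(estimates):.3f}, "
      f"max = {max(estimates):.3f}")
```

Even though every simulated locus has the same underlying differentiation, the single-locus values span a wide range. A weakly selected locus whose FST is shifted only slightly would therefore sit well inside this neutral spread, which is the limitation of single-locus scans that the polygenic approaches cited above aim to overcome.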

Acknowledgements

We thank Janis Antonovics, Kent Holsinger, Stephen Keller, Santiago Martinez-Gonzalez, Lou Jost, Douglas Taylor (and all of the Taylor laboratory), Severine Vuilleumier and an anonymous reviewer for their helpful comments on the manuscript. The opinions expressed are, however, our own. J.R.P. is funded by the Swiss National Science Foundation and the University of Lausanne. P.D.F. is funded by the National Science Foundation (NSF) DEB #0919335 to Douglas R. Taylor and Janis Antonovics, and NSF-OISE# 1139716 to Douglas R. Taylor and Peter D. Fields.

References
