(i) Some molecular markers are inherently better than others
The field of molecular ecology is rife with simplistic statements that one class of marker is more sensitive to population structure than another class. This misconception is most sharply apparent with claims that mtDNA (or any haploid inherited organelle) will show population divergence first in recently divided populations due to higher levels of genetic drift, or that microsatellites will show divergence first due to high mutation rates and heterozygosities. Both can be true in individual circumstances, depending on a complex array of conditions that include genetic diversity, genetic effective population size (Ne; i.e. the size of an idealized population that would experience the same amount of drift as the real population), mutation rate (μ) and migration characteristics, as well as sex-biased dispersal. No class of markers, however, is a priori more sensitive (i.e. is better able to detect population differentiation) under all conditions.
Under typical conditions of ongoing population divergence, mtDNA always has more power to detect population divergence than any single nuclear locus, but two or more polymorphic nuclear loci are expected to be more sensitive than mtDNA (Larsson et al. 2009). These findings are based on simulations in powsim, a software package that estimates the level of population divergence that can be detected with a given number of loci and sample size (Ryman & Palm 2006; Ryman et al. 2006). One important caveat is that diversities among markers in these simulations are held to be identical. A polymorphic mtDNA locus can have more power than a cluster of microsatellite loci depending on overall diversity in these markers, which will vary among species and evolutionary histories.
While it is clear that loci with low diversity have limited power to resolve differences, it is also true that extremely high diversity can limit the power to detect population divergence. It is a mathematical certainty that high heterozygosity depresses FST values, as demonstrated by Hedrick (1999). In addition, microsatellite loci can contain alleles that are identical in size (state) but not by descent (O’Reilly et al. 2004). The step-wise mutation model that predominates in microsatellite evolution produces a downward bias in estimates of population structure (by size homoplasy), relative to a marker evolving by the infinite allele model (Estoup et al. 2002). This effect will be most pronounced under scenarios of large population size (Ne > 10^6) and high mutation rate (μ > 10^−3). The effect of high levels of allelic diversity on statistical power is not limited to microsatellites. For example, a survey of highly polymorphic mtDNA control region sequences in Pacific cod did not detect genetic partitions (Liu et al. 2010) that were apparent with less polymorphic mtDNA coding sequences (Canino et al. 2010).
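The ceiling that heterozygosity places on FST can be seen with a small calculation. The sketch below is illustrative only. It assumes Nei's GST = (HT − HS)/HT as the FST analogue and builds two hypothetical, equal-sized populations that share no alleles, each carrying k equally frequent private alleles. The function names and the values of k are assumptions chosen for the example.

```python
def expected_het(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for a set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def gst_no_shared_alleles(k):
    """Return (HS, GST) for two equal-sized populations that share no alleles,
    each with k equally frequent private alleles (hypothetical setup)."""
    pop = [1.0 / k] * k
    hs = expected_het(pop)            # mean within-population heterozygosity
    pooled = [0.5 / k] * (2 * k)      # allele frequencies in the pooled total
    ht = expected_het(pooled)         # total heterozygosity
    return hs, (ht - hs) / ht

for k in (2, 5, 20):
    hs, gst = gst_no_shared_alleles(k)
    print(f"k={k:2d}  HS={hs:.3f}  GST={gst:.3f}")
```

Although the two populations are completely differentiated in every case, GST falls from about 0.33 to about 0.03 as within-population heterozygosity rises from 0.50 to 0.95, and it never exceeds 1 − HS.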
Empirical data sets confirm that either mtDNA or microsatellites can detect population divergence not apparent in the other class of markers. Results for benthic (bottom dwelling) marine organisms are informative here because dispersal is accomplished almost exclusively through larvae, while juveniles and adults rarely move more than 1 km in a lifetime. Here, we can set aside concerns about sex-biased dispersal (and small population size in most cases), and ask how the inheritance of mtDNA and microsatellites shapes the magnitude of population divergence. A review of the literature on reef fishes shows that in some cases mtDNA reveals more divergence than microsatellites, and in other cases the opposite is true. In an extreme example, a survey of microsatellite variation in the surgeonfish Zebrasoma flavescens detected seven populations and significant isolation by distance in the Hawaiian Archipelago (F′SC = 0.026, P < 0.001), while the parallel mtDNA survey showed no significant differences (ΦSC = 0.002, P = 0.38; Eble et al. 2011). Clearly, either mtDNA or microsatellites can be the more sensitive marker for detecting population divergence, and this is borne out in both theoretical (Larsson et al. 2009) and empirical studies (Eble et al. 2011).
It is now possible to interrogate tens of thousands of single nucleotide polymorphisms (SNPs) and to produce very large data sets to search, for example, for genes under selection associated with adaptive traits (Hohenlohe et al. 2010). While SNPs aptly facilitate genomic scans, they must be used cautiously to estimate gene flow, effective population size, genetic diversity and evolutionary mechanisms, because SNPs are often embedded in DNA segments with an unknown genetic background. Methods that survey sequence variability, rather than single nucleotide positions, are still recommended to answer many of the classical questions in population genetics that require estimates of genetic diversity, gene flow or historical and contemporary population sizes. Clearly it is not defensible to make blanket statements about the utility of one genetic marker over another (see also the review by Schlötterer 2004). To evaluate the optimal markers for a particular study, much more than the mode of inheritance or mutability needs to be considered. Pertinent information will include locus diversity, available sample sizes, and the level of population divergence. Of course, most of this information is only available once the laboratory aspect of the study has begun. However, the versatile molecular ecologist can adjust study design in response to these considerations. For example, a researcher who finds deep (or diagnostic) mtDNA divergences between populations might shift the nuclear DNA analysis from microsatellites to the less variable intron sequences, a more appropriate choice for molecular evolutionary separations.
(ii) mtDNA produces higher FST values than nDNA
The calculation of FST and its analogues (ΦST, F′ST, GST, θ, RST) is surprisingly complex, and the appropriate choice of an F-statistic depends heavily on the level of genetic diversity (Waples & Gaggiotti 2006; Holsinger & Weir 2009; Bird et al. 2011). In particular, parametric FST has a downward bias in cases of high allelic diversity (typical of microsatellite loci). This can be corrected in a variety of ways (e.g. F′ST) by calculating the upper limit for the F-statistic in each case, and scaling that range to fit the usual F-statistic range of 0.0–1.0 (Hedrick 1999; Meirmans & Hedrick 2011). Notably, ΦST, which takes sequence divergence into account, is usually larger than FST, except in special cases where deeply divergent lineages are distributed among populations, or where all haplotypes or alleles are equidistantly related (Bird et al. 2011).
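The rescaling idea described above can be sketched in a few lines. This is a simplified illustration, not a published estimator: it assumes Nei's GST as the statistic, equal-sized populations, and a maximum that is reached when populations share no alleles. Sampling corrections used in practice are omitted, and the function names and input values are hypothetical.

```python
def gst_max(hs, n_pops):
    """Largest GST attainable for mean within-population heterozygosity hs
    when n_pops equal-sized populations share no alleles."""
    return (n_pops - 1) * (1.0 - hs) / (n_pops - 1 + hs)

def standardized_gst(gst, hs, n_pops):
    """Rescale GST to the 0-1 range by dividing by its attainable maximum."""
    return gst / gst_max(hs, n_pops)

# Hypothetical microsatellite-like locus: high heterozygosity, low raw GST.
raw, hs, n_pops = 0.02, 0.90, 2
print(f"GST(max)         = {gst_max(hs, n_pops):.4f}")
print(f"standardized GST = {standardized_gst(raw, hs, n_pops):.3f}")
```

With these assumed inputs, a raw value of 0.02 corresponds to 38% of the maximum that the locus could show, which is why raw and standardized statistics should not be compared directly.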
During differentiation of two populations under ideal conditions (equal sex ratio, equal and low levels of migration, random mating within populations, no mutation and no selection), simulations show that the ratio (R) of mtDNA FST to nuclear FST ranges from 1.0 to 4.0 (Larsson et al. 2009). That is, FST for mtDNA ranges from equal to the nuclear value to four times higher. Examples of this range of R values are abundant in the literature (Table 1). During divergence between populations without migration, both mtDNA and microsatellites theoretically start with FST = 0.0 at time 0, and both end with FST = 1.0 at equilibrium (typically after thousands of generations). It should be noted, however, that although the maximum FST is 1.0 at equilibrium, values at time 0 deviate stochastically from 0.0 because of sampling effects at the time of subpopulation division. At equilibrium, both markers (if adjusted for heterozygosity) yield equivalent FST values. Values during the intervening period will generally be higher for mtDNA, but the approach to equilibrium depends on the degree of population substructure, the local deme effective population size and the migration rate between demes (Whitlock & McCauley 1999). Simulations by Larsson et al. (2009) show that during the march towards equilibrium, R = 4.0 initially, 1.6 in generation 200 and 1.0 in generation 1000.
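The decline of R from 4 towards 1 under isolation can be reproduced with the textbook drift-only expectation, FST(t) = 1 − (1 − 1/c)^t, where c is the effective number of gene copies per population. The sketch below is not the simulation of Larsson et al. (2009). It assumes complete isolation, no mutation, an equal sex ratio, and an effective size of Ne = 100 that was chosen arbitrarily for illustration.

```python
def fst_isolation(t, gene_copies):
    """Expected FST after t generations of pure drift in complete isolation,
    for a locus with `gene_copies` effective gene copies per population."""
    return 1.0 - (1.0 - 1.0 / gene_copies) ** t

NE = 100                      # assumed effective size (illustrative only)
NUCLEAR_COPIES = 2 * NE       # diploid, biparentally inherited
MTDNA_COPIES = NE / 2         # haploid, maternally inherited, equal sex ratio

def r_ratio(t):
    """Ratio of expected mtDNA FST to expected nuclear FST at generation t."""
    return fst_isolation(t, MTDNA_COPIES) / fst_isolation(t, NUCLEAR_COPIES)

for t in (1, 200, 1000):
    print(f"generation {t:4d}: R = {r_ratio(t):.2f}")
```

Under these assumptions R starts at 4.0, is near 1.6 by generation 200 and is close to 1.0 by generation 1000. A larger assumed Ne would stretch the same trajectory over more generations.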
Table 1. Cases in which F-statistics for mtDNA are lower, equivalent, and higher than F-statistics for microsatellites (μsatDNA), ranked by R values (mtDNA FST/microsatellite FST). Note that R values far exceed the theoretical range of 1 to 4 in cases where sex-biased dispersal has been demonstrated. Some comparisons are made between regional groups (FCT) rather than individual samples. The FST analogue is specified in each case. When comparing F-statistics, at least two biases are apparent: FST will usually be lower than ΦST for the same data set, and FST is biased downward relative to corrected F′ST in data sets with high heterozygosity
| Species | mtDNA | μsatDNA | R | References |
|---|---|---|---|---|
| Lower population structure in mtDNA relative to microsatellites* |
| Smelt Thaleichthys pacificus | FST = 0.023 | FST = 0.045 | 0.51 | McLean & Taylor (2001) |
| Red grouse Lagopus lagopus | ΦST = 0.010 | RST = 0.16 | 0.63 | Piertney et al. (2000) |
| Equivalent population structure in mtDNA and microsatellite loci |
| Yellow Tang Zebrasoma flavescens | ΦCT = 0.098 | F′CT = 0.116 | 0.84 | Eble et al. (2011) |
| Deepwater snapper Pristipomoides filamentosus | ΦST = 0.029 | F′ST = 0.029 | 1.00 | Gaither et al. (2011) |
| Caribou Rangifer tarandus | FST = 0.128 | FST = 0.127 | 1.10 | Cronin et al. (2005) |
| Higher population differentiation in mtDNA relative to microsatellite loci† |
| Warbler Dendroica caerulescens | FST = 0.019 | FST = 0.011 | 1.73 | Davis et al. (2006) |
| Alligator snapping turtle Macrochelys temminckii | ΦST = 0.98 | F′ST = 0.43 | 2.28 | Roman et al. (1999), Echelle et al. (2010) |
| Sea otter Enhydra lutris | FST = 0.466 | FST = 0.183 | 2.55 | Larson et al. (2002) |
| Lake whitefish Coregonus clupeaformis | FST = 0.496 | θ = 0.161 | 3.08 | Lu et al. (2001) |
| Guanaco (llama) Lama guanicoe | FST = 0.459 | FST = 0.104 | 4.41 | Sarno et al. (2001) |
| Much higher population differentiation in mtDNA relative to microsatellite loci‡ |
| Humpback whale Megaptera novaeangliae | ΦST = 0.277 | FST = 0.043 | 6.44 | Baker et al. (1998) |
| Hammerhead shark Sphyrna lewini | ΦST = 0.519 | FST = 0.035 | 14.80 | Daly-Engel et al. (2012) |
| Sperm whale Physeter macrocephalus | GST = 0.03 | GST = 0.001 | 30.00 | Lyrholm et al. (1999) |
| Blacktip shark Carcharhinus limbatus | ΦST = 0.350 | FST = 0.007 | 50.00 | Keeney et al. (2005) |
| Bechstein’s bat Myotis bechsteinii | FST = 0.809 | FST = 0.015 | 53.90 | Kerth et al. (2002) |
| Spectacled eider Somateria fischeri | FCT = 0.189 | θ = 0.001 | 189.00 | Scribner et al. (2001) |
| Loggerhead turtle Caretta caretta | ΦST = 0.42 | FST = 0.002 | 210.00 | Bowen et al. (2005) |
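The R values in Table 1 are simple ratios, and the arithmetic can be checked against the theoretical range of 1 to 4 as follows. The four value pairs are copied from Table 1. The helper names are hypothetical, and the ratio mixes different FST analogues in some rows, so it should be read as a rough guide only.

```python
# (mtDNA statistic, microsatellite statistic) copied from four rows of Table 1
table_rows = {
    "Smelt": (0.023, 0.045),
    "Sea otter": (0.466, 0.183),
    "Guanaco": (0.459, 0.104),
    "Loggerhead turtle": (0.42, 0.002),
}

def r_value(mt_stat, nuc_stat):
    """Ratio of mtDNA to microsatellite differentiation."""
    return mt_stat / nuc_stat

def within_theoretical_range(r, low=1.0, high=4.0):
    """True if r lies in the range expected without sex-biased dispersal."""
    return low <= r <= high

for species, (mt, nuc) in table_rows.items():
    r = r_value(mt, nuc)
    print(f"{species}: R = {r:.2f}, within 1-4: {within_theoretical_range(r)}")
```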
As an illustration, the guanaco (wild llama) listed in Table 1 is an interesting case of a population on the island of Tierra del Fuego, isolated from mainland South America by a water barrier 8000 years ago (Sarno et al. 2001). This is a rare case of populations diverging in a known timeframe without migration, which would mean that the equilibrium value should be R = 1.0. In contrast, the observed R = 4.41 indicates nonequilibrium conditions, or other factors such as selection or strong drift influencing population divergence.
During population divergence with migration, simulations indicate that equilibrium values of FST for mtDNA are always higher than those for nuclear markers. Using a low but realistic migration rate of m = 0.005 (where m is the proportion of each population made up of immigrants each generation), Larsson et al. (2009) calculate an equilibrium FST = 0.66 for mtDNA, and FST = 0.33 for nuclear loci. This yields R = 2; however, this ratio (and the disparity between FST values for the two classes of markers) rises towards R = 4 under scenarios of higher migration. The example here and the guanaco above underscore that straightforward theoretical expectations do not necessarily translate to the natural world. They do, however, act as a touchstone for reasonable expectations: they are guiding principles, not binding regulations.
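The equilibrium values quoted above are close to what the classic island-model approximations give, FST ≈ 1/(1 + 4Ne·m) for nuclear loci and FST ≈ 1/(1 + Ne·m) for mtDNA. The sketch below uses those approximations as a stand-in. It is an assumption that they apply here, since Larsson et al. (2009) may have used a different model, and Ne = 100 is an illustrative value that is not stated in the text.

```python
def fst_equilibrium(ne, m, marker):
    """Island-model equilibrium FST (low-mutation approximation).
    nuclear: 1 / (1 + 4*Ne*m); mtdna: 1 / (1 + Ne*m), assuming an equal
    sex ratio so the mtDNA effective size is one quarter of the nuclear one."""
    scale = {"nuclear": 4.0, "mtdna": 1.0}[marker]
    return 1.0 / (1.0 + scale * ne * m)

NE = 100  # assumed effective size (illustrative only)
for m in (0.005, 0.05, 0.5):
    mt = fst_equilibrium(NE, m, "mtdna")
    nuc = fst_equilibrium(NE, m, "nuclear")
    print(f"m={m}: mtDNA FST={mt:.3f}  nuclear FST={nuc:.3f}  R={mt / nuc:.2f}")
```

With m = 0.005 the approximations give roughly 0.67 and 0.33 (R = 2), and R climbs towards 4 as migration increases, while both FST values shrink. The approximations are least reliable at the highest migration rates.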
Sex-biased dispersal is an extreme form of divergence with migration, and this condition alters patterns of population subdivision and R ratios, as indicated by comparisons of uniparental and biparental markers (Karl et al. 1992; Bowen et al. 2005). Male dispersal predominates in many vertebrate groups, with higher divergence among populations recorded in mtDNA (Table 1). Female dispersal predominates in birds (Prugnolle & de Meeus 2002), and in at least one case yields higher FST in microsatellites than mtDNA (R < 1; Table 1). An interesting case of female-biased dispersal is recorded for the primate Homo sapiens, in which autosomal chromosomes, mtDNA and Y chromosomes yield estimates of genetic variance between continents of 8.8%, 12.5% and 52.7%, respectively (Seielstad et al. 1998). In the anadromous fish Thaleichthys pacificus from the northeast Pacific, the microsatellite value is FST = 0.045, while the corresponding mtDNA value is FST = 0.023 (R = 0.51 in Table 1; McLean & Taylor 2001). Clearly, FST values from either mtDNA or microsatellites can be higher, depending on a complex set of conditions. The haploid inheritance of mtDNA (and other organelles) confers higher FST values under most conditions, but both theoretical and empirical studies show that this is not invariably true.
(iii) Estimated population coalescences are real
MtDNA genealogies are commonly used to infer historical demographies with coalescence theory (Kingman 1982), implemented in sequence mismatch analysis (Rogers & Harpending 1992) and Bayesian skyline plots (BSP; Drummond & Rambaut 2007), among other methods (Hey & Nielsen 2004). These methods produce estimates of compound parameters that include effective population size and mutation rate. Estimates of mutation rate are needed to extract the population variables and to date population events. However, several sources of error, including small sample sizes and inaccurate estimates of mutation rate, can seriously compromise the accuracy of coalescence-based inferences of population history.
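For readers unfamiliar with mismatch analysis, the observed mismatch distribution is simply the frequency distribution of the number of site differences between all pairs of sequences. The sketch below computes it for a toy alignment. The sequences are invented for illustration, the function name is hypothetical, and no demographic model is fitted.

```python
from collections import Counter
from itertools import combinations

def mismatch_distribution(seqs):
    """Relative frequency of pairwise site differences among aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(
        sum(a != b for a, b in zip(s1, s2))   # differences for one pair
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = sum(counts.values())
    return {d: counts[d] / n_pairs for d in sorted(counts)}

# Hypothetical toy alignment (not real data).
toy = ["ACGTACGT", "ACGTACGA", "ACGTTCGA", "ACGAACGT"]
print(mismatch_distribution(toy))
```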
To illustrate some of these errors, we use coalescence simulations of nonrecombining DNA sequences under a population history of recent population growth that is typical for marine species (Box 1). These simulations show variability in the gene genealogies within a population and times to most recent common ancestor (TMRCA) for two sample sizes (Figs 1a and 2a). TMRCAs among replicate genealogies varied by a factor of two, and shapes of the genealogies varied considerably among replicates, even for the same sample size. In practice, the distributions of mutations along branches can then be used to reconstruct a genealogy (Figs 1b and 2b). In addition to coalescent variability, an observed DNA gene genealogy reflects only one realization of many possible mutation histories. In our simulations, mutation trees largely captured deep partitions in the coalescent trees, but did not always resolve relationships in the upper (younger) part of the trees. The variability among realized DNA trees can also be seen in the contrasting shapes of Bayesian skyline plots (BSPs; Figs 1c and 2c) and mismatch distributions (Figs 1d and 2d). Remarkably, these results were generated with the same demographic and mutation models.
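The scale of coalescent variability is easy to reproduce. The sketch below is a deliberately simplified stand-in for the simulations in Box 1: it assumes a constant-size population rather than the growth model used there, ignores mutation, and reports TMRCA in coalescent time units. The function name and the seed are arbitrary choices.

```python
import random

def simulate_tmrca(n, rng):
    """One draw of the time to the most recent common ancestor of n lineages
    under the standard constant-size coalescent, in coalescent time units."""
    t = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0       # coalescence rate while k lineages remain
        t += rng.expovariate(rate)     # waiting time until the next coalescence
    return t

rng = random.Random(1)                 # fixed seed so the run is repeatable
for n in (25, 100):
    draws = [simulate_tmrca(n, rng) for _ in range(5)]
    print(f"n = {n:3d}:", [round(x, 2) for x in draws])
```

Replicate draws for the same sample size routinely differ severalfold, because most of the variance comes from the final coalescence of the last two lineages. The expected TMRCA is 2(1 − 1/n) in these units, so increasing n from 25 to 100 changes the expectation very little.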
Figure 1. Coalescence genealogies (a), mutation trees (b), Bayesian skyline plots (c) and mismatch distributions (d) for three coalescence simulations with sample size n = 25 drawn from a population that experienced a ‘knife-edge’ growth in size from Ne = 1 000 to 1 000 000 at 250 generations in the past (see supplemental information for details of simulations).
Figure 2. Coalescence trees (a), one realization of a mutation tree (b), Bayesian skyline plots (c) and (d) observed (closed circles) and expected (expanding population) mismatch distributions for three coalescence simulations with sample size n = 100. Demographic model and explanation of figures as in Fig. 1.
These simulations show how coalescent and mutational randomness conspire to produce a variety of mtDNA genealogies for the same population history (Rosenberg & Nordborg 2002). However, molecular ecologists do not always appreciate that a single molecular genealogy, perhaps produced by months of field and laboratory work, represents only one of an infinite number of possible coalescent and mutational realizations. In the hands of most molecular ecologists, data sets producing contrasting BSPs and mismatch distributions generally prompt different interpretations. For example, small differences in shapes of BSPs were used to argue alternative hypotheses of population colonization and expansion (e.g. peopling of the Americas: Kitchen et al. 2008; Fagundes et al. 2008). When samples are difficult to collect or to sequence, we often attempt to maximize our efforts by resorting to batteries of statistical tests. The pitfall of this approach, however, is the temptation to over-interpret results.
Another source of error is inaccurate estimates of mutation rate (μ) to calibrate a molecular clock. In marine studies, the closure of the Panama Seaway in the late Pliocene (Marko 2002; Coates et al. 2005) and the opening of Bering Strait in the early Pliocene (Verhoeven et al. 2011) are commonly used to calibrate μ. When an internal calibration is unavailable, researchers use a proxy calibration based on other taxa, or a ‘universal’ molecular clock rate (e.g. Bowen & Grant 1997). These phylogenetically derived mutation rates, however, appear to overestimate the ages of phylogeographical events inscribed in genetic data, sometimes by an order of magnitude (Ho et al. 2005, 2008; Crandall et al. 2012). As a result, BSPs and mismatch analyses in many studies appear to indicate population expansions during glacial maxima (Canino et al. 2010; Liu et al. 2010, 2011; Stamatis et al. 2004; Strasser & Barber 2009; Pérez-Losada et al. 2007; Marko & Moran 2009; Carr & Marshall 2008; Hoarau et al. 2007; among many others). These scenarios are unlikely, because marine populations contract and expand in response to decadal environmental shifts (Perry et al. 2005) and larger environmental disturbances are expected to have correspondingly larger effects on population abundances and distributions.
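The sensitivity of dates to the assumed clock rate follows directly from the mismatch relation τ = 2ut (Rogers & Harpending 1992), where u is the mutation rate for the whole sequence. The sketch below is purely illustrative: the value of τ, the two per-site rates and the sequence length are hypothetical numbers chosen to show the effect of a tenfold difference in rate, and rates are expressed per year so that generation time does not appear.

```python
def expansion_time_years(tau, rate_per_site_per_year, seq_length_bp):
    """Time since expansion from the mismatch parameter tau = 2*u*t,
    where u is the mutation rate for the whole sequence per year."""
    u = rate_per_site_per_year * seq_length_bp
    return tau / (2.0 * u)

TAU = 3.0       # hypothetical mismatch estimate
LENGTH = 665    # hypothetical sequence length in bp
for label, rate in (("slower phylogenetic rate", 1e-8),
                    ("tenfold faster rate", 1e-7)):
    years = expansion_time_years(TAU, rate, LENGTH)
    print(f"{label}: about {years:,.0f} years ago")
```

With these assumed inputs the same τ dates the expansion to roughly 226 000 years ago under the slower rate and roughly 23 000 years ago under the faster one, so the choice of rate alone can move an event across an entire glacial cycle.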
One possible explanation for inaccurate molecular clocks is that mutation rates may be ‘time dependent’ (Ho et al. 2005). Calibrations based on recent divergences between taxa yield much higher mutation rates than calibrations based on ancient phylogenetic divergences for birds (Ho et al. 2005), primates (Ho et al. 2005; but see Emerson 2007) and marine invertebrates (Crandall et al. 2012). The apparent elevation in mutation rate in recently diverged populations may be due to several factors, without having to invoke changes in the instantaneous rate of mutation. One source of error stems from the failure to account for polymorphisms in an ancestral population before it split into isolated populations destined to become new species (Hickerson et al. 2003; Charlesworth 2010). This effect is magnified in large populations, such as those in many marine species, and with the use of recent separation times to calibrate the molecular clock. Background selection on slightly deleterious alleles (Ho & Larson 2006; but see Peterson & Masel 2009) and balancing selection (Charlesworth 2010) may also contribute to apparent elevated mutation rates in recent divergences.
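The effect of ancestral polymorphism on a clock calibration can be made concrete. Two lineages sampled from sister populations coalesce some time before the population split, so expected divergence is 2μ(T + A), where T is the split time and A is the expected coalescence time in the ancestral population, which scales with ancestral Ne. Dividing that divergence by 2T then overstates μ. The sketch below assumes this simple neutral model, and all parameter values are hypothetical.

```python
def apparent_rate(true_rate, split_time, ancestral_coal_time):
    """Rate obtained by dividing expected divergence by 2 * split_time when
    lineages actually coalesce ancestral_coal_time before the split."""
    divergence = 2.0 * true_rate * (split_time + ancestral_coal_time)
    return divergence / (2.0 * split_time)

TRUE_RATE = 1e-8       # hypothetical per-site rate per generation
ANC_COAL = 1_000_000   # hypothetical ancestral coalescence time (generations)
for split in (100_000, 1_000_000, 10_000_000):
    inflation = apparent_rate(TRUE_RATE, split, ANC_COAL) / TRUE_RATE
    print(f"split {split:>10,} generations ago: apparent/true rate = {inflation:.1f}")
```

The inflation factor is 1 + A/T, so under these assumptions a calibration on a recent split overstates the rate about elevenfold, while an ancient split overstates it by only about 10%.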
In many cases, the incorrect dating of phylogeographic events may be an artefact of a particular analytical method (e.g. mismatch analysis or BSPs) that does not distinguish between different histories of gene lineages in a sequence data set. For example, mtDNA data sets often consist of shallow, star-shaped lineages connected by deeper separations. When the star-shaped lineages are examined individually, the use of ‘standard’ phylogenetically derived estimates of mutation rate yields reasonable temporal estimates of recent population events (e.g. Saillard et al. 2000). Appropriate ‘apparent’ mutation rates for some methods of analysis can be estimated empirically with the analytical method itself. For example, Crandall et al. (2012) used BSPs to estimate population expansion dates in three marine species inhabiting the Sunda Shelf by reasoning that an expansion could only have occurred after the last glacial maximum (LGM), when rising sea levels submerged the shelf. Alternatively, Grant & Cheng (in press) simulated mtDNA sequences under a demographic model constructed from Pleistocene temperatures (Jouzel et al. 2007) to date the expansion of red king crab populations in the North Pacific (Fig. 3).
Figure 3. Bayesian skyline plots (BSPs) based on mitochondrial cytochrome oxidase I sequences (bp = 665) in red king crabs (n = 551) in the central North Pacific and Bering Sea. Historical apparent effective population size (thick line) is bracketed by the 95% highest probability density (grey). The BSP was constructed with BEAST 1.6 (Drummond & Rambaut 2007) under the TrN (Tamura & Nei 1993) model of nucleotide substitution, ten piecewise linear intervals and a strict molecular clock. An MCMC run of 400 million steps yielded effective sample sizes (ESS) of at least 200.
In addition to providing an empirical mutation rate, our simulations demonstrate several features of coalescence analysis that can lead to erroneous inferences (Fig. 4). First, a putative stable population history preceding a recent population expansion (as reported in many cases) may be an artefact of coalescence analysis. Second, only the most recent episode of rapid population growth can be detected, even if the populations experienced several periods of growth and decline. Population declines during the LGM may not be severe enough to lower genetic diversities, but are sufficient to erase information about previous population swings. This loss of information results in a flat population curve that is often erroneously interpreted as population stability over much of the Pleistocene. Third, a spike in population size is associated with warming after the last glacial maximum 18 000–20 000 years ago. However, the use of the wrong mutation rate (Ho et al. 2011) or inattention to ancestral polymorphisms (Hickerson et al. 2003) can place this almost universal signal of population growth in a previous interglacial period or even at a glacial maximum. Molecular ecologists often test phylogeographic models with standard computer programs and with standard estimates of mutation rate without appreciating the pitfalls of coalescence-based analyses. Though coalescence-based analyses are valuable and informative, their estimation and interpretation need to be very carefully considered.
Figure 4. Ten replicate simulations (bold lines) of historical demography in red king crab to illustrate the extent to which coalescence analysis of mtDNA sequences captures population size histories over the last several ice-age cycles. Grey lines enclose 95% highest probability densities around estimates of historical demography.