Patterns of genetic variation can provide valuable insights for deciphering the relative roles of different evolutionary processes in species differentiation. However, population-genetic models for studying divergence in geographically structured species are generally lacking. Since these are the biogeographic settings where genetic drift is expected to predominate, not only are population-genetic tests of hypotheses in geographically structured species constrained, but generalizations about the evolutionary processes that promote species divergence may also be biased. Here we estimate a population-divergence model in montane grasshoppers from the sky islands of the Rocky Mountains. Because this region was directly impacted by Pleistocene glaciation, both the displacement into glacial refugia and recolonization of montane habitats may contribute to differentiation. Building on the tradition of using information from the genealogical relationships of alleles to infer the geography of divergence, here the additional consideration of the process of gene-lineage sorting is used to obtain a quantitative estimate of population relationships and historical associations (i.e., a population tree) from the gene trees of five anonymous nuclear loci and one mitochondrial locus in the broadly distributed species Melanoplus oregonensis. Three different approaches are used to estimate a model of population divergence; this comparison allows us to evaluate specific methodological assumptions that influence the estimated history of divergence. A model of population divergence was identified that fits the data significantly better than the models from the other approaches, based on per-site likelihood scores of the multiple loci, and that provides clues about how divergence proceeded in M. oregonensis during the dynamic Pleistocene.
Unlike the approaches that either considered only the most recent coalescence (i.e., information from a single individual per population) or did not consider the pattern of coalescence in the gene genealogies, the population-divergence model that best fits the data was estimated by considering the pattern of gene lineage coalescence across multiple individuals, as well as loci. These results indicate that sampling of multiple individuals per population is critical to obtaining an accurate estimate of the history of divergence so that the signal of common ancestry can be separated from the confounding influence of gene flow—even though estimates suggest that gene flow is not a predominant factor structuring patterns of genetic variation across these sky island populations. They also suggest that the gene genealogies contain information about population relationships, despite the lack of complete sorting of gene lineages. What emerges from the analyses is a model of population divergence that incorporates both contemporary distributions and historical associations, and shows a latitudinal and regional structuring of populations reminiscent of population displacements into multiple glacial refugia. Because the population-divergence model itself is built upon the specific events shaping the history of M. oregonensis, it provides a framework for estimating additional population-genetic parameters relevant to understanding the processes governing differentiation in geographically structured species and avoids the problems of relying on overly simplified and inaccurate divergence models. The utility of these approaches, as well as the caveats and future improvements, for estimating population relationships and historical associations relevant to genetic analyses of geographically structured species are discussed.

The geographic context of divergence is critical to understanding the evolutionary processes driving patterns of species diversity. Certain population structures will augment the importance of genetic drift (Wilkins and Wakeley 2002; Cherry and Wakeley 2003; Whitlock 2003) and selectively driven species divergence (e.g., differentiation associated with environmental heterogeneity: Schluter 2000; Zangerl and Berenbaum 2003; Forde et al. 2004). Yet, population genetic approaches for studying geographically structured species remain little developed, contrasting starkly with those for studying panmictic species (Drummond et al. 2002; Rokyta et al. 2004). There is an abundance of methods for detecting whether populations exhibit structure (e.g., Wright 1965; Holsinger and Wallace 2004; Latch et al. 2006), as well as some methods for determining how many distinct genetic clusters (i.e., populations) are present (e.g., Pritchard et al. 2000; Corander et al. 2003). Nevertheless, the majority of population-genetic models used to estimate genetic parameters relevant to studying species divergence (e.g., Kuhner et al. 1998; Edwards and Beerli 2000; Nielsen and Wakeley 2001; Hey and Nielsen 2004; Hamilton et al. 2005) focus on divergence between a pair of populations (or species), and do not explicitly consider the geography of divergence (for an exception see Beerli 2002), or that the relationships among populations may be hierarchical.

Inaccurate estimates of population-genetic parameters (Wakeley 2000, 2004; Wakeley and Aliacar 2001), or failure to identify important differentiation among groups (e.g., Long and Kittles 2003), may result when genetic data are analyzed using models that do not take into account that some populations are more closely related than other populations. For example, consider the schematic of montane grasshopper populations isolated on different mountain ranges (Fig. 1). Historical population associations are reflected as population-clades with their deeper nodes depicting older, regional divergence, whereas terminal branches in the population tree represent more recent and local divergence. The geography of divergence motivates a variety of hypotheses, ranging from historical demographic explanations to the purported role of selection in generating patterns of differentiation, and thereby necessitates a biologically realistic model that accommodates population structure. Moreover, estimates of genetic parameters pertinent to the divergence process, such as the timing of divergence and ancestral effective population sizes (e.g., Wall et al. 2002; Berthier et al. 2002; Felsenstein 2006), not only require a model of population divergence, but also depend upon the particular structure of the population-divergence model (Hudson 1990; Rosenberg and Nordborg 2002; Arbogast et al. 2002; Hudson and Turelli 2003; Hey and Machado 2003). For example, failure to recognize historical-population associations or that some populations share a more recent common ancestor (Fig. 1) may bias how patterns of shared variation are translated into genetic-parameter estimates because these estimates are derived from coalescent theory, where such expectations would have been generated under an inappropriate coalescent model of population divergence (Knowles 2004; Hey 2005).

Figure 1.

Schematic of a hypothetical sample of montane grasshopper populations that have recently diverged among the geographically disjunct mountain ranges. A reconstructed gene tree of a single locus (A) shows how widespread incomplete lineage sorting may obscure the actual population history (B). Testing hypotheses relevant to the processes underlying species divergence (e.g., the timing of divergence, t1 and t2, and the size of descendant populations relative to ancestral ones, θ1, θ2, θ3, and θ4, compared to θA2a, θA2b, and θA1), or identifying important differentiation among groups, requires a population-divergence model that accommodates multiple populations and their historical associations.

Models of population divergence have been key in the application of genetic data to address fundamental questions about the differentiation of panmictic species (e.g., Wu 2001; Nielsen and Wakeley 2001; Hey and Nielsen 2004). Similarly, models that accommodate multiple populations (Beerli 2002) and their historical associations (Rannala and Yang 2003) are critical to studying divergence in geographically structured species. The relationships among populations, or the population tree (Maddison 1997), may be of central interest itself, as with testing hypotheses about the geography of divergence where different population trees (topologies) represent alternative historical scenarios of differentiation within species (e.g., Milot et al. 2000; Knowles and Maddison 2002; Carstens et al. 2005a; Hickerson and Cunningham 2005; Buckley et al. 2006; DeChaine and Martin 2006), or the focus may be on estimating genetic parameters, where the estimates depend upon the specifics of the population tree (Rannala and Yang 2003). In both cases, the topology of the population-divergence model (i.e., population relationships and historical associations) is a parameter with far-reaching consequences for population-genetic analyses of the divergence process. The difficulty is that with highly subdivided species, it is not clear what the appropriate model of population divergence is.

Here we investigate divergence in a geographically sub-divided species, and specifically, confront the challenge of estimating a model of population divergence that incorporates information on historical-population associations in montane grasshoppers—a group for which geography has played an important role during their Pleistocene radiation (Knowles and Otte 2000; Knowles 2000, 2001a). The species of Melanoplus that inhabit the sky islands of the northern Rocky Mountains are currently isolated in montane meadows of different mountain ranges (Fig. 2). In addition to this contemporary subdivision, there is also geographic structuring of genetic variation reflecting historical population associations when sky island populations were displaced into glacial refugia. For example, in the broadly distributed, flightless species M. oregonensis (Knowles 2001b; Knowles and Richards 2005) the finding that sky island populations were recolonized from multiple ancestral source populations indicates that some populations are more closely related to each other than other populations. If some populations share a more recent common ancestor than others, this violates assumptions of equal and independent divergence between all pairs of populations, and expectations for gene identity (Wright 1969; Weir and Cockerham 1984; Nei 1987).

Figure 2.

Distribution of genetic variation in each of the sampled M. oregonensis populations from the Rocky Mountain sky islands (above 2000 m) from western Montana, Wyoming, and Idaho and M. triangularis; sampled populations are shown in black and numbered according to the list at the bottom of the figure. Each population has six pie graphs, representing the six loci used in this study, with COI at the 12:00 position, followed by anonymous loci 2, 6, 73, 102, and 211 in a clockwise manner. Allele frequency for each locus is represented as follows: shared alleles were ranked in order of their overall frequency across populations, and color coded according to the scale at the base of the figure, where red represents the allele with the greatest overall frequency; alleles unique to a single population are light blue and missing data are indicated by an empty circle.

The contemporary distribution of M. oregonensis sky island populations might be used to infer an a priori model of historical population associations. However, with the repeated distributional shifts during the Pleistocene in biogeographic settings like the northern Rocky Mountains (Pielou 1991), such inferences are likely to be inaccurate (Losos and Glor 2003), particularly in the absence of fossil data or paleoclimatic reconstructions of past population distributions (e.g., Hugall et al. 2002; Graham et al. 2004). In fact, geographically proximate populations have not necessarily been historically associated with each other, based on patterns of shared mitochondrial haplotypes (Knowles 2001b). Relationships among populations, and hence a model of population divergence, might be inferred directly from the gene trees reconstructed from DNA sequences. However, for recent divergences, such as those characterizing most phylogeographic studies (Avise 1994), historical associations among populations may not be obvious because of discordant gene trees and incomplete lineage sorting (Rosenberg 2002; Hudson and Turelli 2003). This limits any inference based on the gene trees to a qualitative guess, which may or may not accurately capture the biogeographic and temporal features of a species' history (Knowles and Maddison 2002). For example, notwithstanding the potential mismatch between a mitochondrial gene tree and the population tree, a model with three ancestral source populations was used to test the hypothesis that displacements into glacial refugia, as well as recolonization of previously glaciated areas, contributed to divergence in M. oregonensis (Knowles 2001a). The model provided a statistical framework for addressing the role genetic drift has played in divergence as species' distributions shifted in response to the Pleistocene glacial cycles. Nonetheless, the choice of a three-refuge model was based on the nonquantitative, visual inspection of a single gene tree.

Here we build upon this work, with two significant developments: (1) a quantitative estimate of the history of population divergence is made from (2) gene trees estimated for multiple, independent loci. We apply the method of minimizing the number of deep coalescences (Maddison and Knowles 2006), which considers explicitly both the processes of nucleotide substitution and sorting of gene lineages, to infer a model of population divergence that incorporates historical population associations for the montane grasshopper species M. oregonensis. This method is similar to other approaches (e.g., Edwards and Cavalli-Sforza 1964; Page and Charleston 1997; Nielsen 1998; Nielsen et al. 1998; Liu and Pearl 2006) in that sorting of gene lineages within populations is considered, but unlike these approaches (e.g., those that integrate over all possible gene trees), the information contained in the genealogical relationships among alleles (the topology of the estimated gene trees) is also explicitly considered (Takahata 1989; Rosenberg 2002; Degnan and Salter 2005). Population trees are also estimated from several alternative approaches and these estimates are compared to identify common features that are robust to the differing assumptions of the inference procedures (Kim 1993; Miyamoto and Fitch 1995). The biological implications of the inferred population-divergence model for understanding how differentiation proceeded during the dynamic Pleistocene, along with the difficulties of estimating a history of population divergence (as opposed to species relationships), are discussed.

The endeavor of inferring a model of population divergence that incorporates population relationships and historical associations is challenging, and caveats such as degraded accuracy with gene flow highlight areas for future theoretical development. Until the methodological constraints imposed by the lack of appropriate models for studying divergence under certain geographic conditions (e.g., highly fragmented and subdivided populations) are overcome, the evolutionary processes that predominate in such species histories (Slatkin 1985) will necessarily be underrepresented in population genetic studies of species divergence. Consequently, generalizations about the primary factors contributing to species divergence, such as the relative roles of selection and genetic drift in speciation (see Coyne and Orr 2004), may be seriously biased. This study illustrates several approaches for quantitatively estimating a population-divergence model, the framework required for testing hypotheses and estimating parameters relevant to statistical phylogeographic study (Hudson 1990; Rosenberg and Nordborg 2002; Arbogast et al. 2002; Hey and Machado 2003; Knowles 2004). The development of methods that extract information from DNA sequences by considering the stochasticity of both the process of nucleotide substitution and gene lineage coalescence is an area of largely unexplored potential.

Materials and Methods


Specimens were collected throughout the range of M. oregonensis from 14 sky islands (see Appendix) in western Montana and northwestern Wyoming (Fig. 2), and from a closely related species, M. triangularis (Acrididae: Melanoplinae: Indigens species group). Multiple individuals in each of the populations were sequenced, following the recommendations about sampling design for estimating population relationships with incomplete gene lineage sorting (Maddison and Knowles 2006; see also Takahata 1989), and multiple loci were sequenced to obtain independent realizations of the process of allele coalescence (Felsenstein 2006). Five anonymous nuclear loci and one mitochondrial gene, cytochrome oxidase I (COI), were sequenced in 81 individuals. The average length of the anonymous nuclear loci was 979 bp, and when combined with the COI data, the total length of sequence generated per individual was over 6 kb.

A genomic library was constructed to identify variable nuclear loci in Melanoplus (detailed protocol in Carstens and Knowles 2006). Total genomic DNA was extracted from one M. oregonensis using Qiagen DNeasy kits (Valencia, CA). The DNA was cut with HindIII, cloned with the Qiagen PCRplus Cloning kit (Valencia, CA), and sequenced using an ABI 3730 Automated Sequencer at the University of Michigan DNA Sequencing Core. Melanoplus-specific PCR primers were designed using Primer3 1.0 (Rozen and Skaletsky 2000) and Oligo 4.0 (Molecular Biology Insights, Inc., Cascade, CO). PCR subcloning was used to verify that the loci were single copy; in other samples the phase was determined with the program Phase 2.0 (Stephens and Donnelly 2003). Variable loci were identified with an interspecific screening set that included a single representative from M. montanus, M. oregonensis, and M. marshalli. Five loci were selected and sequenced (Table 1), along with 1147 bp of COI (see methods in Knowles 2000). The choice of loci was not based on levels of variability within M. oregonensis (which would introduce an ascertainment bias because the lower bound for allele frequencies would depend on the number of individuals used to detect variable loci; Wakeley et al. 2001). The distribution of pairwise differences among individuals for each gene showed that the interspecific screening set did not affect the distribution of polymorphism in M. oregonensis (i.e., the distribution was not truncated because of ascertainment bias).

Table 1. Description of genetic variation. Shown, from left to right, are the length of each locus, the number of segregating sites (s), the number of haplotypes, and the proportion of sites that are variable. Watterson's theta (θw) and nucleotide diversity (π) are also shown, both averaged within populations and in total across populations.

| Locus | Length (bp) | s | No. of haplotypes | Proportion variable | θw, average within population¹ | π, average within population¹ | θw, total across populations² | π, total across populations² |
|---|---|---|---|---|---|---|---|---|
| Locus 2 | 956 | 57 | 47 | 0.060 | 0.00746 | 0.00899 | 0.02667 | 0.01610 |
| Locus 6 | 1005 | 32 | 32 | 0.032 | 0.00261 | 0.00301 | 0.00818 | 0.00685 |
| Locus 73 | 853 | 22 | 32 | 0.026 | 0.00134 | 0.00161 | 0.00394 | 0.00229 |
| Locus 102 | 895 | 92 | 55 | 0.103 | 0.00387 | 0.00449 | 0.01407 | 0.00840 |
| Locus 211 | 1188 | 123 | 39 | 0.104 | 0.00501 | 0.00543 | 0.01838 | 0.01313 |

¹ Averages calculated from estimates of θw and π in each population separately.
² Estimates of θw and π calculated across populations (i.e., species-wide estimates).

Estimates of genetic diversity confirm that all loci exhibit variation relevant to genealogical analysis (Table 1). Summary statistics were estimated using the program SITES (Hey and Wakeley 1997). Theta (θ = 4Neμ) was estimated with Migrate-n (Beerli 2002) for each locus separately, and for all the data combined.
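As an illustration of the two summary statistics reported in Table 1, the following minimal sketch computes per-site Watterson's theta and nucleotide diversity from a set of aligned sequences. It is not the SITES implementation: the function names and the toy alignment are hypothetical, and the sketch assumes equal-length sequences with no gaps or missing data.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Count alignment columns that contain more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def watterson_theta_per_site(seqs):
    """Watterson's estimator S / a_n, expressed per site.

    a_n is the sum of 1/i for i = 1 .. n-1, where n is the sample size.
    """
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n / len(seqs[0])

def nucleotide_diversity(seqs):
    """Mean proportion of sites that differ, over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / len(pairs) / len(seqs[0])

# Hypothetical toy alignment (not data from this study).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
theta_w = watterson_theta_per_site(toy)
pi = nucleotide_diversity(toy)
```

Applied within each population and then averaged, or applied to the pooled sample, such functions would correspond to the two pairs of columns in Table 1, subject to the simplifying assumptions stated above.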


Genealogies for each locus were estimated using PAUP* (Swofford 2002). Maximum likelihood, with models of evolution selected with DT-ModSel (Minin et al. 2003), was used as the optimality criterion for data sets composed of unique alleles. Maximum parsimony was used to estimate genealogies for all alleles (i.e., including redundant alleles). Significant structuring of genetic variation within M. oregonensis was confirmed by an analysis of molecular variance (Excoffier 2000) on the combined loci (Arlequin 2.0, Schneider et al. 2000). Significance of the variance components was determined with 1000 permutations.

To evaluate the potential contribution of gene flow to geographic patterns of genetic variation, an isolation-with-migration model was used to estimate gene flow between all pairs of populations using the program IM (Hey and Nielsen 2004). Because this model assumes that there is no intralocus recombination (Hey and Nielsen 2004), estimates of the per-site recombination rate were calculated for each locus using SITES (Hey and Wakeley 1997), and compared to values obtained from data simulated with no recombination under the estimated model of sequence evolution for each locus. The results from this test indicated that the assumption of nonrecombining loci was justified. The priors for the isolation-with-migration model parameters were truncated as follows: the effective population size of each population, θ1 = θ2 = 10; the ancestral effective population size, θA = 30; reciprocal migration between populations, m12 = m21 = 5; and the divergence time, T = 10; priors were chosen such that the posterior probability distribution of parameter estimates was contained within the parameter space. Since the data include five anonymous nuclear loci with unknown mutation rates, the geometric mean of the ratios θi : θCOI for the four loci, 1.91 × 10⁻⁵, was used to calculate mutation rate vectors for scaling parameter estimates. The parameter space was searched using a linear heating scheme and seven Metropolis-coupled Markov chains of 2.0 × 10⁶ generations each. Given the cumbersome matrix of 105 pairwise-population comparisons, a single population-wide analysis was also conducted using Migrate-n (Beerli 2002) and is presented. Unfortunately, interpretation of gene-flow estimates from this analysis is also problematic given that the model does not take into account hierarchical geographic structure (as apparent in M. oregonensis; see results), and the effect on the estimated gene-flow rates is unknown.


Three different approaches were used to estimate a population-divergence model that incorporates population relationships and historical associations (i.e., a population tree). Only a nonquantitative guess of the population tree is possible based on visual inspection of the gene trees (Fig. 3). The geographic isolation of this flightless grasshopper species among the montane meadows of the sky islands suggests that the polyphyletic genealogies reflect the retention of ancestral polymorphism. The impact of this assumption is discussed below with regard to the differing sensitivities of the methods to its violation, especially because gene flow estimates are difficult to interpret as they do vary among populations (and methods of analysis), although many do not tend to be very high (see Supplemental Table 1).

Figure 3.

Gene genealogies for each locus with branch lengths drawn to the same scale in all trees; (A) genealogical estimate for the mitochondrial COI data and a map of M. oregonensis populations; constituent haplotypes from the various populations are color coded according to the key at left, and (B) genealogies for the anonymous nuclear loci.

Minimize deep coalescences method

A population tree was estimated using the approach based on minimizing the number of deep coalescences (i.e., the discord between gene genealogies and a population tree; Maddison 1997). This approach, like previous frequency-based approaches (e.g., Edwards and Cavalli-Sforza 1964), takes into account the genetic process generating incomplete lineage sorting (i.e., the retention and stochastic sorting of ancestral polymorphism), although the actual probabilities are not quantified under a stochastic model. Genealogical information from the reconstructed gene trees for each locus is also incorporated; historical information regarding population relationships is contained in patterns of gene lineage coalescence, even without the full sorting of gene lineages within populations (e.g., Degnan and Salter 2005; Maddison and Knowles 2006).

First, gene trees were inferred for each locus separately. Gene genealogies were estimated for all sampled alleles by a parsimony search using PAUP* (Swofford 2002); a heuristic search with 10 random addition sequence replicates and maxtrees=100 was used. The gene trees were then used to reconstruct a population tree that minimized the implied number of deep coalescences in the contained gene trees (Maddison and Knowles 2006). The tree search facility in Mesquite (Maddison and Maddison 2004) was used to find a population tree minimizing the total number of deep coalescences summed over the loci considered. The number of deep coalescences was counted assuming the estimated gene trees for each locus were unrooted. The search used an As Is taxon addition sequence, followed by subtree pruning and regrafting (SPR) branch swapping, saving 100 trees (maxtrees=100).
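The counting step can be illustrated with a minimal sketch. It is not the Mesquite implementation and simplifies in two labeled ways: gene trees are treated as rooted (the analysis above counted deep coalescences on unrooted gene trees), and candidate population trees are compared exhaustively rather than by an SPR heuristic search. Trees are nested tuples whose leaves are population names, every population is assumed to be sampled in every gene tree, and all function names and toy trees are hypothetical.

```python
def clades(tree):
    """Leaf set of every node in a nested-tuple tree, in post-order (root last)."""
    if not isinstance(tree, tuple):
        return [frozenset([tree])]
    out, leaves = [], frozenset()
    for child in tree:
        sub = clades(child)
        out.extend(sub)
        leaves |= sub[-1]
    out.append(leaves)
    return out

def gene_edges(tree):
    """Return (populations below this node, list of (child_pops, parent_pops) edges)."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    child_sets, edges = [], []
    for child in tree:
        pops, sub_edges = gene_edges(child)
        child_sets.append(pops)
        edges.extend(sub_edges)
    here = frozenset().union(*child_sets)
    edges.extend((c, here) for c in child_sets)
    return here, edges

def deep_coalescences(pop_tree, gene_tree):
    """Sum, over population-tree branches, of (gene lineages leaving the branch - 1)."""
    pop_clades = clades(pop_tree)
    root_pops, edges = gene_edges(gene_tree)
    edges.append((root_pops, pop_clades[-1]))  # lineage above the gene-tree root
    total = 0
    for clade in pop_clades[:-1]:  # every branch except the root
        # A lineage leaves the branch if it starts inside the clade and ends outside it.
        k = sum(1 for child, parent in edges
                if child <= clade and not parent <= clade)
        total += max(k - 1, 0)
    return total

def best_population_tree(candidates, gene_trees):
    """Candidate tree with the fewest deep coalescences summed over loci."""
    return min(candidates,
               key=lambda t: sum(deep_coalescences(t, g) for g in gene_trees))

# Hypothetical three-population example with three loci.
candidates = [(("A", "B"), "C"), (("A", "C"), "B"), (("B", "C"), "A")]
gene_trees = [((("A", "B"), "A"), "C"),
              (("A", "B"), ("C", "C")),
              ((("A", "C"), "B"), "B")]
best = best_population_tree(candidates, gene_trees)
```

In this toy example no gene tree shows reciprocal monophyly of all populations, yet the summed counts still favor one population tree, which is the property the method relies on.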

Shallowest divergence clustering method

This approach follows from Takahata's (1989) observation of a high consistency probability between a gene tree and population tree based on the order of inter-population coalescence. Population relationships were estimated using Mesquite's cluster analysis facility that grouped populations together based on their most similar pair of gene sequences (not their average pairwise sequence divergence; see also Edwards 1997), under the assumption that there is a correspondence between the number of nucleotide differences between sequences and the order of inter-population coalescence (Takahata and Nei 1985). The distance between two clades is similarly defined, and for multiple loci, the distance between two clusters is the average of the distances based on the individual loci (for details see Maddison and Knowles 2006).
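A minimal sketch of this clustering rule is given below. It is not Mesquite's cluster analysis facility: the function names and toy sequences are hypothetical, uncorrected counts of nucleotide differences are assumed as the distance, and every population is assumed to be sampled at every locus.

```python
from itertools import combinations

def hamming(a, b):
    """Number of nucleotide differences between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))

def shallowest(cluster_a, cluster_b):
    """Mean over loci of the smallest between-cluster sequence distance."""
    per_locus = [min(hamming(s, t) for s in seqs_a for t in seqs_b)
                 for seqs_a, seqs_b in zip(cluster_a, cluster_b)]
    return sum(per_locus) / len(per_locus)

def cluster(populations):
    """Agglomerative clustering on the shallowest divergence.

    populations maps a name to a list of per-locus sequence lists.
    Returns the population tree as a nested tuple of names.
    """
    clusters = dict(populations)
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: shallowest(clusters[pair[0]], clusters[pair[1]]))
        # A merged cluster pools the sequences of its members, locus by locus,
        # so the distance between clades is defined in the same way as between populations.
        merged = [sa + sb for sa, sb in zip(clusters.pop(a), clusters.pop(b))]
        clusters[(a, b)] = merged
    return next(iter(clusters))

# Hypothetical data: three populations, two loci.
toy = {
    "A": [["AAAA", "AAAT"], ["GGGG"]],
    "B": [["AAAT", "AATT"], ["GGGC"]],
    "C": [["TTTT"], ["CCCC"]],
}
tree = cluster(toy)
```

Because only the single most similar pair of sequences per locus contributes, one shared or recently exchanged haplotype can pull two populations together, which is worth keeping in mind when gene flow cannot be excluded.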

Minimum average genetic distance method

The population tree was inferred from a matrix of patristic distances among populations generated with the minimum evolution criterion (Rzhetsky and Nei 1992) in PAUP* (Swofford 2002). A heuristic search with MAXTREES = 1000 was used with an As Is taxon addition sequence for the initial tree followed by TBR branch swapping. For each locus, corrected genetic distances were calculated for all pairwise comparisons among haplotypes using models selected with DT-ModSel (Minin et al. 2003). Average pairwise distances were computed among all populations for each locus, and the average of these distances was used to infer the population tree using the minimum evolution criterion (Rzhetsky and Nei 1992).
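The distance matrix underlying this method can be sketched as follows. The sketch stops at the matrix (the minimum evolution search itself was done in PAUP*) and, as a labeled simplification, uses uncorrected p-distances rather than the model-corrected distances described above. Function names and toy sequences are hypothetical.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def average_distance(pop_a, pop_b):
    """Mean over loci of the mean pairwise distance between two populations."""
    per_locus = []
    for seqs_a, seqs_b in zip(pop_a, pop_b):
        dists = [p_distance(s, t) for s in seqs_a for t in seqs_b]
        per_locus.append(sum(dists) / len(dists))
    return sum(per_locus) / len(per_locus)

def distance_matrix(populations):
    """Average distance for every pair of populations, keyed by sorted name pair."""
    names = sorted(populations)
    return {(a, b): average_distance(populations[a], populations[b])
            for a, b in combinations(names, 2)}

# Hypothetical data: three populations, two loci.
toy = {
    "A": [["AAAA", "AAAT"], ["GGGG"]],
    "B": [["AAAT", "AATT"], ["GGGC"]],
    "C": [["TTTT"], ["CCCC"]],
}
matrix = distance_matrix(toy)
```

In contrast to clustering on the shallowest divergence, every sequence pair contributes to each entry, so the matrix reflects overall divergence rather than the most recent coalescence between populations.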


An analysis of molecular variance shows that differentiation among populations, as well as among regions, explains a significant proportion of the genetic variation (Table 2), indicating that there is significant geographic structuring of genetic variation. However, this structuring of variation is not readily apparent from a visual inspection of the individual gene trees (Fig. 3). The genealogical history of a single locus is subject to many stochastic effects, which is why data from multiple independent loci are important to offer independent information for estimating a population (or species) tree (Maddison 1997), assuming that genealogical relationships among alleles are discernible. Each of the six loci exhibits considerable variation (Table 1) and genealogical structure is apparent in the gene trees estimated for all the loci (Fig. 3); average divergence within and between populations was 0.52% and 1.23%, respectively (not shown).

Table 2. Analysis of molecular variance (AMOVA) of the multilocus dataset; the partitioning of genetic variance among groups is based on the hierarchical population model inferred by minimizing the number of deep coalescences (see Fig. 5), in which the northern sky islands (shown in shades of blue) are compared against the more southern populations.

| Source of variation | df | Sum of squares | Variance components | F-statistics | Total (%) | P-value |
|---|---|---|---|---|---|---|
| Among groups | 2 | 14448.4 | 72.9 | FCT = 0.11 | 11.49 | <0.06 |
| Among populations within groups | 12 | 42630.7 | 305.0 | FSC = 0.54 | 48.04 | <0.0001 |
| Within populations | 147 | 37776.9 | 256.9 | FST = 0.59 | 40.47 | <0.0001 |
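The relationship between the variance components and the fixation indices in Table 2 can be checked with a short sketch. It assumes the standard three-level AMOVA definitions (FCT as the among-group share of total variance, FSC as the among-population share of within-group variance, and FST as the share not found within populations); the function name is hypothetical and this is not Arlequin's code.

```python
def amova_f_statistics(v_groups, v_pops, v_within):
    """Fixation indices and percentage of variation from AMOVA variance components."""
    total = v_groups + v_pops + v_within
    return {
        "F_CT": v_groups / total,
        "F_SC": v_pops / (v_pops + v_within),
        "F_ST": (v_groups + v_pops) / total,
        "percent": [100 * v / total for v in (v_groups, v_pops, v_within)],
    }

# Variance components as reported in Table 2.
result = amova_f_statistics(72.9, 305.0, 256.9)
```

With the tabulated components, the computed indices and percentages agree with the reported values to within about 0.01; small differences in the last digit may reflect rounding of the published components.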

The lack of population monophyly and discordance among gene genealogies (Fig. 3) is consistent with expectations based on the sorting of gene lineages within populations (Hudson and Turelli 2003). To reconstruct a population tree for recent divergence, we must consider the processes underlying the messy tangle of gene trees, as when estimating other parameters, such as the timing of species divergence (e.g., Edwards and Beerli 2000; Takahata and Satta 2002; Yang 2002; Rannala and Yang 2003; Wall 2003; Hey and Nielsen 2004). In addition to the stochastic sorting of gene lineages by genetic drift (Nordborg 2001; Wakeley 2003), gene flow might also contribute to the discord between a population tree and the estimated gene trees. However, shared haplotypes are not restricted to geographically proximate populations (Fig. 2), and pairwise migration estimates tend to be low (Supplementary Table S1), although they do vary among populations, making it difficult to rule out the possibility that the gene trees reflect some low level of gene flow. The potential influence of migration on the estimated population relationships is expected to vary depending on each method's assumptions; therefore, the methods differ in their sensitivity and in their potential to yield misleading inferences about the underlying history of population divergence (both are discussed in detail below).


There are some commonalities in the population trees estimated from the three different approaches; however, they are not congruent (Fig. 4). The population tree inferred by the shallowest divergence clustering differs substantially from those inferred by the other two methods. The population trees inferred from minimizing the deep coalescences and the minimum average genetic distance are generally congruent in that populations within the northern and southern parts of the range tend to cluster together; however, the two methods differ in that the tree estimated by minimizing the deep coalescences (Fig. 4A) results in a latitudinal pattern in which the southern populations are basal to the more northern populations, whereas the method of minimizing the average genetic distance suggests the converse (Fig. 4C). The population relationships estimated by minimizing the number of deep coalescences (and to a lesser extent, the population tree estimated from minimizing the average genetic distance) are also generally congruent with the previously hypothesized model of population divergence (Fig. 5) based on a nonquantitative interpretation of the gene tree estimated for COI (Knowles 2001b), and to which patterns of genomic variation from an analysis of AFLPs were compared (Knowles and Richards 2005). There is a very close correspondence between the previous population assignments and the current population tree (Fig. 5) with regard to the common ancestry of populations from the northern (shown in blue) and southern (shown in green and pink) parts of the M. oregonensis range, with the exception of the southerly population from the Absaroka Mountains, which was not grouped with other southern populations in past analyses (Knowles 2001b; Knowles and Richards 2005). This structuring of populations evident in the estimated population trees (Fig. 4A and 4C) is consistent with hypothesized regional historical associations reminiscent of population displacements into multiple glacial refugia; however, such structure is not obvious in the population tree estimated by the shallowest divergence clustering (Fig. 4B).

Figure 4.

Population models of the species history of M. oregonensis estimated by (A) minimizing the number of deep coalescence, (B) clustering based on the shallowest divergence, and (C) the minimum average genetic distance. The population relationships and historical associations depicted in the three population trees are derived by considering (to varying degrees) the stochasticity of genetic processes. They are not (and should not be confused with) gene trees, which are often used as a reflection of a species history.

Figure 5.

Projection of the model of population divergence for M. oregonensis onto the geographic landscape of the northern Rocky Mountains, where the closely related species M. triangularis is indicated in black; the tree legend (shown on the left) corresponds to the population tree estimated by minimizing the number of deep coalescences. The dashed line identifies congruence with a previously hypothesized model of regional divergence (Knowles 2001b); sky island populations not included in that model are marked with an asterisk.


The three methods used to estimate a model of population relationships and historical associations in M. oregonensis differ in two fundamental aspects: (1) the degree to which they consider the information contained in DNA sequences, and (2) the extent to which the inference procedure is sensitive to assumptions about the processes underlying the geographic structuring of genetic variation. The impacts of these differences are expected to influence not only the ability to resolve the population tree, but also the accuracy of the estimated population relationships from each of the methods. Both may contribute to the inconsistencies between the models of population divergence estimated from the different methods (Fig. 4).

Both the shallowest divergence clustering and minimizing the number of deep coalescences take into account the stochastic sorting of gene lineages by genetic drift, and incorporate information inherent in the genealogical relationships among alleles, when estimating the population tree. However, by utilizing only information from the first coalescence between populations (Takahata 1989), the shallowest divergence clustering method does not use information contained in the pattern of interspecific coalescence from the multiple gene copies sampled per population, whereas the method of minimizing the number of deep coalescences uses information contained across the entire gene genealogy (Maddison and Knowles 2006), albeit not in a full probabilistic framework (see Degnan and Salter 2005). These two methods contrast with minimizing the average genetic distance, in which information contained in the genealogical relationships among DNA sequences is not considered.
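The deep coalescence criterion can be made concrete with a small sketch. It counts, for each branch of a candidate population tree, the number of gene lineages that leave the top of that branch without having coalesced, minus one, and sums these extra lineages over branches and loci (following the definition in Maddison 1997). This is only an illustration under stated assumptions, not the software used in this study: trees are written as nested Python tuples, and the function names, toy gene tree, and population labels are all hypothetical.

```python
def leaf_set(tree):
    """Set of leaf labels in a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def population_clades(pop_tree):
    """Population sets below every non-root branch of the population tree."""
    clades = []
    def walk(node, is_root):
        if not is_root:
            clades.append(leaf_set(node))
        if not isinstance(node, str):
            for child in node:
                walk(child, False)
    walk(pop_tree, True)
    return clades

def lineages_leaving(gene_tree, clade, pop_of):
    """Number of gene lineages present at the top of the branch above `clade`."""
    pops = {pop_of[copy] for copy in leaf_set(gene_tree)}
    if pops <= clade:
        return 1                      # whole subtree has coalesced within the clade
    if isinstance(gene_tree, str):
        return 0                      # gene copy sampled outside the clade
    return sum(lineages_leaving(child, clade, pop_of) for child in gene_tree)

def deep_coalescences(gene_tree, pop_tree, pop_of):
    """Extra lineages summed over all branches of the population tree."""
    return sum(max(lineages_leaving(gene_tree, clade, pop_of) - 1, 0)
               for clade in population_clades(pop_tree))

def score_population_tree(gene_trees, pop_tree, pop_of):
    """Total deep coalescences across independent loci (lower fits better)."""
    return sum(deep_coalescences(g, pop_tree, pop_of) for g in gene_trees)

# Toy data (hypothetical): two gene copies from population A, one each from B and C.
pop_of = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
gene_tree = ((("a1", "b1"), "a2"), "c1")
for candidate in ((("A", "B"), "C"), (("A", "C"), "B"), (("B", "C"), "A")):
    print(candidate, deep_coalescences(gene_tree, candidate, pop_of))
```

In this toy case the candidate grouping A with B requires the fewest deep coalescences even though population A is not monophyletic in the gene tree, which illustrates how unsorted lineages can still carry signal about population relationships.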

While the types of information extracted from the data may influence the estimated population tree, inaccurate population relationships can also result when processes other than the stochastic sorting of gene lineages (such as gene flow) contribute to the lack of concordance between the gene genealogies and the population boundaries. Gene flow may significantly degrade the accuracy of some inference methods, even when levels of gene flow are low enough that the species phylogeny can still be considered fundamentally a branching process (Maddison 1997), as opposed to a network (see Moret et al. 2004a; Nakhleh et al. 2005). Methods that rely on the most recent common ancestor between populations (Takahata 1989; Rosenberg 2002) are particularly prone to misinterpreting migration as evidence of population relationship, since only one gene copy per population forms the basis for inferring population relationships. The accuracy of the shallowest divergence method depends critically on a correspondence between the number of nucleotide differences between sequences and the order of interspecific coalescence (Takahata and Nei 1985). Inspection of the population tree derived via the shallowest divergence method shows that geographically proximate populations, such as the Big Belt and Little Belt Mountains, are most closely related to each other (Fig. 4B), whereas these populations are not estimated to share a most recent common ancestor based on minimizing either the average genetic distance or the number of deep coalescences (Figs. 4A and 4C). The influence of rare migration events would be significantly lessened with an approach that explicitly considers information from multiple individuals (i.e., gene copies) per population, as when minimizing the number of deep coalescences. However, if gene flow rather than common ancestry predominates in shaping the geographic distribution of haplotypes, this method is also expected to give spurious results, although with such a mosaic structure it would be inappropriate to represent the model of population divergence as a bifurcating tree (Moret et al. 2004b; Nakhleh et al. 2005), irrespective of the inference procedure. Because the information content in the multiple DNA sequences is reduced to a single variable when estimating the historical relationships by minimizing the average genetic distance among populations, the confounding signal of migration and common ancestry cannot be distinguished. This contrasts with the method of minimizing the number of deep coalescences, where the information contained in the independent loci (see also Jennings and Edwards 2005) and the pattern of coalescence of each individual gene lineage can provide evidence of population relationships (Maddison and Knowles 2006; Carstens and Knowles 2007a).
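The contrast between the two distance-based summaries can be sketched as follows. For each pair of populations, the code computes the minimum distance between any two gene copies (the quantity on which a shallowest divergence clustering would rest) and the mean distance over all copies (the single variable used by an average genetic distance approach). The uncorrected count of differing sites, the function names, and the toy sequences are assumptions for illustration; the study's actual distance measure is not reproduced here.

```python
from itertools import combinations

def hamming(s, t):
    """Number of differing sites between two aligned sequences of equal length."""
    return sum(a != b for a, b in zip(s, t))

def between_population_distances(seqs_by_pop):
    """Map each population pair to (minimum, mean) distance between gene copies."""
    out = {}
    for p, q in combinations(sorted(seqs_by_pop), 2):
        d = [hamming(s, t) for s in seqs_by_pop[p] for t in seqs_by_pop[q]]
        out[(p, q)] = (min(d), sum(d) / len(d))
    return out

# Toy alignment (hypothetical): population B carries one copy identical to a
# haplotype in A, as could arise from a single migrant or retained ancestral allele.
seqs_by_pop = {
    "A": ["AAAAAA", "AAAAAT"],
    "B": ["CCCCAA", "AAAAAA"],
    "C": ["CCCCAT", "CCCCAA"],
}
for pair, (dmin, dmean) in between_population_distances(seqs_by_pop).items():
    print(pair, dmin, dmean)
```

A single shared haplotype drives the minimum distance between A and B to zero, whereas the mean folds that copy into one number along with all the others; neither summary by itself reveals whether the shared copy reflects migration or common ancestry.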

These varying sensitivities obviously have consequences for the accuracy of the estimated population relationships: the population tree depends on the method used. To quantify whether the differences in the population trees are significant, we asked whether the three population trees differ with respect to the degree of concordance between the DNA sequences and the respective population trees (i.e., are the three models equally good explanations of the data; Goldman et al. 2000). The likelihood of the data was compared under the competing population trees using a Shimodaira-Hasegawa test (2001) where the per-site −lnL scores were calculated under each population tree (Table 3). The population tree estimated by minimizing the number of deep coalescences had the highest likelihood given the data, and the decrease in fit (the difference in lnL score) under the population-divergence models estimated by the shallowest divergence clustering and the minimum average genetic distance is significant (Table 3). This suggests that the population tree estimated by minimizing the number of deep coalescences is a better explanation for the data.

Table 3.  Comparison of the fit of the data under the three population trees estimated by (a) minimizing the number of deep coalescence, (b) the shallowest divergence clustering, and (c) minimizing the average genetic distance, using a Shimodaira-Hasegawa (2001) test, showing that the population tree estimated by minimizing the number of deep coalescence fit the data best (i.e., had the highest likelihood; shown in bold). For this test, the per-site −lnL scores were calculated under the competing population models. The decrease in the fit of the data (based on a site-by-site −lnL score) under the two other population trees was significant (P < 0.05), as determined using RELL bootstrap resampling with 1000 replicates.
| Method | lnL score | Decrease in −lnL | P |
| --- | --- | --- | --- |
| Minimizing the number of deep coalescence | −16624.657 | | |
| Shallowest divergence clustering | −16734.550 | 109.893 | 0.001 |
| Minimizing the average genetic distance | −16797.657 | 173.000 | 0.040 |
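The logic of a Shimodaira-Hasegawa comparison with RELL resampling can be sketched as below. Per-site log-likelihoods are computed once under each tree; sites are then resampled with replacement and the fixed per-site scores re-summed, so no likelihood is re-optimized. Each tree's bootstrap totals are centered on their own mean before the differences from the best tree are compared with the observed difference. This is a minimal sketch under those assumptions, not the program used for Table 3; the function name and the toy per-site scores are hypothetical.

```python
import random

def sh_test(site_lnl, n_boot=1000, seed=0):
    """site_lnl maps tree name -> list of per-site lnL values (equal lengths).

    Returns tree name -> (observed lnL deficit relative to the best tree, P value).
    """
    rng = random.Random(seed)
    names = list(site_lnl)
    n_sites = len(site_lnl[names[0]])
    totals = {m: sum(site_lnl[m]) for m in names}
    best = max(totals.values())
    observed = {m: best - totals[m] for m in names}

    # RELL step: resample site indices and re-sum the existing per-site scores.
    boot = {m: [] for m in names}
    for _ in range(n_boot):
        idx = [rng.randrange(n_sites) for _ in range(n_sites)]
        for m in names:
            col = site_lnl[m]
            boot[m].append(sum(col[i] for i in idx))

    # Centering step: subtract each tree's bootstrap mean so that, on average,
    # all trees are equally good, as the null hypothesis requires.
    centered = {}
    for m in names:
        mean = sum(boot[m]) / n_boot
        centered[m] = [x - mean for x in boot[m]]

    results = {}
    for m in names:
        exceed = 0
        for r in range(n_boot):
            delta = max(centered[k][r] for k in names) - centered[m][r]
            if delta >= observed[m]:
                exceed += 1
        results[m] = (observed[m], exceed / n_boot)
    return results

# Toy per-site scores (hypothetical): tree "b" fits half of the sites much worse.
toy = {"a": [-1.0] * 200, "b": [-1.0] * 100 + [-3.0] * 100}
print(sh_test(toy, n_boot=200))
```

A small P value for a tree indicates that its deficit relative to the best tree is larger than expected from resampling sites alone, which is the sense in which the decrease in fit in Table 3 is reported as significant.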


The quantitative estimate of a population-divergence model in M. oregonensis illustrates two fundamental shifts in how the processes underlying the geographic structuring of genetic variation might be studied: using multiple realizations of the past (i.e., independent loci) to estimate the history of species divergence, and incorporating the process of gene lineage sorting into the procedure for estimating population relationships (as opposed to inferring them directly from the gene tree topologies). The observed discordance among loci in the pattern of shared alleles across populations (Fig. 2), and the lack of obvious population relationships, or hierarchical patterns of divergence in the genealogies (Fig. 3), are no doubt emblematic of the challenges facing phylogeographic study of recently diverged populations (and species) (Avise 1994). Despite any intuitive appeal of inferring history directly through qualitative visualization of gene trees (Avise et al. 1987), it is not tenable when (and as expected) the same history leads to very different gene genealogies (Rosenberg and Nordborg 2002; Hey and Machado 2003; Knowles 2004).

As discussed below, although the methods used in this study are still in their infancy (and may be joined by new probabilistic procedures in the near future), this development has significant implications. The estimated population-divergence model (Fig. 5) not only provides a framework for studying divergence (Avise 2004), but the model itself is also built upon the specific events shaping the history of M. oregonensis, thereby avoiding the potential problems of using overly simplified and inaccurate population-divergence models.


Coalescent simulations (Takahata 1989; Rosenberg 2002; Maddison and Knowles 2006) suggest that both minimizing the number of deep coalescences and the shallowest divergence clustering approaches are able to recover the population tree at levels of variability and genealogical discordance similar to those in the empirical data. However, the population trees estimated with these approaches are incongruent (Fig. 4). Because the sampling design used here to infer the population relationships in M. oregonensis (i.e., multiple individuals for each of multiple loci) provides the highest consistency probability between the gene trees and population tree (see Fig. 5, Maddison and Knowles 2006), one explanation for the incongruent population trees may be that the M. oregonensis species history does not follow a strict isolation model. If this is the case, the results from the shallowest divergence clustering are particularly suspect given that this assumption is critical to maintaining a high consistency probability between a gene tree and population tree based on the number of nucleotide differences between sequences (Takahata and Nei 1985). By considering the entire spectrum of gene lineage coalescence (Maddison and Knowles 2006), as opposed to just the first interpopulation coalescence (Takahata 1989), historical population associations are less likely to be obscured by the confounding influence of low levels of gene flow.

Even if population divergence proceeds with some gene flow, the information contained in the gene trees and the relationships among alleles still apparently provides signal relevant to estimating population relationships in M. oregonensis. The fit of the data to the population relationships estimated by minimizing the number of deep coalescences indicates that this population tree provides a significantly better explanation for the observed nucleotide substitutions across loci (Table 3) than the population trees estimated by the other methods, including the average genetic distance method, which does not take into account the process of gene lineage coalescence. Furthermore, the estimated levels of gene flow do not support a predominant role for migration (online supplementary material Table S1), and given the geographic isolation of sky island populations in M. oregonensis, gene flow is not expected to govern patterns of geographic variation (Figs. 2 and 3).


The topological complexity of the northern Rocky Mountains creates a geographic setting that, like traditional archipelago systems (e.g., Hollocher 1998; Losos et al. 1998; Gillespie 2002; Glor et al. 2005; Jordal et al. 2006), is expected to be conducive to species divergence (e.g., Abbott et al. 2000; Knowles 2000; Masta and Maddison 2002; Demboski and Sullivan 2003; Carstens 2005a; DeChaine and Martin 2006). However, in the case of taxa affected by shifting habitat distributions in response to the Pleistocene glacial cycles (e.g., Ritchie et al. 2001; Comes and Kadereit 2003; Ayoub and Riechert 2004; Carstens et al. 2004; Galbreath and Cook 2004; Schönswetter et al. 2004; Weir and Schluter 2004; Yeh et al. 2004; Hickerson and Cunningham 2005; Knowles and Richards 2005; Smith and Farrell 2005; Dolman and Moritz 2006; Weir 2006), both contemporary population distributions and historical associations among populations are essential components for studying species divergence. With the dynamic nature of sky island systems (i.e., isolated montane habitats), the geography of divergence may be characterized by an older and a more recent population structure that reflect the divergence associated with displacement into glacial refugia and recolonization of the montane habitats, respectively (Haffer 1969; Hewitt 1996, 2000).

The population relationships estimated for M. oregonensis (Fig. 5) provide a window into past distributional shifts, identifying not only which populations have shared a recent evolutionary history, but also which populations have remained relatively isolated during the past. This framework of hierarchical structure (i.e., divergence at different spatial or temporal scales) can be used to address a number of interesting questions itself and can also be coupled with other types of data to test a variety of hypotheses. For example, the model of population divergence can be used to estimate genetic parameters relevant to understanding how species were able to diversify during the dynamic Pleistocene, such as the relative contributions of drift-induced divergence associated with glacial versus interglacial periods to differentiation in M. oregonensis (Knowles and Richards 2005). Without a biologically realistic model that accommodates the hierarchical structure of the populations, conclusions regarding the partitioning of genetic variances are suspect (Long and Kittles 2003). Integration of this model of population divergence with information on climatic reconstructions (e.g., Hugall et al. 2002) or incorporation of geographic features (landscapes, barriers, organism-specific distances) (e.g., Kidd and Ritchie 2000) might also be used to identify the likely location of refugia, as well as the factors that structured how populations moved with the advance and retreat of glaciers. Such a context will be important for future comparative analyses, where the response of individual species can be examined in a predictive framework such that the interaction between rapid climate change and species ecology can be examined (e.g., Carstens et al. 2005b; Hickerson and Cunningham 2005; DeChaine and Martin 2006).


This study provides a glimpse of both the future promise and some of the challenges of using sequence data from multiple loci to estimate a model of divergence where populations have historically been structured across the geographic landscape (Fig. 5). Just as accounting for the stochasticity of genetic processes in recently derived species has revolutionized how population-genetic parameters are estimated (e.g., Edwards and Beerli 2000; Takahata and Satta 2002; Yang 2002; Rannala and Yang 2003; Wall 2003; Hey and Nielsen 2004), accurate estimates of population relationships are possible when the process of gene lineage coalescence is considered (Takahata 1989; Nielsen et al. 1998; Rosenberg 2002; Maddison and Knowles 2006; Carstens and Knowles 2007b). However, the lack of the expected correspondence (Maddison and Knowles 2006) between the population trees estimated by minimizing the number of deep coalescences and by the shallowest divergence clustering (Fig. 4) indicates the potential confounding influence of migration on estimated population relationships. This highlights the need for methods that incorporate not only the process of gene lineage sorting, but also gene flow, into the procedure for estimating population trees (as with methods used to obtain accurate estimates of species divergence times; e.g., Edwards and Beerli 2000; Hey and Nielsen 2004). In the future, estimation of population relationships in a full probabilistic (Maddison 1997), or possibly a Bayesian framework (Liu and Pearl 2006), will also be preferable to the summary statistic approach applied here and raises the intriguing possibility of finding the species tree with the highest posterior probability or a set of possible population trees consistent with gene trees (Degnan and Salter 2005).


The lack of a framework for statistical phylogeographic inference in geographically structured species not only constrains the types of hypotheses that can be addressed, but also introduces a bias in generalizations, synthesized from empirical studies, about the evolutionary processes that predominate in species divergence, such as the relative importance of selection and genetic drift (see Coyne and Orr 2004). One of the primary obstacles in population genetic approaches to studying complex species histories has been the challenge of establishing a population-divergence model that incorporates population relationships and historical associations, as opposed to dividing the data into a series of analyses of population pairs (e.g., Hey 2005; Carstens and Knowles 2007a). Recent analyses demonstrate that despite widespread incomplete lineage sorting, signal of population relationships persists (Rosenberg 2002; Maddison and Knowles 2006). Application of these approaches to multilocus data in the montane grasshopper M. oregonensis illustrates how such a population-divergence model might be inferred. Although several caveats warrant caution in this endeavor, this study signifies an important shift in how geographically structured species can be studied: in this case, estimation of a model of population divergence (e.g., Milot et al. 2000; Hickerson and Cunningham 2005; Buckley et al. 2006; DeChaine and Martin 2006) that incorporates the geographic structure associated with displacements into glacial refugia and recolonization of the sky islands of the northern Rocky Mountains. Historical signal is separated from stochastic noise to estimate population relationships (Fig. 5) by relying on multiple loci and taking into account the genetic processes that result in genealogical discord. This model of population divergence represents a significant advance over the common reliance on a single realization of the past (that is, a literal interpretation of one gene tree) and provides a framework for testing hypotheses about differentiation across geographically complex landscapes (Wright 1931; Mayr 1963).

Associate Editor: P. Sunnucks


Thanks to members of the Knowles lab for their input, especially to M. Keat for assistance in the laboratory, as well as to Paul Sunnucks, Scott Edwards, and two anonymous reviewers for their helpful comments and suggestions. The research was funded by the following awards to LLK: a National Science Foundation grant (DEB-0447224), the Elizabeth Caroline Crosby Fund, National Science Foundation ADVANCE Project, University of Michigan, and a grant from the University of Michigan (Office of the Vice President for Research).


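Appendix Table. Number of individuals sequenced from each of the 14 sky island populations of M. oregonensis and M. triangularis for the six loci.

| Sky island populations | Total | COI | Locus 2 | Locus 6 | Locus 73 | Locus 102 | Locus 211 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| M. oregonensis | | | | | | | |
| Absaroka Range, Carbon Co., MT | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| Madison Range, Madison Co., MT | 5 | 5 | 5 | 5 | 5 | 4 | 5 |
| Big Snowy Mtns., Fergus Co., MT | 8 | 8 | 8 | 3 | 7 | 8 | 4 |
| Gallatin Range, Teton Co., WY | 5 | 5 | 4 | 2 | 6 | 5 | 5 |
| Mission Range, Missoula Co., MT | 6 | 6 | 5 | 3 | 6 | 6 | 4 |
| Crazy Mtns., Sweet Grass Co., MT | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| Big Belt Mtns, Meagher Co., MT | 5 | 5 | 3 | 4 | 4 | 2 | 0 |
| Gravelly Range, Madison Co., MT | 5 | 5 | 5 | 0 | 5 | 5 | 5 |
| Livingston Range, Glacier Co., MT | 6 | 6 | 5 | 6 | 6 | 5 | 6 |
| Beaverhead Mtns., Fremont Co., ID | 5 | 5 | 5 | 2 | 5 | 5 | 5 |
| Tobacco Root Mtns., Madison Co., MT | 6 | 6 | 6 | 0 | 6 | 6 | 6 |
| Wind River Range, Teton Co., WY | 5 | 5 | 5 | 4 | 5 | 5 | 5 |
| Little Belt Mtns., Cascade Co., MT | 5 | 5 | 4 | 5 | 5 | 5 | 0 |
| Elkhorn Mtns., Jefferson Co., MT | 5 | 5 | 5 | 0 | 5 | 3 | 5 |
| M. triangularis | | | | | | | |
| Swan Range, Flathead Co., MT | 5 | 5 | 5 | 5 | 5 | 5 | 5 |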