(Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, and College of Life Sciences, Zhejiang University, Hangzhou 310058, China)
(Laboratory of Systematic & Evolutionary Botany and Biodiversity, Institute of Plant Sciences, and Conservation Center for Gene Resources of Endangered Wildlife, Zhejiang University, Hangzhou 310058, China)
Abstract Phylogeography has been a major focus of evolutionary biology in recent years, with many important advances for Chinese species. In this issue, we have collected 11 phylogeographic studies of plants by Chinese laboratories. We further synthesize the main findings and patterns emerging from these and previous phylogeographic studies in China and ask where phylogeographic research should be directed in the coming years. Numerous examples show that phylogeographic patterns in China do not follow the expected large-scale expansion–contraction pattern, mirroring the geological record, which indicates that no unified ice sheet developed in China during the Quaternary Period. Instead, regional expansions and intraspecific divergences during the Quaternary oscillations are very common in most of the species studied. Different intraspecific lineages or alleles (haplotypes) were detected in multiple localized refugia, from which regional or local expansions are likely to have started. Hybridization and introgression are frequent between intraspecific lineages or between different species. We also review computational methods for phylogeographic analyses. Despite the great progress made in recent years, much remains to be discovered about the spatial–temporal dimensions and underlying speciation mechanisms of Chinese plants. Phylogeographic studies represent a key link connecting genus-level phylogeny (macroevolution) with speciation and adaptation (microevolution). Therefore, we advocate that: (i) phylogeographic studies of plants in China should be directed toward closely related species or a monophyletic group (for example, a genus or a section) in the coming years; and (ii) population genetic data based on direct sequencing of multiple loci, especially those from the nuclear genome, together with statistical tests, should be widely adopted. 
The recovered intraspecific divergences and phylogeographic patterns of multiple species may allow us to better understand the high plant diversity in China and to set up concrete hypotheses for studying plant speciation and diversification mechanisms in this region.
Phylogeography is defined as “a field of study concerned with principles and processes governing the geographic distributions of genealogical lineages, especially those within and among closely related species” (Avise, 2000). This definition implies that the field was intended to unite macroevolution (phylogenetics) and microevolution (speciation) (Hewitt, 2000, 2004). In other words, phylogeography combines phylogenetic analyses with population data and develops new approaches to address questions that span these two areas (Avise, 2009), namely the responses of organisms to past climatic changes (locations of glacial refugia and range expansions), population biology (bottlenecks, migration rates, population divergence) and evolutionary scenarios within and between species (hybridization, introgression, and divergence times). The term ‘refugium’ was first used to describe the contracted ranges of plants during the last glacial maximum (LGM), and identifying refugia is one of the most important aims of phylogeography. The word is now applied to numerous environmental contexts, including tropical forests (Haffer, 1969). However, in contrast to the retreats of warm- or temperate-adapted taxa during the glacial phases (Frenzel, 1968), cold-adapted taxa may be confined to smaller regions during inter- or post-glacial periods (Stewart et al., 2010). Therefore, the term ‘refugium’ is now widely applied to any region into which a species contracted its range during its historical dynamics (Keppel et al., 2012). Through genetic analyses of allele distributions within a particular species, scientists can identify where the species survived before its recent expansion (Comes & Kadereit, 1998; Brunsfeld et al., 2001; Soltis et al., 2006; Birks & Willis, 2008; Avise, 2009; Stewart et al., 2010). It should be noted that the location of glacial refugia depends strongly on the adaptations and tolerances of the particular species being studied (Stewart et al., 2010). 
For example, refugia for temperate plants in Europe were identified in different regions in the southern peninsulas and the east (Weiss & Nuno, 2006). From these refugia, the species may have expanded and recolonized northward after the glacial ages ended, during the Holocene. Increasing evidence also suggests that some species survived in cryptic northern refugia at higher latitudes during the glacial ages in both Europe and North America (e.g., Anderson et al., 2006; Stewart et al., 2010; Parducci et al., 2012). The other important aim of phylogeography is to clarify the evolutionary history of a species or a group of closely related species (e.g., Hewitt, 2004; Hickerson et al., 2010). This is because most species have experienced divergent or “contacting” evolution during glacial contractions and postglacial expansions (Avise, 2000). For example, during the glacial ages, large-scale extinctions may have occurred within a particular species, and long-term isolation in separate refugia, with repeated bottlenecks, would have led to divergent evolution within a single species (Soltis et al., 2006; Hickerson et al., 2010). This may also have reinforced divergence between closely related species that had previously been only weakly differentiated (e.g., Avise, 2000; Stewart et al., 2010). Conversely, such climatic changes may have brought allopatric species or intraspecific lineages together into a single refugium or ‘melting pot’ during glacial phases or postglacial expansions (e.g., Hickerson et al., 2010; Jia et al., 2012). These differentiated species or intraspecific lineages and populations may have come into contact and merged again, possibly resulting in hybridization, introgression and the origin of new hybrid lineages (Avise, 2000; Stewart & Stringer, 2012).
In plants, most phylogeographic studies have used chloroplast DNA (cpDNA) for several reasons (Avise, 2000). First, it is usually inherited maternally (in most angiosperms) or paternally (in some gymnosperms) (Comes & Kadereit, 1998; Soltis et al., 2006). The different cpDNA fragments from a single individual can therefore be treated as a single “locus” without recombination. The chloroplast genome is haploid rather than diploid, and the different alleles (or genotypes) found in different individuals are therefore called haplotypes (or sometimes chlorotypes). Second, because cpDNA is usually transmitted through a maternal pedigree, it has a smaller effective population size than nuclear DNA (Avise, 2000). When gene flow stops between isolated populations, the cpDNA diversity in each isolated population will coalesce to an ancestral haplotype (a process called lineage sorting) at a faster rate than nuclear loci. Genetic differentiation, lineage sorting and phylogeographic structure between geographic populations therefore become more obvious (Avise, 2009). Finally, most nuclear loci may evolve at a lower rate than cpDNA (Avise, 2000). This may lead to less genetic variation at the nuclear loci used for population genetic analyses and a lower rate of lineage sorting. However, recent studies comparing species-specific variation between the nuclear internal transcribed spacer (ITS) and cpDNA have suggested that interspecific lineage sorting at nuclear ITS may be much faster than that at cpDNA in some plant species (Wang et al., 2011a, 2011c). This has also been recorded in intraspecific studies (Wang et al., 2009b). Therefore, if ITS is found to be variable within a single species and/or between closely related species, it may be highly useful to sequence it across all populations, in addition to cpDNA, in phylogeographic studies. This may give more weight to inferences on genetic divergences and introgressions between and within species.
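The effective-population-size argument above can be made concrete with a small calculation. The sketch below is only an illustration under the standard neutral coalescent for a constant-size Wright-Fisher population. The function name `expected_tmrca`, the population size `N` and the sample size are hypothetical choices, not values from any study cited here. It assumes 2N transmitted nuclear gene copies, N chloroplast copies in a hermaphroditic species, and N/2 in a dioecious species with an equal sex ratio and strictly maternal inheritance.

```python
from math import comb


def expected_tmrca(gene_copies: float, sample_size: int) -> float:
    """Expected number of generations for `sample_size` sampled lineages to
    reach their most recent common ancestor in a constant-size, neutral
    Wright-Fisher population with `gene_copies` transmitted gene copies
    (standard coalescent approximation)."""
    if sample_size < 2:
        raise ValueError("need at least two sampled lineages")
    # While i lineages remain, the expected waiting time to the next
    # coalescent event is gene_copies / C(i, 2) generations.
    return sum(gene_copies / comb(i, 2) for i in range(2, sample_size + 1))


N = 1000          # hypothetical number of breeding individuals
SAMPLE_SIZE = 10  # hypothetical number of sampled sequences

# Assumed numbers of transmitted gene copies (see the lead-in paragraph).
t_nuclear = expected_tmrca(2 * N, SAMPLE_SIZE)        # diploid nuclear locus
t_cp_hermaphrodite = expected_tmrca(N, SAMPLE_SIZE)   # cpDNA, hermaphrodites
t_cp_dioecious = expected_tmrca(N / 2, SAMPLE_SIZE)   # cpDNA, females only

print(f"nuclear locus:            {t_nuclear:.0f} generations")
print(f"cpDNA (hermaphroditic):   {t_cp_hermaphrodite:.0f} generations")
print(f"cpDNA (dioecious, 1:1):   {t_cp_dioecious:.0f} generations")
```

Under these assumptions the expected coalescence time scales linearly with the number of transmitted gene copies, so cpDNA lineages in an isolated population are expected to sort two to four times faster than lineages at a nuclear locus. Real populations violate these assumptions (size changes, selection, biparental leakage), so the ratios are indicative only.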
2 Phylogeographic patterns of plants in China
Recent progress in phylogeographic studies of plants in China has been outlined by Qiu et al. (2011). Based on a few selected publications and the 11 papers by Chinese researchers collected in this issue, we aim here to highlight major advances relating to the two major aims of phylogeography: (i) to identify the glacial refugia and postglacial range expansion of a given species; and (ii) to clarify the evolutionary histories of closely related species (Avise, 2000). For the first aim, we divided China into the following four major regions: the Qinghai–Tibetan Plateau (QTP) and southwestern (SW) China, western China, northern and northeastern China, and southern and southeastern China. A species may occur in more than one region, but we assign it to a single region based on its main distribution. The QTP and SW China, which together form one of the world's biodiversity hotspots, have an exceptionally diverse flora (Myers et al., 2000). In the eastern QTP and adjacent SW China, a series of spectacular north–south trending mountains alternate with deep valleys. Because of the highly variable altitudes in this region, numerous species occur there (Wu, 1987). As an important biodiversity cradle, it may harbor ancient species as well as newly originated species (Wu, 1988). Ancient species may have retreated to this region as a refuge since the Miocene, when the global climate began to cool (Wu & Wu, 1996). In addition, the QTP is especially sensitive to climate change, and it is therefore reasonable to assume that more species may have retreated to the eastern QTP and SW China during the more recent Quaternary climatic oscillations (Shi et al., 1998). However, during the interglacial phases or at the end of the LGM, some species may have recolonized the QTP platform and the adjacent high-altitude regions. 
This possibility has been confirmed by several phylogeographic studies, for example, in Juniperus przewalskii (Zhang et al., 2005), Picea crassifolia (Meng et al., 2007), Metagentiana striata (Chen et al., 2008b), Pedicularis longiflora (Yang et al., 2008) and other species (e.g., Yan et al., 2007; Wang et al., 2008, 2011b; Opgenoorth et al., 2009; Cun & Wang, 2010; Sun et al., 2010; Wu et al., 2010; Xu et al., 2010; Li et al., 2011a, 2012a; Yang et al., 2012; Zhang et al., 2012; Zou et al., 2012). However, it remains to be determined when these species recolonized high altitudes, for example, at the end of the LGM or of the largest glaciation in the QTP. Some species may also have survived through the Quaternary at high altitudes (e.g., Wang et al., 2009b; Jia et al., 2011, 2012). Multiple refugia may have persisted for these species, at least during the LGM (e.g., Wang et al., 2009b; Jia et al., 2011, 2012; Zhang et al., 2012). Multiple refugia have likewise been inferred in SW China for the species studied there (e.g., Gao et al., 2007; Yuan et al., 2008; Wang & Guan, 2011). In western China, deserts seem to have promoted allopatric divergence within the species studied, and the diverged populations probably survived in different refugia during the LGM (e.g., Guo et al., 2010; Wang et al., 2011c; An et al., 2012; Li et al., 2012b, 2012d; Zhang & Zhang, 2012).
The evolutionary histories of single species or groups of species in China have recently been investigated using more than one set of molecular markers. Three important findings can be drawn from these studies. First, within some species with a long evolutionary history, deep lineages with distinct geographical distributions were identified (e.g., Wang et al., 2009b; Jia et al., 2011, 2012). These lineages were dated to have originated very early, usually before the LGM. Several studies highlighted the importance of the largest glaciation (between ca. 1.2 Ma and 0.17 Ma) in driving this intraspecific divergence (e.g., Qiu et al., 2009a; Wang et al., 2009b; Jia et al., 2012; Yang et al., 2012). In addition to these climatic oscillations, orogenic processes, for example, the uplift of mountains and the formation of large rivers, may also have led to such intraspecific divergences or to genetic divergences between closely related species (e.g., Wang et al., 2009b; Xu et al., 2010; Jia et al., 2011, 2012; see the review by Qiu et al., 2011). These results are highly consistent with the diversification patterns of a few genera with numerous endemic species occurring there (e.g., Liu et al., 2002, 2006). Second, hybridization and introgression between intraspecific lineages or between closely related species occurred more frequently than expected (e.g., Wang et al., 2009b; Zeng et al., 2011; Jia et al., 2012; Zou et al., 2012). This may have occurred when two lineages (or closely related species) expanded and formed hybrid zones after the end of the glacial ages or during the interglacials (Li et al., 2010; Du et al., 2011; Stewart & Stringer, 2012). Such scenarios also occurred when closely related lineages (belonging to a single species or to morphologically different species) retreated into a single refugium (Stewart et al., 2010). They rarely resulted from long-distance dispersal (Zou et al., 2012). 
In fact, introgression was frequently detected in the hybrid zones of two diverging species with different ecological niche preferences (e.g., Du et al., 2011; Zeng et al., 2011). Finally, adaptive evolution was detected at a few nuclear loci when such genetic markers were examined (e.g., Li et al., 2011b). All these findings suggest that plant diversification and speciation mechanisms can be inferred through phylogeographic analyses based on multiple loci from both cytoplasmic and nuclear DNA.
3 Computational methods for phylogeographic analyses
First, when species trees are constructed from multiple loci and multiple individuals, likelihood or Bayesian methods should be used if the loci have different genealogical histories (Liu et al., 2009). These approaches can estimate divergence times and ancestral population sizes for each locus separately or for multiple loci simultaneously (Rannala & Yang, 2003). A similar tool, *BEAST, developed by Heled & Drummond (2010), estimates the topology and demographic parameters based on the algorithms proposed by Heled & Drummond (2008). Because gene coalescent times always predate species divergence times, minimum coalescent times can be used to reconstruct species trees approximately (Kubatko et al., 2009). Second, new tools have been developed and widely used to measure gene flow between diverging species (Nielsen & Wakeley, 2001; Pinho & Hey, 2010). For example, the isolation with migration (IM) model and the corresponding calculations can be extended to multilocus data (Hey & Nielsen, 2004) and to multiple populations (Hey & Nielsen, 2007; Hey, 2010; Choi & Hey, 2011). In contrast to the likelihood implementation, Wegmann & Excoffier (2010) presented a novel attempt to infer the demographic history described by the IM model parameters using the approximate Bayesian computation (ABC) procedure. This likelihood-free approach is more effective, although the models also become more complex (Beaumont, 2010). In addition, ABC inferences should be validated, for example by checking the fit of the distributions of multiple summary statistics or by other methods (Wegmann et al., 2010), in order to obtain reliable results. Finally, spatiotemporal reconstructions and tests of alternative hypotheses about the dynamic history of a species based on allele distributions can be carried out with newly developed approaches (e.g., Maddison & Maddison, 2008; Lemey et al., 2009, 2010). 
In particular, both discrete and continuous diffusion models can be implemented in a Bayesian inference framework. In contrast to parsimony methods, these methods account for phylogenetic uncertainty, Markov model parameter uncertainty and uncertainty in the dispersal process. The multivariate Brownian diffusion model and the relaxed random walk model were both implemented and compared in the BEAST software package (Drummond et al., 2012). A Bayes factor test based on ‘Bayesian stochastic search variable selection’ was used to identify parsimonious phylogeographic processes.
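The likelihood-free ABC procedure mentioned earlier in this section can be illustrated with a minimal rejection sampler. The sketch below is a toy example under stated assumptions, not the IM-model implementation of Wegmann & Excoffier (2010). It infers a single parameter, the scaled mutation rate theta, from a single summary statistic, the number of segregating sites S, under a constant-size neutral coalescent with infinite sites. All function names, the uniform prior, the tolerance and the "observed" value S = 15 are hypothetical.

```python
import random


def poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(theta: float, sample_size: int,
                               rng: random.Random) -> int:
    """Simulate S for one locus under the standard neutral coalescent.

    While i lineages remain, the waiting time is exponential with rate
    C(i, 2); mutations fall on the tree at rate theta / 2 per unit length.
    """
    tree_length = sum(i * rng.expovariate(i * (i - 1) / 2)
                      for i in range(2, sample_size + 1))
    return poisson(theta / 2 * tree_length, rng)


def abc_rejection(observed_s: int, sample_size: int, n_draws: int,
                  tolerance: int, prior_max: float, seed: int) -> list:
    """Keep prior draws of theta whose simulated S is close to the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, prior_max)          # assumed uniform prior
        simulated_s = simulate_segregating_sites(theta, sample_size, rng)
        if abs(simulated_s - observed_s) <= tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical data: 15 segregating sites in a sample of 20 sequences.
accepted = abc_rejection(observed_s=15, sample_size=20, n_draws=20000,
                         tolerance=1, prior_max=20.0, seed=1)
posterior_mean = sum(accepted) / len(accepted) if accepted else float("nan")
print(f"accepted {len(accepted)} of 20000 draws; "
      f"approximate posterior mean of theta: {posterior_mean:.2f}")
```

The accepted values approximate the posterior distribution of theta. Practical ABC analyses use several summary statistics, more complex demographic models and post-sampling adjustment, and, as noted above, the resulting inferences still need to be validated.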
4 Directions in the future
Here we provide three basic guidelines for future phylogeographic studies (Fig. 2). First, it is better to choose one genus or a group of closely related species as a starting point. Such taxonomic units are usually monophyletic. After DNA barcoding of the recognized species, using commonly used DNA fragments and multiple accessions to delimit species, one ‘monophyletic’ species or one monophyletic group of more than one species should be chosen for further phylogeographic study, with as many populations as possible sampled across the entire distribution range. Second, as stated above, most previous phylogeographic studies have been based exclusively on cytoplasmic DNA. In the future, as many nuclear loci as possible (Heuertz et al., 2006), especially single-copy loci, should be used in such studies. Finally, as pointed out by Knowles (2009), hypothesis testing and parameter estimation should be incorporated into phylogeographic studies (Nielsen & Beaumont, 2009; Bloomquist et al., 2010; Crisp et al., 2010; Qiu et al., 2011). These integrated approaches will not only help resolve most phylogeographic questions, but also provide basic knowledge of a local flora. In particular, they will set the basic framework for studying speciation between closely related species and for utilizing the useful genetic resources arising during their adaptive divergence.
This research was supported by grants from the National Natural Science Foundation of China (40972018) and the Ministry of Science and Technology of China (2010DFB63500).