Pleistocene climate change has had an important effect in shaping intraspecific genetic variation in many species; however, its role in driving speciation is less clear. We examined the possibility of a Pleistocene origin of the only two representatives of the genus Pugionium (Brassicaceae), Pugionium cornutum and Pugionium dolabratum, which occupy different desert habitats in northwest China.
We surveyed sequence variation for internal transcribed spacer (ITS), three chloroplast (cp) DNA fragments, and eight low-copy nuclear genes among individuals sampled from 11 populations of each species across their geographic ranges.
One ITS mutation distinguished the two species, whereas mutations in cpDNA and the eight low-copy nuclear gene sequences were not species-specific. Although interspecific divergence varied greatly among nuclear gene sequences, in each case divergence was estimated to have occurred within the Pleistocene when deserts expanded in northwest China.
Our findings point to the importance of Pleistocene climate change, in this case an increase in aridity, as a cause of speciation in Pugionium as a result of divergence in different habitats that formed in association with the expansion of deserts in China.
There is considerable evidence for Pleistocene climate change having played an important role in shaping geographical patterns of intraspecific genetic diversity (Hewitt, 1996, 2000, 2004). Less clear, however, is its importance in causing speciation (Bennett, 2004; Barnosky, 2005; Futuyma, 2010). During Pleistocene glaciations, species at high to mid-latitudes were affected by the spread of large ice sheets, while at lower latitudes they were subject to increased aridity and lower temperatures (Willis & Niklas, 2004). As a consequence, the geographical distributions of many species at all latitudes became fragmented, thus promoting conditions for allopatric divergence among isolated populations and possibly speciation. However, the duration of glacial periods may have been insufficient for speciation to occur, and stasis, rather than speciation, may have been the more likely outcome of repeated Pleistocene cycles of range fragmentation and reunification (Bennett, 2004). Indeed, for many organisms, morphological stasis throughout the Pleistocene is apparent from the fossil record (Bennett, 2004; Willis & Niklas, 2004). In addition, there is evidence that rates of extinction were greater than rates of speciation during the Pleistocene (Kadereit et al., 2004). Nevertheless, molecular studies have revealed a growing number of examples of Pleistocene speciation in diverse plant groups, including both herbs and trees (Comes & Abbott, 2001; Comes & Kadereit, 2003; Brochmann & Brysting, 2008; Martin-Bravo et al., 2010; Levsen et al., 2012), thus raising the question as to what conditions might have promoted speciation during the period. Clearly, more studies of Pleistocene speciation are required before a deeper understanding is obtained of how Pleistocene climate change may have promoted the origin of new species.
In the past decade, DNA sequences from multiple loci have been widely used to characterize divergence between closely related species (Sweigart & Willis, 2003; Zhou et al., 2007, 2008a,b; Niemiller et al., 2008; Nadachowska & Babik, 2009; Zheng & Ge, 2010; Strasburg & Rieseberg, 2011). The molecular-clock timescales based on such interspecific divergence may not be as precise as geological timescales, but are good enough for the generation of a temporal hierarchy (Hurka et al., 2012; Ikeda et al., 2012; Levsen et al., 2012). In the study reported here, we use a molecular approach to assess the potential influence of Pleistocene climate change on plant speciation in climate-sensitive deserts. Increased aridity throughout the duration of the Pleistocene may have accelerated desertification in different parts of the world, as was the case in central Asia (Höermann & Süssenberger, 1986; Yang, 2006), and this may have created new desert habitats triggering speciation in response. Here we present the first evidence of a Pleistocene origin of two desert plant species in central Asia. The two species concerned, Pugionium dolabratum and Pugionium cornutum (Brassicaceae), are the only two known species of Pugionium, and are distributed in the Mu Us and Kubuqi deserts of northwest China (Zhao, 1999). They differ greatly in their habitat, morphology and growth form (Illarionova, 1999; Yu et al., 2010; Supporting Information, Fig. S1) with P. dolabratum occurring on fixed or semifixed sandy land in desert steppe or the desert fringe, whereas P. cornutum is found on mobile dunes where its burial might be prevented by an ability to produce a long stem, that is > 1.5 m in length (Fig. S1, Yu et al., 2010). Leaf lobes are narrower in P. dolabratum and all mature individuals have short basal radial branches and virtually no stem. The silicles (fruits) of P. cornutum have narrow wings and an acuminate apex, while in P. dolabratum they have wider wings and an obtuse apex. 
Both species are outcrossing, but exhibit different flowering times with daily blooming peaks separated by 1–2 h (Huang et al., 2009). These two psammophytes have distinct morphological characters and prefer different habitats, providing excellent material for examining speciation caused by Pleistocene climatic change.
Despite their marked morphological and ecological differences, a previous preliminary study based on a sample of five individuals from each species indicated that both species were very similar for chloroplast (cp) DNA, internal transcribed spacer (ITS), and five low-copy nuclear gene sequences (Wang et al., 2011). Thus, no species-specific variation was resolved for cpDNA sequence, only a single ITS mutation distinguished the two species, and for only one of five low-copy nuclear sequences were individuals of each species grouped into distinct clades corresponding to the different species (Wang et al., 2011). This high degree of molecular similarity suggests that the two species diverged very recently, probably in response to the development of large moving dunes and extension of desert in northwest China (Yang, 2006). In the present study, we compared sequence variation at eight unlinked low-copy nuclear genes, for ITS, and for three cpDNA fragments, across a greater number of individuals and populations of both species. Our aim was to determine the amount of sequence divergence based on a more thorough sampling of each species, to quantify the changes of effective population sizes and the amount of gene flow during species formation between P. dolabratum and P. cornutum, and finally to establish by means of coalescence-based analyses whether divergence times between species estimated from different datasets were all placed firmly within the Pleistocene.
Material and Methods
Leaves were collected from two to four individuals from each of 11 P. cornutum (L.) Gaertn. and 11 P. dolabratum Maxim. populations, covering almost the entire geographic range of both species (Fig. 1, Table S1; a total of 44 individuals for the two species). The distributions of the two species overlap in the Mu Us and Kubuqi deserts of northwest China, but are not sympatric as a result of their occupation of different habitats (Illarionova, 1999; Zhao, 1999). In two parts of their range, the species occurred in relatively close proximity to each other. Thus populations four and five of P. cornutum occurred in close proximity to populations 13 and 15 of P. dolabratum, respectively (Fig. 1). During sampling, the total number of individuals recorded for P. dolabratum (> 300 individuals) was several times larger than that for P. cornutum (< 100 individuals). We also sampled one accession of Megacarpaea delavayi to serve as outgroup because this genus is sister to Pugionium (Yu et al., 2010). Fully developed leaves collected for DNA extraction were rapidly dried and preserved in silica gel.
Total genomic DNA was extracted from c. 20 mg of leaf tissue using a modified 2 × cetyltrimethyl ammonium bromide (CTAB) extraction protocol (Doyle & Doyle, 1987). We followed our previous study (Wang et al., 2011) in sequencing three chloroplast (cp) DNA fragments (rbcL, trnH–psbA and matK) and the entire ITS region of nrDNA (i.e. ITS1 + 5.8S + ITS2) in all individuals sampled. In addition, eight low-copy nuclear genes (Det1, Cop1, Cip7, Chs, Rps2, Dpa1, Pgic1, MPS33) were sequenced in all individuals (Table S2). Five of these (Det1, Cop1, Chs, Rps2, Pgic1) were examined previously by Wang et al. (2011). The low-copy nuclear genes can be classified into three functional categories: light regulation-related (Det1, Cop1 and Cip7), defense-related (Chs, Rps2 and Dpa1), and other or unknown function (Pgic1 and MPS33). For sequencing of these genes, we employed primers designed by others for use in the closely related species, Arabidopsis thaliana, Boechera fecunda and Brassica oleracea (Caicedo et al., 1999; Kuittinen et al., 2002; Song & Mitchell-Olds, 2007; Table S2).
PCR, cloning, and sequencing
Polymerase chain reaction for amplifying all cpDNA and nuclear sequences was conducted in a similar way to that described by Wang et al. (2011). Sequencing reactions were performed using an ABI Prism Bigdye Terminator Cycle Version 3.1 Sequencing Kit and an ABI 3130xl or 3730xl Genetic Analyzer (PE Applied Biosystems, Foster City, CA, USA).
Purified amplified products of low-copy nuclear genes were initially directly sequenced. Products were also ligated into pGEM T-easy vector (TaKaRa, Dalian, China) and sequenced to yield five to 10 clones per gene per individual. Because Taq polymerase and cloning may introduce errors or artificial mutations, we compared the sequenced clones with the directly sequenced fragments and excluded polymorphisms attributable to PCR or cloning artifacts (Palumbi & Baker, 1994; Eyre-Walker et al., 1998; Hilton & Gaut, 1998). In this way we were able to classify individuals as homozygous or heterozygous based on allelic composition at each locus. For each locus, the 88 sequences obtained across all 44 individuals were aligned using the default parameters of the CLUSTAL X 1.81 program (Thompson et al., 1997) and MEGA version 5.0 (Tamura et al., 2011), with further manual refinements. All sequences have been deposited in GenBank, and their accession numbers are KC823275-KC823603, HQ832505-HQ832509, and HQ832518-HQ832563.
We used DnaSP v. 5.00.04 to estimate nucleotide polymorphism at all sites in a sequence (Librado & Rozas, 2009). For each low-copy nuclear gene, we calculated the number of segregating sites (S), the number of haplotypes (Nh), haplotype diversity (He) and two parameters: π, the average number of nucleotide differences per site between two sequences in a sample (Nei, 1987); and θw (= 4Neμ, where Ne denotes effective population size and μ is the mutation rate per site per generation) from segregating sites (Watterson, 1975). We further estimated polymorphism levels in the ancestral species using the program BP&P, which assumes no gene flow between two current species (Rannala & Yang, 2003; Yang & Rannala, 2010). We assessed the minimum number of recombination events using Hudson & Kaplan's (1985) four-gamete test, and tested whether each locus fitted a neutral model using Tajima's D (Tajima, 1989), Fu and Li's D* and F* statistics (Fu & Li, 1993), and the Hudson–Kreitman–Aguadé multilocus test (HKA, Hudson et al., 1987). Sequences obtained from Megacarpaea delavayi were used as outgroup in HKA tests. We further used the recently developed maximum frequency of derived mutations (MFDM) test (Li, 2011) to assess the fit of the data to the neutral equilibrium model. In addition, we identified outlier SNPs that deviate from neutral expectations in a manner consistent with either divergent or balancing selection using the program BayeScan 2.1 (Foll & Gaggiotti, 2008). To do this, we conducted 20 pilot runs of 50 000 iterations with an additional burn-in of 500 000 iterations and a thinning interval of 20. Other parameters were set at the default values. Because recent studies suggest that BayeScan provides a conservative estimate of outlier loci (Buckley et al., 2012; Huang et al., 2012), loci with False Discovery Rate (FDR) q < 0.05 were considered to be outliers in this analysis.
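As a minimal sketch of the two diversity estimators defined above, the following Python code computes π (mean pairwise differences per site) and Watterson's θw per site from an alignment. It is not the DnaSP implementation; the function names and the toy alignment are hypothetical, and columns with gaps or missing data are assumed to have been removed beforehand.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns showing more than one nucleotide state (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """pi: average number of pairwise differences per site (Nei, 1987)."""
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / len(pairs) / len(seqs[0])


def watterson_theta(seqs):
    """theta_w per site: S divided by a1 = sum_{i=1}^{n-1} 1/i and by length."""
    n = len(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / len(seqs[0])


# Hypothetical toy alignment (three sequences, four sites), for illustration only.
toy = ["ACGT", "ACGA", "ACTA"]
pi_toy = nucleotide_diversity(toy)
theta_toy = watterson_theta(toy)
```

Under neutrality and constant population size both quantities estimate the same parameter, which is why their difference forms the basis of statistics such as Tajima's D.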
Genetic differentiation between the two Pugionium species at each low-copy nuclear locus was assessed in two ways. First, an analog of Wright's fixation index, Fst (Excoffier et al., 1992), was estimated for each locus using AMOVA implemented in Arlequin v. 3.1.1 (Excoffier et al., 2005) with significance tested using 10 000 permutations as described in Excoffier et al. (1992). Secondly, STRUCTURE ver. 2.3 (Hubisz et al., 2009) was used to assess population structure using the admixture model with the assumption of correlated allele frequencies among clusters. This method is highly effective in detecting introgressed individuals between distinct groups owing to historical gene flow resulting from hybridization (Ostrowski et al., 2006). To estimate the number of clusters (K), values of K from 1 to 10 were explored using 20 independent runs per K. Burn-in was set to at least 50 000 followed by 500 000 iterations. The most likely number of K was estimated using the original method from Pritchard et al. (2000), and also the ΔK statistic described in Evanno et al. (2005).
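The ΔK criterion of Evanno et al. (2005) can be sketched as follows. This version works from the mean ln P(D) across replicate runs at each K, which is one common way of applying the formula and is an assumption here rather than a description of the software actually used. The function name and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to the list of ln P(D) values from replicate runs.
    delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            delta[k] = second_diff / stdev(lnp_by_k[k])
    return delta


# Hypothetical ln P(D) values from two replicate runs at K = 1..3.
example = {1: [-1000.0, -1002.0], 2: [-800.0, -802.0], 3: [-790.0, -794.0]}
delta_k = evanno_delta_k(example)
```

Because the statistic needs both neighbouring values of K, it cannot be evaluated at the smallest or largest K explored, which is one reason to inspect the raw ln P(D) values as well.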
We used TCS to construct relationships between all recovered haplotypes at each locus (Clement et al., 2000). This program constructs haplotype networks by implementing the statistical parsimony algorithm described by Templeton et al. (1992). Its implementation followed the default parsimony connection limit of 95%, with gaps coded as fifth state and larger indels coded as binary characters, accounting for single mutation events. We also constructed genealogical trees of haplotypes with MEGA5 (Tamura et al., 2011), using the neighbor-joining (NJ) method on Kimura's two-parameter distances (Kimura, 1980). Bootstrap values were estimated to assess the relative support for relationships between haplotypes (1000 replicates; Felsenstein, 1985). These analyses included Megacarpaea delavayi as outgroup.
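For readers unfamiliar with the distance measure underlying the NJ trees, the sketch below computes Kimura's two-parameter distance for one pair of aligned sequences. It is an illustration only, not the MEGA implementation; the function name is hypothetical, and positions carrying gaps or ambiguity codes are simply skipped (an assumption about missing-data handling).

```python
import math

TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    # Compare only positions where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    transitions = sum(1 for a, b in pairs if frozenset((a, b)) in TRANSITIONS)
    transversions = sum(1 for a, b in pairs
                        if a != b and frozenset((a, b)) not in TRANSITIONS)
    p = transitions / n  # proportion of transitional differences
    q = transversions / n  # proportion of transversional differences
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

The correction treats transitions and transversions separately, so for closely related haplotypes such as those studied here the distance stays close to the raw proportion of differing sites.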
Migration rate, effective population size, and divergence time were calculated based on the isolation with migration (IM) model using the IMa2 program (Wakeley & Hey, 1997; Nielsen & Wakeley, 2001; Hey & Nielsen, 2004, 2007; Hey, 2010a,b). The fit of data to simple demographic models was tested using the nested model approach in ‘Load-Tree’ mode (Hey & Nielsen, 2007). This calculates log-likelihood ratio statistics for different possible nested models, the significance of which can be assessed using a χ2 test. We tested five different divergence models by setting different gene flow parameters (m1, m2). Posterior probability densities for model parameters were assessed using Markov Chain Monte Carlo (MCMC) methods (Hey & Nielsen, 2004, 2007). The IM model assumed no selection, no recombination within loci, lack of substantial population structure, and random mating in ancestral and descendent populations (Hey & Nielsen, 2004, 2007). It is difficult to satisfy all of these assumptions in an empirical study; however, recent studies showed that parameter estimates of the IM model were robust to even high levels of population structure and to recombination (Carstens & Knowles, 2007; Strasburg & Rieseberg, 2010). We used the program IMgc (Woerner et al., 2007) to obtain the longest region without four gametic types for each locus. We also constructed two datasets for IM analyses: one included only genes that meet the expectations of the neutral model and another dataset that contained all genes. The length of the longest nonrecombining block ranged from 205 bp (Pgic1) to 615 bp (Det1), with an average length of c. 423 bp. We began with multiple runs of 10 000 steps (following 100 000 iterations as burn-in) to assess mixing and to fine-tune the parameter space. We then conducted the simulation for a burn-in of one million generations and five million steps under the HKY model of sequence evolution. 
Three independent runs were performed with different seed numbers to check convergence of samples (Hey & Nielsen, 2004; Won & Hey, 2005). We also checked the mixing properties of MCMC by monitoring effective sample size (ESS) values, trend-line plots of the parameter, and swapping rates between chains. When independent runs produced similar posterior distributions, well-mixed runs were repeated to get reproducible results.
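The four-gamete criterion that underlies both the recombination estimate and the choice of nonrecombining blocks can be illustrated with a short sketch. It only flags pairs of biallelic sites at which all four two-site haplotypes occur; it does not reproduce the algorithms of IMgc or DnaSP, and the function name is hypothetical.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four two-site gametes occur."""
    n_sites = len(haplotypes[0])
    # Keep only biallelic columns; the test is defined for two-state sites.
    biallelic = [i for i in range(n_sites)
                 if len({h[i] for h in haplotypes}) == 2]
    violations = []
    for i, j in combinations(biallelic, 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations
```

Under an infinite-sites model without recurrent mutation, a flagged pair implies at least one recombination event between the two sites, so a block containing no flagged pair is compatible with the no-recombination assumption of the IM model.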
IMa estimates are quite stable under moderate violations of the IM model assumptions, but the choice of mutation rate has a significant impact on estimates of almost all parameters (Strasburg & Rieseberg, 2010), because all demographic parameters are scaled using the geometric mean of the mutation rate per locus yr–1. Therefore, we used the method of Ikeda et al. (2009) to estimate the mutation rate (μ) at silent sites for each locus. The value of μ for each nuclear gene was estimated from the ratio of the average total genetic divergence to the average synonymous genetic divergence (KTotal/KS). We calculated the mutation rate according to the formula μ = μCHS × (KTotal/KS) × L, where L is the length of the locus and μCHS is the substitution rate per synonymous site yr–1 of the CHS gene in Brassicaceae, estimated to be 1.5 × 10−8 substitutions per site yr–1 (Koch et al., 2000); the geometric average of KTotal/KS over all loci was 0.6662. Consequently, the geometric mean, 8.3 × 10−6 substitutions per locus yr–1, was used to scale the demographic parameters from IMa2.
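The rate calculation described above can be sketched in a few lines. The CHS rate is the value cited in the text; the per-locus ratios and block lengths in the example are hypothetical placeholders rather than the study's values, and the function names are ours.

```python
import math

MU_CHS = 1.5e-8  # synonymous substitutions per site per year (Koch et al., 2000)


def locus_mutation_rate(k_total_over_ks, length_bp, mu_chs=MU_CHS):
    """Per-locus rate: mu = mu_CHS * (K_Total / K_S) * L."""
    return mu_chs * k_total_over_ks * length_bp


def geometric_mean(values):
    """Geometric mean, computed on the log scale for numerical stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical ratios and block lengths for three loci (not the study's values).
hypothetical_loci = [(0.50, 400), (0.70, 600), (0.80, 500)]
rates = [locus_mutation_rate(r, length) for r, length in hypothetical_loci]
scaling_rate = geometric_mean(rates)
```

A geometric rather than arithmetic mean is used because the per-locus rates enter the IM model as multiplicative scalars.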
In addition, we estimated divergence time based on ITS variation between the two species. This divergence was calculated as the average DNA sequence distance divided by twice the sequence mutation rate (μ), where μ was assumed to be 5.0–10.0 × 10−9 per site yr–1 from other genera of the same family (Koch & Al-Shehbaz, 2002).
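The ITS calculation is a one-line formula, T = d / (2μ), sketched below for a range of rates. The rate bounds are those given in the text; the distance of 0.0016 substitutions per site is an illustrative assumption, not a value reported here, and the function names are hypothetical.

```python
def divergence_time(distance, mu_per_site_per_year):
    """T = d / (2 * mu): both lineages accumulate substitutions after the split."""
    return distance / (2.0 * mu_per_site_per_year)


def divergence_time_range(distance, mu_low, mu_high):
    """Return (youngest, oldest) divergence times for a range of rates."""
    return (divergence_time(distance, mu_high),
            divergence_time(distance, mu_low))


# Hypothetical distance of 0.0016 substitutions per site, for illustration only.
youngest, oldest = divergence_time_range(0.0016, 5.0e-9, 10.0e-9)
```

With this assumed distance the two bounds come out at roughly 80 000 and 160 000 yr; the faster rate gives the younger date because fewer years are needed to accumulate the same distance.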
ITS and cpDNA variation
As expected from our previous preliminary analysis of cpDNA sequence variation in Pugionium (Wang et al., 2011), we found no rbcL and trnH–psbA sequence variation among individuals examined in the current study. One nucleotide substitution was identified for matK, but the two haplotypes produced were shared by both species throughout their respective distributions. As also expected, one nucleotide substitution in the nuclear ITS fragment distinguished the two species, and based on the genetic distance between species for ITS and the substitution rate, μ = 5.0–10.0 × 10−9 per site yr–1, divergence between the two species was estimated to have occurred between 80 000 and 160 000 yr ago.
Nucleotide diversity and standard neutrality tests
Two sequences per individual were obtained for each of the eight low-copy nuclear genes. The total length of aligned sequences was 7008 bp, with amplified fragments ranging in length from 580 to 1367 bp. The number of insertion–deletion (indel) polymorphisms ranged from 0 to 2 across loci, with a total of five indels identified. All indels were excluded from subsequent analyses.
Species-wide amounts of silent nucleotide variation (πsil) were highly variable among loci, ranging from 0.00013 (MPS33) to 0.01294 (Rps2) in P. cornutum and from 0.00182 (Chs) to 0.01299 (Rps2) in P. dolabratum (Table 1). Average silent nucleotide variation was slightly higher in P. dolabratum (πsil = 0.00739, θsil = 0.00650) than in P. cornutum (πsil = 0.00644, θsil = 0.00565). Estimated ancestral polymorphism (θA = 0.00879) was shown to be higher than current nucleotide diversities within each species using the program BP&P. The minimum number of recombination events (Rm) ranged from one to 18 in P. cornutum and from three to 15 in P. dolabratum, with average estimates being similar in both species across loci (Table 1).
Table 1. Nucleotide polymorphism, haplotype diversity and neutrality tests within Pugionium cornutum and Pugionium dolabratum
N, total number of sequences; L, length in base pairs; S, number of segregating sites (number of singletons); π, nucleotide diversity (Nei & Li, 1979); θ, Watterson's parameter (Watterson, 1975); Nh, number of haplotypes; He, Nei's haplotypic diversity; D and D* ⁄ F*, Tajima's D (Tajima, 1989) and Fu & Li's D* and F* (Fu & Li, 1993).
Rm, estimate of the minimum number of recombination events (Hudson & Kaplan, 1985).
na, not applicable. Significance level: *, 0.01 ≤ P < 0.05; **, 0.001 ≤ P < 0.01; ***, P < 0.001.
Values of Tajima's D and Fu and Li's D* and F* varied greatly across the eight loci, with certain values being significantly different from neutral expectation at Det1 (D* = 1.54856, P < 0.05) and Pgic1 (D* = 1.80967, P < 0.01) in P. cornutum and Rps2 (D* = 1.41100, P < 0.01) in P. dolabratum (Table 1). Given that the HKA test statistic for closely related species is not expected to follow the χ2 distribution (Machado et al., 2002), we repeated HKA tests for P. dolabratum and P. cornutum separately using one sequence of M. delavayi as outgroup. We compared the test statistics with a distribution generated from 10 000 coalescent simulations (Hilton et al., 1994) and did not detect significant values for either contrast (P. cornutum/M. delavayi, χ2 = 6.28, P = 0.507; P. dolabratum/M. delavayi, χ2 = 4.04, P = 0.775). However, BayeScan outlier tests detected some SNPs that had been subject to divergent selection between the two species, with most of them located within the Det1 and MPS33 genes (Fig. S3).
The MFDM test is a method recently developed by Li (2011) to detect recent positive selection based on the topology of a coalescent tree: when a selection event occurs, it leads to an unbalanced tree close to the selected site. The test is free from the confounding effects of demographic history. It should be noted that interspecific introgression may distort this test (Li, 2011) and therefore most individuals in populations 5 and 10, which may have been subject to introgression (column height > 50%) according to the results of STRUCTURE (Fig. S4), were excluded from analysis.
The MFDM test indicated that there was a significant probability (P < 0.05) of selection having occurred at MPS33 (P = 0.03571) and Det1 (P = 0.03571) in P. cornutum (Fig. S2). As migration may also cause unbalanced trees, we used a migration detector (MD) to analyze this possibility (Li, 2011). For each locus in each species, we arbitrarily picked one individual from the other species for the MD analyses. These analyses indicated that migration was not responsible for unbalanced trees.
Population structure and genetic differentiation
The STRUCTURE analysis indicated that the most likely number of clusters across all individuals was K =2 (Fig. S4). However, this division did not distinguish the two species completely, with admixed individuals present in some populations, especially two of P. cornutum (5 and 10; note that population 5 of P. cornutum is geographically close to population 15 of P. dolabratum) (Fig. 1). Genetic divergence (Fst) between and within species varied greatly across loci (Table S3) and increased when admixed individuals were excluded. AMOVA showed that variation between species varied across loci (Table 2), as did the partition of genetic variance within and between populations in each species. When admixed individuals from the overlapping distributions were excluded, the amount of interspecific variation increased (Table 2).
Table 2. AMOVA analyses of genetic partitions between and within Pugionium cornutum and Pugionium dolabratum
Significance level: *, 0.01 ≤ P < 0.05; **, 0.001 ≤ P < 0.01; ***, P < 0.001.
Genealogy of haplotypes at each low-copy nuclear locus
Based on topology and haplotype frequency, the genealogies constructed for each low-copy nuclear locus could be grouped into two types (Figs 2, S5). Haplotype networks of one type were composed of approximately two clusters with most haplotypes being species-specific (Det1 and MPS33). By contrast, networks for the other loci (Chs, Cop1, Dpa1, Cip7, Pgic1, and Rps2) showed no obvious divergence between the two species.
Interspecific divergence and gene flow examined by Isolation-with-migration (IM) simulations
To examine whether genes that deviated from the neutral model would affect the outcome of IM analysis, we constructed two datasets (one including all loci and the other excluding Det1 and MPS33) for IM analyses. Simulations were repeated with the IMa2 program to reveal unambiguous marginal posterior probability distributions of the demographic parameters for each species. All parameters were converted by scaling for effective population size or years based on the average mutation rate across all eight loci (Table S4; Fig. 3).
Simulation analyses of the two datasets showed that the effective population size of P. dolabratum was larger than that for P. cornutum. Using the first dataset, the effective population size for P. cornutum was c. 33 704 individuals (95% highest posterior density (HPD) interval: c. 21 084–52 771 individuals), while for P. dolabratum it was c. 47 560 individuals (95% HPD interval: c. 32 198–68 584 individuals). The marginal posterior densities of the divergence parameter, t, showed a sharp peak at 1.210, which converted into a divergence time of c. 145 783 yr (95% HPD interval: 102 048–227 590 yr). For the second dataset, the divergence parameter peak decreased to 1.075 (c. 129 510 yr; 95% HPD interval: c. 89 759–219 156 yr). The hypothesis of no gene flow between species was rejected in analyses of both datasets (Fig. 3; Tables 3, S4), and gene flow from P. dolabratum to P. cornutum was higher than in the reverse direction. Gene flow was estimated to be 2.136 from P. dolabratum to P. cornutum, and 0.809 from P. cornutum to P. dolabratum based on the second dataset. Thus the amount of asymmetric migration between these two species corresponded well with their effective population sizes, with the species having the smaller effective size, P. cornutum, being subject to a higher amount of introgression. In addition, the current effective population sizes of P. dolabratum and P. cornutum were larger than that of the ancestral species, which indicates a marked demographic expansion (Fig. 3).
Table 3. Tests of nested models for divergence between Pugionium cornutum and Pugionium dolabratum
Although the exact values varied slightly between the two different datasets used in the IM analysis, the findings were generally similar for both. First, from both datasets, demographic expansion was detected in P. dolabratum and P. cornutum when compared with the effective population size of the ancestral species. Secondly, interspecific divergence was estimated to have occurred c. 130 000–146 000 yr ago. Finally, there was asymmetric migration between the two species. It is worth noting, however, that the values of gene flow estimated using the first dataset (i.e. 2Nm = 1.141 from P. dolabratum to P. cornutum, and 2Nm = 0.234 from P. cornutum to P. dolabratum) were lower than those estimated from the second dataset (2Nm = 2.136 from P. dolabratum to P. cornutum, and 2Nm = 0.809 from P. cornutum to P. dolabratum). Thus, inclusion of the two nonneutral loci, Det1 and MPS33, in the first dataset may have had the effect of reducing the amount of gene flow determined by the IM simulation.
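The conversion of the mutation-scaled divergence parameter t into years can be checked with a short sketch, assuming the usual IM scaling t = T × μ with μ the geometric mean rate per locus yr–1 given in the Methods. The function name is hypothetical; the input values are those reported in the text.

```python
MU_PER_LOCUS = 8.3e-6  # geometric mean substitution rate per locus per year


def scaled_time_to_years(t_scaled, mu=MU_PER_LOCUS):
    """Convert the mutation-scaled divergence parameter t (= T * mu) to years."""
    return t_scaled / mu


years_all_loci = scaled_time_to_years(1.210)
years_neutral_loci = scaled_time_to_years(1.075)
```

The first value reproduces the reported c. 145 783 yr. The second comes out within a few years of the reported c. 129 510 yr, a small difference that presumably reflects rounding of the published rate.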
Our results provide strong evidence that two desert, sister plant species, P. cornutum and P. dolabratum, originated in northern China during the Pleistocene. Coalescence analyses conducted on sequence variation of low-copy nuclear genes, as well as sequence divergence of ITS, were consistent in showing that the two species of Pugionium, which differ greatly in growth form, morphology and habitat, diverged within the Pleistocene, that is, at a time when deserts expanded greatly in northern and northwest China (Höermann & Süssenberger, 1986; Yang, 2006). Thus, Pleistocene climate change, resulting in increased aridity, may have promoted speciation within the genus Pugionium in response to diversification of desert habitats during this period. Our analysis also showed that divergence between the two species at the molecular level differed according to the DNA sequences examined. Thus, two nuclear genes, Det1 and MPS33, which exhibited signatures of selection, showed greater interspecific divergence than other nuclear and cpDNA sequences investigated.
Interspecific divergence across DNA sequences and selection detection
Although the two Pugionium species differed in ITS sequence, as a result of a single nucleotide substitution, they shared alleles/haplotypes at the eight low-copy nuclear loci investigated and also for cpDNA. Thus, ITS seems to have undergone faster interspecific differentiation relative to the other sequences studied (Tables 2, S3, Fig. 1), which supports the recent suggestion that ITS is more effective for barcoding recently diverged plant species (China Plant BOL Group, 2011). A higher interspecific divergence and lineage sorting at ITS between closely related angiosperm species, and the increased power of ITS to distinguish between such species relative to cpDNA sequence variation, has been associated with the biparental inheritance of this sequence and its dispersal through seed and pollen rather than through seed alone, as is the case for maternally inherited cpDNA (Hollingsworth et al., 2011; Wang et al., 2011). However, the same explanation would not account for the lower interspecific divergence observed overall at the low-copy nuclear loci investigated, which also exhibit biparental inheritance.
Whereas interspecific divergence was absent or low at six of the eight low-copy nuclear loci investigated (i.e. Chs, Dpa1, Rps2, Pgic1, Cip7, Cop1), it was higher at the remaining two loci, Det1 and MPS33 (Table 2). For each of these latter two loci, two largely species-specific groups of haplotypes were recognized (Fig. 2). The higher interspecific divergence exhibited for Det1 and MPS33 might result from these genes being subject to past divergent selection. Indeed, both MFDM tests and outlier analysis indicated that Det1 and MPS33 might have been subject to selection (Figs S2, S3). Det1 is an essential negative regulator of plant light responses and plays a key role in controlling the expression of light-regulated genes, apical dominance, seed germination, chloroplast development and chalcone synthase (Chory & Peto, 1990; Pepper et al., 1994; Kuittinen et al., 2002; Schroeder et al., 2002). The intensity of light reflectance is different between plants growing on mobile and fixed dunes. P. cornutum, which grows on mobile dunes, might have experienced stronger light stress, which would be consistent with the increased divergence at the Det1 locus in this species. Although the function of MPS33 has not been clearly documented, it is possible that the relatively high interspecific divergence found at both loci reflects selection for different alleles during adaptation of the two Pugionium species to their respective desert habitats.
Adaptation to different desert habitats is implied in the origin of the two species by the nature of the morphological and growth form differences between them. For example, the development of a long stem in P. cornutum possibly helps prevent plant burial on mobile dunes. However, as already pointed out, tests conducted on the low-copy nuclear genes examined did not reveal complete divergence for all loci (Fig. S3). Thus, based on current evidence, there is a marked contrast between the molecular and morphological divergence of the two species. A future comparative analysis of the genomes of these two species would reveal which parts of the genome are most divergent and likely, therefore, to control differences such as morphology and physiology that are important in the adaptation of the species to their respective habitats, and that were possibly important in their origin.
Based on the analyses of two low-copy nuclear gene datasets, one including all of the genes examined and another excluding the two nonneutral genes (Det1 and MPS33), divergence between P. dolabratum and P. cornutum is estimated to have occurred between c. 90 000 and 228 000 yr ago. This is very similar to the dates of species divergence based on ITS sequence variation, that is, 80 000–160 000 yr ago. These estimates imply that P. dolabratum and P. cornutum diverged in the Pleistocene at a time when deserts, and especially large mobile dunes, began to develop and expand in northwest and northern China (Höermann & Süssenberger, 1986; Yang, 2006). Indeed, it has been estimated that the extent of desert dunes in this region peaked c. 121 000 yr ago (Yang, 2004). Furthermore, global warming during the Holocene led to an increase in the fixed desert area in the region (Yang, 2006), which in turn may have caused a range expansion of P. dolabratum during this period.
Thus, increases in desertification and diversity of desert habitats during the Pleistocene may have acted as an effective stimulus to promote allopatric or parapatric divergence in Pugionium, leading to the origin of the two species we have studied. The ancestral theta values based on the BP&P analyses, in which no gene flow was taken into account, are larger than the nucleotide diversity measured from the extant species. However, IM analyses suggested a larger effective population size in P. cornutum and P. dolabratum than in the ancestor when gene flow was considered. Taken together, these results suggest that the two species may have diverged in the presence of gene flow (Fig. 3; Tables 3, S4), although our analyses are unable to discriminate between gene flow occurring during speciation (e.g. under conditions of parapatry) and gene flow following secondary contact after initial allopatric divergence (Zhou et al., 2007; Strasburg & Rieseberg, 2011).
Overall, our population genetic analyses show that the two species of Pugionium diverged relatively recently, within the Pleistocene, possibly as a result of adaptation to newly formed, divergent desert habitats. This work therefore provides a further example of speciation occurring during the Pleistocene, where climate change, in this case aridification, was likely to have been the underlying cause of speciation. It will be of interest in the future to analyze in detail the ways in which the two Pugionium spp. are adapted to their respective habitats and the genetic basis of adaptive differences.
We thank Huiying Shang and Miao Dong for their help in experimental work. The research was supported by grants from the National Natural Science Foundation of China (30725004) and a Royal Society-NSF China International Joint Project award 2010/R4 to R.J.A. and J.Q.L.