Phylogeographic analysis and environmental niche modeling of widespread shrub Rhododendron simsii in China reveals multiple glacial refugia during the last glacial maximum


  • Yong LI,

    1. ( College of Forestry, Henan Agricultural University, Zhengzhou 450002, China)
  • Hai-Fei YAN,

    1. ( Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China)
  • Xue-Jun GE

    Corresponding author
    1. ( Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China)
       Author for correspondence. E-mail:; Tel./Fax: 86-20-37252551.



Abstract  The phylogeography of common and widespread species can help us to understand the history of local flora and vegetation. Here, we study the semi-evergreen shrub Rhododendron simsii Planch., which is found in most areas of current evergreen broad leaved forest in China. Two noncoding chloroplast DNA (cpDNA) regions (rpl20-rps12 and trnL-F) and three amplified fragment length polymorphism (AFLP) primer sets (E-AAC/M-CTA, E-AGC/M-CTA and E-AGG/M-CAT) were used to examine the phylogeographic pattern in relation to past (last glacial maximum) and present distributions based on ecological niche modeling. The cpDNA data revealed four phylogeographic groups (East, South, West, and North groups) corresponding to geographic regions. Molecular dating suggests that lineage diversification within the species likely occurred during the mid-to-late Pleistocene. In contrast, the four main cpDNA phylogeographic groups were not supported by the AFLP dataset. The highest likelihood of the AFLP data was obtained when samples were clustered into three groups (K= 3). However, these groupings did not correspond to the separate geographic regions supported by cpDNA data. Both mismatch distribution analysis and environmental niche modeling (ENM) indicated that multiple glacial refugia were maintained across the range of Rhododendron simsii during the last glacial maximum, contrary to the previous hypothesis that subtropical broad leaved evergreen forests were forced to retreat southward as far as 25° N. The discordance between the patterns revealed by cpDNA and AFLP data indicates that localized postglacial range expansions may have facilitated extensive gene flow between the major glacial refugia.

It is now well recognized that global climate fluctuations, in particular the climatic oscillations of the Quaternary period, resulted in repeated drastic environmental changes which profoundly shaped the current distributions and genetic structures of many plant and animal species (Hewitt, 2000, 2004). Pleistocene climatic fluctuations directly impacted areas such as Europe and North America, which have thus been the focus of thorough phylogeographical studies of both plants and animals (Comes & Kadereit, 1998; Hewitt, 2000; Soltis et al., 2006). The role of these climatic fluctuations in other regions, however, remains less well understood. This is particularly the case for the subtropical region in China. Subtropical China generally refers to the hilly mid-elevation areas of Eastern China between 22° and 34° N (e.g., Wang et al., 2009). In this region, the typical vegetation is evergreen broad leaved forest (EBLF; Wang & Jiang, 2000), which is distributed between 24–32° N and 99–123° E and covers about 25% of the land area of China (Song, 1988, 1995). In contrast to North America at similar latitudes, this region has never been directly affected by extensive and unified ice-sheets (Shi et al., 1986; Liu, 1988). Nonetheless, it experienced severe climatic oscillations throughout the Quaternary period, with dramatic effects on the evolution and distribution of both plant and animal species (e.g., Axelrod et al., 1996; Millien-Parra & Jaeger, 1999; Harrison et al., 2001). As a result, this region served as one of the most important ancient relict areas for numerous Tertiary plant genera, which were more widely distributed in North America and/or Europe during the Tertiary period (65–2 Ma) but are now endemic in this region of China (Qiu et al., 2011). Although the EBLF region has never been directly covered by an ice sheet, the climate of this region was cooler by ca. 4–6 °C and drier by ca. 400–600 mm/year at the last glacial maximum (LGM, ca. 18 000 years ago; Zheng et al., 1998). The current composition and distribution of the EBLF is still influenced by the continuous climatic oscillations of the ice age and interglacial period in the Pleistocene (Cannon & Manos, 2003; Song et al., 2005).

There are two opposing views on forest responses to the Quaternary climatic changes in this region. Palaeovegetation data from East Asia have suggested that the EBLF and the mixed evergreen and deciduous broad-leaved forest (EDBLF) shifted southwards below 25 °N (An et al., 1990). It is thought that the EBLF retreated to the edge of the current tropical rain forest, based on fossils of pollen (Members of China Quaternary Pollen Data Base, 2000; Zheng, 2000). Another possibility is that some cold-adapted species in this region may have persisted throughout the LGM, and some species may have even occupied much larger areas than today (Su et al., 2005; Tian et al., 2010).

In the past ten years, most phylogeographical studies on the EBLF have focused on rare plants, for example, the Tertiary relict Cathaya argyrophylla Chun & Kuang (Wang & Ge, 2006), Ginkgo biloba L. (Gong et al., 2008), Eurycorymbus cavaleriei (Levl.) Rehd. & Hand.-Mazz. (Wang et al., 2009), Kirengeshoma palmata Yatabe and Kirengeshoma koreana Nakai (Qiu et al., 2009). Although these studies have suggested the occurrence of cryptic refugial isolation for numerous evergreen forest species in montane areas, the phylogeographic picture is still incomplete because of a current lack of large-scale genetic studies with extensive sampling throughout subtropical China. Compared to the relict species, phylogeographic data of common and widespread species can better reveal the possible refugia of the EBLF during the LGM and their recolonization routes after deglaciation.

Rhododendron simsii Planch. (Ericaceae) is a semi-evergreen shrub which is found at elevations of 500–1200 m a.s.l. and is widely distributed in sub-tropical China. As a typical component of EBLF, it occurs in most distribution areas of current EBLF in China. For the present study, we selected R. simsii as a model for inferring phylogeographic patterns in the sub-tropical region as an aid to understanding migration of EBLF in China following historical climatic oscillations.

In this study, we used two noncoding chloroplast regions (rpl20-rps12 and trnL-F), three AFLP primer sets (E-AAC/M-CTA, E-AGC/M-CTA and E-AGG/M-CAT) together with ecological niche models to examine the phylogeographic pattern of R. simsii. Our specific objectives were to address the following questions: (i) what is the genetic structure of R. simsii populations in China as revealed by cpDNA variation and AFLP data; and (ii) whether multiple glacial refugia were maintained across the range of the species during the last glacial maximum, or the species was forced to retreat southward as far as 25°N.

1 Material and methods

1.1 Population sampling

Silica-dried samples of leaf material were obtained from 32 populations distributed in 13 provinces, namely, Guangxi, Guangdong, Hunan, Hubei, Anhui, Jiangxi, Henan, Sichuan, Chongqing, Guizhou, Yunnan, Fujian, and Taiwan, representing almost the entire natural distribution area of R. simsii in China. Within each population, the sampled individuals were at least 10 m apart. The geographic information for the populations and the numbers of individuals used in the cpDNA and AFLP analyses are presented in Table 1.

Table 1.  Details of population locations, sample size, amplified fragment length polymorphism (AFLP) and cpDNA variation of Rhododendron simsii sampled in China

| Population No. and code | Location | Latitude (N) / Longitude (E) | cpDNA N | cpDNA haplotypes (No. of individuals) | cpDNA π × 10⁻³ | cpDNA h | AFLP N | AFLP PPF | AFLP HE |
|---|---|---|---|---|---|---|---|---|---|
| 1. YNML | Mile, Yunnan | 23°24′/103°18′ | 5 | H8 (5) | 0 | 0 | — | — | — |
| 2. GZBJ | Bijie, Guizhou | 27°08′/104°52′ | 5 | H8 (5) | 0 | 0 | 5 | 77.8 | 0.191 |
| 3. GZTR | Tongren, Guizhou | 27°42′/109°13′ | 5 | H2 (1), H8 (4) | 0.27 | 0.400 | 5 | 78.9 | 0.230 |
| 4. CQNC | Nanchuan, Chongqing | 29°09′/107°06′ | 7 | H8 (7) | 0 | 0 | 5 | 13.5 | 0.121 |
| 5. SCLS | Leshan, Sichuan | 29°24′/103°17′ | 9 | H8 (9) | 0 | 0 | 5 | 76.8 | 0.172 |
| 6. CQJY | Mt. Jinyun, Chongqing | 29°29′/106°13′ | 5 | H8 (5) | 0 | 0 | 5 | 81.6 | 0.199 |
| 7. HBJM | Jingmen, Hubei | 31°01′/112°11′ | 5 | H2 (1), H8 (4) | 0.27 | 0.400 | — | — | — |
| Regional mean | | | | | 0.08 | 0.114 | | 65.72 | 0.183 |
| 8. YNYS | Yanshan, Yunnan | 23°34′/104°17′ | 6 | H2 (6) | 0 | 0 | 5 | 13.0 | 0.102 |
| 9. YNGN | Guangnan, Yunnan | 23°57′/105°02′ | 5 | H2 (5) | 0 | 0 | 5 | 17.3 | 0.134 |
| 10. GXLZ | Liuzhou, Guangxi | 24°18′/109°24′ | 5 | H2 (5) | 0 | 0 | — | — | — |
| 11. GXHC | Hechi, Guangxi | 24°41′/108°03′ | 5 | H2 (5) | 0 | 0 | 5 | 80.5 | 0.182 |
| 12. GXGL | Guilin, Guangxi | 25°11′/110°12′ | 6 | H1 (2), H2 (4) | 0.36 | 0.533 | 5 | 78.9 | 0.180 |
| 13. GZGY | Guiyang, Guizhou | 26°43′/106°35′ | 5 | H2 (5) | 0 | 0 | 5 | 76.8 | 0.174 |
| 14. HNHT | Huitong, Hunan | 26°52′/109°43′ | 6 | H2 (6) | 0 | 0 | 5 | 79.5 | 0.205 |
| 15. JXJG | Mt. Jinggang, Jiangxi | 26°34′/114°09′ | 6 | H2 (4), H6 (2) | 0.73 | 0.533 | 5 | 80.5 | 0.173 |
| 16. HNHS | Mt. Heng, Hunan | 27°15′/112°39′ | 6 | H2 (5), H6 (1) | 0.45 | 0.333 | 5 | 77.3 | 0.176 |
| 17. HNCS | Changsha, Hunan | 28°12′/113°00′ | 5 | H2 (5) | 0 | 0 | 5 | 78.9 | 0.187 |
| 18. HNZJ | Zhangjiajie, Hunan | 29°04′/110°44′ | 5 | H2 (5) | 0 | 0 | 5 | 81.1 | 0.184 |
| Regional mean | | | | | 0.14 | 0.127 | | 66.38 | 0.170 |
| 19. GDHS | Heishiding, Guangdong | 23°31′/111°52′ | 5 | H4 (5) | 0 | 0 | — | — | — |
| 20. GDMZ | Meizhou, Guangdong | 24°16′/116°07′ | 5 | H3 (5) | 0 | 0 | — | — | — |
| 21. FJPH | Pinghe, Fujian | 24°17′/117°13′ | 4 | H3 (4) | 0 | 0 | 5 | 77.8 | 0.170 |
| 22. GDLC | Lechang, Guangdong | 25°13′/113°17′ | 11 | H2 (1), H3 (8), H5 (2) | 1.02 | 0.473 | 5 | 85.9 | 0.222 |
| Regional mean | | | | | 0.26 | 0.118 | | 81.85 | 0.196 |
| 23. TWDT | Mt. Datun, Taiwan | 25°06′/121°19′ | 4 | H6 (4) | 0 | 0 | 5 | 82.2 | 0.221 |
| 24. FJMD | Mt. Mangdang, Fujian | 26°34′/118°16′ | 5 | H6 (5) | 0 | 0 | 5 | 80.5 | 0.208 |
| 25. FJPC | Pucheng, Fujian | 27°54′/118°32′ | 5 | H6 (5) | 0 | 0 | 5 | 17.8 | 0.141 |
| 26. JXSQ | Mt. Sanqing, Jiangxi | 28°48′/117°54′ | 4 | H6 (4) | 0 | 0 | — | — | — |
| 27. JXLS | Mt. Lu, Jiangxi | 29°25′/115°52′ | 7 | H6 (7) | 0 | 0 | 5 | 80.0 | 0.169 |
| 28. AHHS | Mt. Huang, Anhui | 29°42′/118°18′ | 6 | H6 (6) | 0 | 0 | 5 | 79.5 | 0.162 |
| 29. ZJTM | Mt. Tianmu, Zhejiang | 30°18′/119°10′ | 5 | H6 (3), H7 (2) | 0.41 | 0.600 | 5 | 80.5 | 0.173 |
| 30. AHTZ | Mt. Tianzhu, Anhui | 30°46′/116°31′ | 5 | H6 (5) | 0 | 0 | 5 | 77.3 | 0.188 |
| 31. HBMC | Macheng, Hubei | 31°14′/115°03′ | 8 | H6 (8) | 0 | 0 | 5 | 18.9 | 0.151 |
| 32. HNJG | Mt. Jigong, Henan | 32°07′/114°04′ | 7 | H6 (7) | 0 | 0 | 5 | 78.9 | 0.177 |
| Regional mean | | | | | 0.04 | 0.060 | | 66.18 | 0.177 |
| Species mean | | | | | 0.128 | 0.102 | | 67.37 | 0.177 |
| Total | | | | | 0.106 | 0.749 | | 65.95 | 0.203 |

Sample size (N) is indicated for the cpDNA and AFLP analysis separately; —, not analyzed; PPF, percentage of polymorphic fragments; HE, Nei's (1973) measure of gene diversity; h, haplotype diversity; π, nucleotide diversity.

1.2 DNA sequencing and AFLP fingerprinting

Total DNA was extracted from roughly 30 mg of dried leaf tissue using a Plant Genomic DNA Kit (QIAGEN, Valencia, CA, USA) according to the manufacturer's protocol. After preliminary screening, we chose the rpl20-rps12 (Hamilton, 1999) and trnL-F (Taberlet et al., 1991) intergenic spacers for the full survey because they contained the most polymorphic sites. Polymerase chain reaction (PCR) was performed in a reaction volume of 30 μL containing 30 ng of genomic DNA, 0.2 mmol/L of each dNTP, 0.3 μmol/L of each primer, 3 μL Taq buffer and 1 unit of Taq polymerase (Takara). The PCR protocol involved initial denaturation for 4 min at 94 °C, followed by 35 cycles of 40 s at 94 °C, 45 s at 48 °C (rpl20-rps12) or 50 °C (trnL-F) and 90 s at 72 °C, and a final extension step of 8 min at 72 °C. The PCR products were purified with an E.Z.N.A. Gel Extraction Kit (Omega Bio-Tek, Winooski, VT, USA) and then sequenced on an ABI 3730 DNA Sequence Analyzer at Shanghai Invitrogen Biotechnology Co., Ltd (Shanghai, China).

The AFLP analysis followed the procedure reported by Vos et al. (1995) with modifications. Twelve primer pairs were screened, and three were chosen for the analyses because they gave clear, reproducible bands. The extracted DNA was digested with EcoRI and MseI and ligated to double-stranded EcoRI and MseI adapters (Vos et al., 1995). Three primer sets (E-AAC/M-CTA, E-AGC/M-CTA and E-AGG/M-CAT) were used for selective amplification. The EcoRI selective primers were 5′-fluorescent-labelled (D4-PA). For each individual sample, 3 μL of 10-fold diluted selective amplification product was suspended in 15 μL of HiDi formamide with 0.1 μL Standard-600 internal size standard and run on a Beckman Coulter CEQ 8000 Genetic Analysis System (Beckman Coulter Inc., Fullerton, CA, USA).

1.3 Past and current distribution inferences

To investigate the effect of cold periods (such as the LGM) on the distribution of R. simsii, we inferred the distribution range using ecological niche models. Assuming that the species did not change its climatic preference, the range of R. simsii at the LGM was reconstructed from the current distribution by implementing a maximum entropy model (Maxent 3.1.0; Phillips et al., 2006). Maxent has been considered more robust than other methods for predicting past and present species distributions (see Flanders et al., 2011). We used a set of 100 presence points covering the whole distribution range of R. simsii in China: 68 points were obtained from the Chinese Virtual Herbarium (CVH) and 32 points were from sampling sites. Current bioclimatic variables and LGM data were downloaded from the WorldClim database (Hijmans et al., 2005). Last glacial maximum data were simulated using the Community Climate System Model (CCSM; Collins et al., 2006); 19 environmental parameters were employed to model the potential distribution of the species. To test the performance of each model, 20% of the data in each run were randomly selected by Maxent and compared with the model output generated with the remaining data. The area under the receiver operating characteristic curve (area under curve, AUC) was used to compare model performance (Phillips et al., 2006).
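The evaluation step described above (holding out 20% of the presence points and scoring the model by AUC) can be illustrated with a minimal sketch. This is not the Maxent implementation: the function names are hypothetical, and the AUC is computed here as the rank-based probability that a presence site receives a higher suitability score than a background site, which is one common way of expressing it.

```python
import random


def split_presence_points(points, test_fraction=0.2, seed=0):
    """Randomly hold out a fraction of presence points for model testing."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # Return (training points, test points)
    return shuffled[n_test:], shuffled[:n_test]


def auc(presence_scores, background_scores):
    """Probability that a random presence score exceeds a random background score.

    Ties count as half a win, as in the Mann-Whitney formulation of AUC.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical usage: 100 presence points, as in the text, split 80/20.
training, test = split_presence_points(list(range(100)))
```

An AUC of 0.5 corresponds to a model that ranks presence sites no better than chance, and values close to 1 indicate that presence sites are consistently scored above background sites.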

1.4 Data analysis

The AFLP raw data were scored and converted to a presence/absence matrix according to the presence or absence of peaks identified by the CEQ8000 fragment analysis software (Beckman Coulter Inc.). Fragments that differed in size by less than 1.5 nucleotides on average were grouped and automatically scored by the program. Only bands between 60 bp and 600 bp with peaks above 1000 relative fluorescent units were scored, to minimize the scoring of false (artifact) bands. The following population genetic analyses were performed on the presence/absence matrix of AFLP fragments. To assess levels of genetic diversity, we calculated the percentage of polymorphic fragments (PPF) and Nei's (1973) gene diversity (HE) for each population and at the species level following the method of Lynch & Milligan (1994) using AFLPSURV (version 1.0; Vekemans, 2002).
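As a rough illustration of how the two diversity measures are derived from a presence/absence matrix, the sketch below computes PPF and an expected heterozygosity. It is a simplification, not the AFLPSURV procedure: it assumes Hardy-Weinberg equilibrium and uses the plain square-root estimate of the null-allele frequency, without the Lynch & Milligan (1994) bias correction, and it counts a fragment as polymorphic when it is present in some but not all individuals. The function names are hypothetical.

```python
from math import sqrt


def ppf(matrix):
    """Percentage of fragments that are present in some, but not all, individuals."""
    n_individuals = len(matrix)
    n_loci = len(matrix[0])
    polymorphic = 0
    for j in range(n_loci):
        present = sum(row[j] for row in matrix)
        if 0 < present < n_individuals:
            polymorphic += 1
    return 100.0 * polymorphic / n_loci


def gene_diversity(matrix):
    """Mean expected heterozygosity over loci for dominant markers.

    Assumption: the null-allele frequency q is the square root of the
    band-absent fraction (Hardy-Weinberg equilibrium, no bias correction).
    """
    n_individuals = len(matrix)
    n_loci = len(matrix[0])
    total = 0.0
    for j in range(n_loci):
        absent = sum(1 for row in matrix if row[j] == 0)
        q = sqrt(absent / n_individuals)
        total += 2.0 * q * (1.0 - q)
    return total / n_loci


# Hypothetical matrix: 4 individuals (rows) scored at 3 fragments (columns).
example = [[1, 0, 1], [1, 0, 0], [1, 0, 1], [1, 0, 0]]
```

Values from this simplified estimator will generally differ somewhat from the bias-corrected ones reported in Table 1.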

To infer the most likely number of population genetic clusters (K) in the AFLP dataset, we used a Bayesian analysis of population structure as implemented in STRUCTURE (version 2.2; Pritchard et al., 2000). The K value ranged from 1 to 10, with 10 replicates performed for each value of K, each employing a burn-in period of 2 × 10⁵ iterations followed by 5 × 10⁴ Markov chain Monte Carlo (MCMC) iterations. The “no admixture model” and independent allele frequencies were chosen for this analysis. The most likely number of clusters was estimated from the statistic ΔK, which is based on the second-order rate of change of the likelihood function with respect to K (see Evanno et al., 2005). Genetic divergence among populations was inferred from Nei's (1973) estimator of population substructure (GST), as well as from ΦST obtained from non-hierarchical analyses of molecular variance (AMOVAs) in ARLEQUIN (version 3.1; Excoffier et al., 2005) using Euclidean distances among AFLP phenotypes. Hierarchical AMOVA was also used to quantify the partitioning of AFLP variance between regional groups of populations (ΦCT) and between populations within such groups (ΦSC). Significance levels of Φ-statistics were based on 10 000 permutations (Excoffier et al., 1992).
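The ΔK criterion can be sketched as follows, assuming that the replicate log-likelihoods ln P(D) for each K have already been collected from the STRUCTURE output. The sketch uses one common formulation: the absolute second difference of the mean log-likelihoods, divided by the standard deviation across replicates at that K. The function names and the input layout are hypothetical.

```python
from statistics import mean, stdev


def delta_k(ln_prob):
    """Compute Delta K for each interior K.

    ln_prob maps each K to a list of replicate ln P(D) values.
    Delta K is undefined for the smallest and largest K tested.
    """
    ks = sorted(ln_prob)
    avg = {k: mean(ln_prob[k]) for k in ks}
    result = {}
    for k in ks:
        if k - 1 not in avg or k + 1 not in avg:
            continue  # needs both neighbouring K values
        spread = stdev(ln_prob[k])
        if spread == 0:
            continue  # ratio undefined when all replicates are identical
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        result[k] = second_diff / spread
    return result


def best_k(ln_prob):
    """Return the K with the largest Delta K."""
    scores = delta_k(ln_prob)
    return max(scores, key=scores.get)
```

Because ΔK requires both K − 1 and K + 1, a run over K = 1 to 10 can only evaluate K = 2 to 9 with this criterion.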

Chloroplast sequences were edited manually based on the chromatograms and aligned using CLUSTAL_X (version 1.81; Thompson et al., 1997). After alignment, the two identified indels were coded as substitutions following the method of Caicedo & Schaal (2004). Haplotype diversity (h) and nucleotide diversity (π) were calculated for each population (hS, πS) and at the species level (hT, πT) using DNASP (version 4.0; Rozas et al., 2003). To test for phylogeographic structure of haplotype variation across the distribution of the species, PERMUT2, as described by Pons & Petit (1996), was used to calculate GST and NST. To identify groups of populations that were geographically homogeneous and maximally differentiated from each other, spatial analysis of molecular variance was carried out using SAMOVA 1.0 (Dupanloup et al., 2002). Genetic relationships among the 32 populations of R. simsii were revealed using the procedures NEIGHBOR and CONSENSE within the program PHYLIP (version 3.63; Felsenstein, 2004). Phylogenetic relationships between cpDNA haplotypes were assessed by neighbor-joining (NJ) analysis in PAUP* (version 4.0 beta10; Swofford, 2002) using Rhododendron championae Hook. as an outgroup; genealogical relationships between cpDNA haplotypes were inferred from a statistical parsimony network calculated in TCS (version 1.21; Clement et al., 2000), which showed all linkages between haplotypes with a > 95% probability of being most parsimonious. Estimating the divergence times of lineages within a species can help to clarify the historical events that the species has experienced. Because reliable fossil records are lacking for the genus Rhododendron, the average divergence time was estimated via calibration of the molecular clock implemented in MEGA (version 5.1; Tamura et al., 2011), assuming a rate of 1.01 × 10⁻⁹ substitutions per site per year for synonymous sites of cpDNA in seed plants (Graur & Li, 1999).
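For readers who want to check the haplotype diversity values in Table 1, the short sketch below applies the standard unbiased estimator h = n/(n − 1) × (1 − Σ pᵢ²), where pᵢ is the frequency of haplotype i among n sampled individuals. The function name is hypothetical; the haplotype counts in the usage lines are taken from Table 1.

```python
def haplotype_diversity(counts):
    """Unbiased haplotype diversity from a list of per-haplotype counts."""
    n = sum(counts)
    if n < 2:
        return 0.0  # diversity is undefined for a single individual
    sum_sq = sum(c * c for c in counts) / (n * n)
    return n / (n - 1) * (1.0 - sum_sq)


# Counts from Table 1
p3 = haplotype_diversity([1, 4])      # population 3: H2 (1), H8 (4)
p22 = haplotype_diversity([1, 8, 2])  # population 22: H2 (1), H3 (8), H5 (2)
p29 = haplotype_diversity([3, 2])     # population 29: H6 (3), H7 (2)
```

Rounded to three decimals, these give 0.400, 0.473 and 0.600, matching the h column of Table 1 for populations 3, 22 and 29.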

To determine whether the cpDNA sequences satisfied the assumption of neutrality, we calculated Tajima's D (Tajima, 1989), Fu & Li's D* (Fu & Li, 1993) and Fu's Fs (Fu, 1997) values for the entire species and groups of populations, using DNASP. Statistical significance of D, D* and Fs was estimated with coalescent simulations as implemented in this program. In general, departures of these statistics from zero indicate deviation from standard neutrality, but might also be taken as evidence of recent demographic expansions or population bottlenecks in the case where markers are otherwise assumed to be independent of selection (Tajima, 1989; Fu, 1997). To further infer demographic processes, we explicitly tested the null hypotheses of constant population size in DNASP by comparing observed and expected distributions of pairwise sequence differences (mismatch distributions).

For both AFLP and cpDNA data, tests of isolation by distance (IBD) were performed by regressing values of ΦST against geographic distance (km) with the Mantel permutation procedure as implemented in IBD (Isolation by Distance Web Service, version 3.16; Jensen et al., 2005). Finally, a pollen/seed migration ratio (r) was calculated using a modified equation of Ennos (1994), according to Petit et al. (2005), with AMOVA-derived ΦST values (instead of GST) taken as estimators of population differentiation: $r = m_p/m_s = [(1/\Phi_{ST(n)} - 1) - 2(1/\Phi_{ST(c)} - 1)] / (1/\Phi_{ST(c)} - 1)$, where $m_p$ is the pollen migration rate, $m_s$ is the seed migration rate, $\Phi_{ST(n)}$ is the nuclear (AFLP) ΦST, and $\Phi_{ST(c)}$ is the cytoplasmic (cpDNA) ΦST.
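The logic of a Mantel test for isolation by distance can be sketched in a few lines. This is an illustrative, one-tailed permutation version and not the IBD web service code: it correlates the pairwise entries of a genetic distance matrix with those of a geographic distance matrix, then repeatedly shuffles the population labels of one matrix to build a null distribution. The function names, the default of 999 permutations and the fixed seed are assumptions made for the example.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle entries of a symmetric matrix after relabelling rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-tailed p) for the correlation between two distance matrices."""
    n = len(genetic)
    identity = list(range(n))
    geo = _upper(geographic, identity)
    observed = _pearson(_upper(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if _pearson(_upper(genetic, order), geo) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)
```

Permuting whole rows and columns together, rather than individual matrix cells, preserves the dependence among the pairwise distances that share a population, which is why an ordinary regression P-value is not used here.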

2 Results

2.1 The cpDNA diversity and population structure

Of the two cpDNA-IGS regions sequenced in R. simsii (182 individuals, 32 populations), one showed length variation (rpl20-rps12: 673–682 bp), whereas trnL-F had a uniform length of 784 bp. When combined, these sequences (1457–1466 bp) were aligned with a consensus length of 1466 bp, and contained six nucleotide substitutions and one indel of 9 bp. Based on these polymorphisms, eight cpDNA haplotypes (H1–H8) were identified among all the samples surveyed (Table 2). The six rpl20-rps12 and three trnL-F haplotype sequences have been deposited in the GenBank database under accession numbers GQ184582-GQ184587 and GQ184590-GQ184592, respectively. Among the eight haplotypes detected, the most widespread were H2 (in 14 of 32 populations), H6 (in 12 of 32 populations) and H8 (in 7 of 32 populations). The geographical distribution of haplotypes H1–H8 and their occurrence at each locality are shown in Fig. 1B and Table 1. Only seven of the 32 populations were polymorphic (P3, P7, P12, P15, P16, P22 and P29), whereas each of the other populations exhibited a single haplotype. The statistical parsimony network of haplotypes H1–H8 revealed that they were only one to five mutational steps apart. Haplotypes H1 and H2 were likely to represent ancestral haplotypes, while H3 and H7 were probably derived (Fig. 1C).

Table 2.  Chloroplast DNA sequence polymorphisms detected in two intergenic spacer (IGS) regions of Rhododendron simsii identifying eight haplotypes (H1–H8)
HaplotypeNucleotide position
rpl20-rps12 trnL-F
  1. a All sequences are relative to the reference haplotype H1. Numbers 1/0 in sequences denote presence/absence of length polymorphism. a, ATCTTAAAT.

Figure 1.

Geographic distribution and genealogical relationships of cpDNA haplotypes recovered from Rhododendron simsii populations in China. A, Distribution ranges of R. simsii (red lines) in China. Map obtained from software ARC-GIS. B, Sampling localities of 32 populations of R. simsii and the geographic distribution of eight cpDNA haplotypes (H1–H8) detected. N, North group; W, West group; S, South group; E, East group. Map obtained from software DIVA-GIS. C, The statistical parsimony network of cpDNA haplotypes H1–H8. The area of circles corresponds to the frequency of each haplotype. Each solid line represents one mutational step interconnecting two haplotypes for which parsimony is supported at the 95% level. 1. Mile, Yunnan; 2. Bijie, Guizhou; 3. Tongren, Guizhou; 4. Nanchuan, Chongqing; 5. Leshan, Sichuan; 6. Mt. Jinyun, Chongqing; 7. Jingmen, Hubei; 8. Yanshan, Yunnan; 9. Guangnan, Yunnan; 10. Liuzhou, Guangxi; 11. Hechi, Guangxi; 12. Guilin, Guangxi; 13. Guiyang, Guizhou; 14. Huitong, Hunan; 15. Mt. Jinggang, Jiangxi; 16. Mt. Heng, Hunan; 17. Changsha, Hunan; 18. Zhangjiajie, Hunan; 19. Heishiding, Guangdong; 20. Meizhou, Guangdong; 21. Pinghe, Fujian; 22. Lechang, Guangdong; 23. Mt. Datun, Taiwan; 24. Mt. Mangdang, Fujian; 25. Pucheng, Fujian; 26. Mt. Sanqing, Jiangxi; 27. Mt. Lu, Jiangxi; 28. Mt. Huang, Anhui; 29. Mt. Tianmu, Zhejiang; 30. Mt. Tianzhu, Anhui; 31. Macheng, Hubei; 32. Mt. Jigong, Henan.

The haplotype diversity and nucleotide diversity were hT= 0.749 and πT= 1.06 × 10⁻³, respectively. The highest nucleotide diversity and haplotype diversity were found in P22 and P29, respectively (Table 1). A permutation test for phylogeographic structure of haplotype variation across the entire distribution of the species showed that NST (0.897) was significantly higher than GST (0.865) (P < 0.01), indicating that closely related haplotypes were more likely to co-occur in the same region.

The SAMOVA analysis revealed a progressive increase in FCT until K= 6. When K= 3, FCT was smaller than for K= 4, and as K increased beyond 4 the overall group structure was lost, i.e., populations with a large proportion of a single haplotype appeared as separate clusters. Therefore, we used K= 4 as the best grouping scheme. The compositions of the groups for K= 4 were in close agreement with the geographical distribution of chloroplast haplotypes. The four groups identified were as follows: populations 1–7 in the North group, 8–18 in the West group, 19–22 in the South group and 23–32 in the East group. The subdivision was also supported by NJ clusters of populations based on cpDNA data (Fig. 2). Twenty-five of the 32 populations (78%) exhibited a single haplotype. Among the seven polymorphic populations, only population 22 in the South region harbored three haplotypes; the other six each comprised two haplotypes (Fig. 1; Table 1).

Figure 2.

Neighbor-joining (NJ) clustering of 32 populations of Rhododendron simsii based on ΦST of cpDNA markers among populations. 1. Mile, Yunnan; 2. Bijie, Guizhou; 3. Tongren, Guizhou; 4. Nanchuan, Chongqing; 5. Leshan, Sichuan; 6. Mt. Jinyun, Chongqing; 7. Jingmen, Hubei; 8. Yanshan, Yunnan; 9. Guangnan, Yunnan; 10. Liuzhou, Guangxi; 11. Hechi, Guangxi; 12. Guilin, Guangxi; 13. Guiyang, Guizhou; 14. Huitong, Hunan; 15. Mt. Jinggang, Jiangxi; 16. Mt. Heng, Hunan; 17. Changsha, Hunan; 18. Zhangjiajie, Hunan; 19. Heishiding, Guangdong; 20. Meizhou, Guangdong; 21. Pinghe, Fujian; 22. Lechang, Guangdong; 23. Mt. Datun, Taiwan; 24. Mt. Mangdang, Fujian; 25. Pucheng, Fujian; 26. Mt. Sanqing, Jiangxi; 27. Mt. Lu, Jiangxi; 28. Mt. Huang, Anhui; 29. Mt. Tianmu, Zhejiang; 30. Mt. Tianzhu, Anhui; 31. Macheng, Hubei; 32. Mt. Jigong, Henan.

Non-hierarchical AMOVA (Table 3) revealed a strong population genetic structure for cpDNA sequence variation at the entire species-range scale (ΦST= 0.866, P < 0.001). Most of this differentiation was partitioned among the four regions (86.8% of total variance; ΦCT= 0.868), and only 2.7% was explained by differences among populations within regions (ΦSC= 0.203) (Table 3). Significant IBD for cpDNA was detected at the entire species-range scale (r= 0.499, P < 0.001), whereas no significant IBD was detected within any sub-region (North: r= 0.129, P= 0.229; West: r= 0.091, P= 0.271; South: r= 0.589, P= 0.092; East: r=−0.094, P= 0.456). Among the four groups, population subdivision was greater in the South group (ΦST= 0.337) than in the other three groups (North: ΦST= 0.012; West: ΦST= 0.114; East: ΦST= 0.274).

Table 3.  Non-hierarchical and hierarchical AMOVAs for AFLP and cpDNA variation surveyed in populations of Rhododendron simsii in China.

| Source of variation | d.f. | % Total variance | Φ-statistic | P-value |
|---|---|---|---|---|
| AFLP: non-hierarchical AMOVAs | | | | |
| Total | 25 | 23.6% | ΦST= 0.236 | <0.001 |
| North | 4 | 18.8% | ΦST= 0.188 | <0.001 |
| West | 9 | 26.7% | ΦST= 0.267 | <0.001 |
| South | 1 | 27.3% | ΦST= 0.273 | <0.001 |
| East | 8 | 19.6% | ΦST= 0.196 | <0.001 |
| AFLP: hierarchical AMOVAs | | | | |
| Among four groups | 3 | 1.7% | ΦCT= 0.016 | >0.05 |
| Among populations | 22 | 22.3% | ΦSC= 0.227 | <0.001 |
| Within populations | 104 | 76.0% | ΦST= 0.240 | <0.001 |
| cpDNA: non-hierarchical AMOVAs | | | | |
| Total | 31 | 86.6% | ΦST= 0.866 | <0.001 |
| North | 6 | 1.21% | ΦST= 0.012 | >0.05 |
| West | 10 | 11.4% | ΦST= 0.114 | <0.05 |
| South | 3 | 33.7% | ΦST= 0.337 | <0.05 |
| East | 9 | 27.4% | ΦST= 0.274 | <0.05 |
| cpDNA: hierarchical AMOVAs | | | | |
| Among four groups | 3 | 86.8% | ΦCT= 0.868 | <0.001 |
| Among populations | 28 | 2.7% | ΦSC= 0.203 | <0.001 |
| Within populations | 150 | 10.6% | ΦST= 0.895 | <0.001 |
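The Φ-statistics of the hierarchical AMOVAs in Table 3 follow from the three variance components by their standard definitions: ΦCT is the among-group share of total variance, ΦSC is the among-population share of the variance found within groups, and ΦST is the share not found within populations. The sketch below, with a hypothetical function name, applies these definitions to the percentages listed in the table.

```python
def phi_statistics(among_groups, among_pops, within_pops):
    """Phi-statistics from hierarchical AMOVA variance components (any common unit)."""
    total = among_groups + among_pops + within_pops
    phi_ct = among_groups / total
    phi_sc = among_pops / (among_pops + within_pops)
    phi_st = (among_groups + among_pops) / total
    return phi_ct, phi_sc, phi_st


# Percent variance components from Table 3
cpdna = phi_statistics(86.8, 2.7, 10.6)
aflp = phi_statistics(1.7, 22.3, 76.0)
```

Because the published percentages are rounded, the recomputed values agree with the tabulated Φ-statistics only to within about 0.001.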

Tajima's D, Fu and Li's D* and Fu's Fs statistics for deviation from neutrality were non-significant for each geographic region and for the whole species (all P > 0.05). Consistent with this, the observed mismatch distributions of haplotypes from each region and for the whole range did not differ significantly from the mismatches expected for models of constant population size (all P > 0.05). According to the molecular clock calibration, the mean divergence time of nodes ranged from 0.97 Ma (node A) to 0.34 Ma (node G) (Fig. 3), suggesting that lineage divergence within the species falls into the mid-to-late Pleistocene.
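The relation between a pairwise sequence distance and a divergence time under a strict molecular clock can be sketched as T = d / (2μ), where d is the number of substitutions per site separating two lineages and μ is the substitution rate per site per year. The rate below is the one cited in the methods (Graur & Li, 1999); whether MEGA applied exactly this relation to produce the reported node ages is an assumption, and the function names are hypothetical.

```python
SUBSTITUTION_RATE = 1.01e-9  # substitutions per site per year, as cited in the text


def divergence_time_ma(distance, rate=SUBSTITUTION_RATE):
    """Divergence time in million years for a pairwise distance (substitutions/site)."""
    return distance / (2.0 * rate) / 1e6


def implied_distance(time_ma, rate=SUBSTITUTION_RATE):
    """Pairwise distance implied by a divergence time in million years."""
    return 2.0 * rate * time_ma * 1e6


# Distance that a strict clock would associate with the oldest reported node (0.97 Ma)
d_node_a = implied_distance(0.97)
```

The factor of two reflects that substitutions accumulate independently along both lineages after they split.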

Figure 3.

Neighbor-joining (NJ) clustering of eight haplotype sequences of Rhododendron simsii. Rhododendron championae Hook. was used as an outgroup.

2.2 Patterns of AFLP variation and population structure

A total of 130 individuals of R. simsii from 26 populations were successfully scored for the three AFLP primer combinations used, resulting in 185 unambiguous fragments ranging in size from 60 to 600 bp. Of these fragments, 122 were polymorphic, i.e., the percentage of polymorphic fragments (PPF) for this species was 65.9%. Within-population gene diversity (HE) ranged from 0.102 to 0.230, with an average of 0.181, whereas total (species-wide) gene diversity (HT) was estimated to be 0.203. The highest and lowest HE values were observed in P3 and P8, respectively. In the Bayesian analysis of population structure (Fig. 4), the highest likelihood of the data was obtained when samples were clustered into three groups (K= 3) (data not shown): group I (populations 4–6, 15–17, 21, 25, 27–32); group II (populations 8, 9, 11, 12, 22–24); and group III (populations 2, 3, 13, 14, 18). However, the resulting groupings did not correspond to the separate geographic regions supported by cpDNA data.

Figure 4.

Estimated genetic structure for K= 3 obtained with the program STRUCTURE (Pritchard et al., 2000) for 26 populations of Rhododendron simsii based on amplified fragment length polymorphism (AFLP) variation. Each vertical bar represents an individual and its assignment proportion (not shown) into one of three (colored) population clusters (K).

Non-hierarchical AMOVA, as well as Nei's (1973) estimator of population substructure (GST), indicated high levels of population differentiation for AFLP in R. simsii at the entire species-range scale (ΦST= 0.236; GST= 0.240; Table 3). The species’ hierarchical (regional) substructure was weak and non-significant (ΦCT= 0.016, P > 0.05), but population divergence within regions remained high (ΦSC= 0.227, P < 0.001; Table 3). Population subdivision was highest in the South (ΦST= 0.273) and West (ΦST= 0.267) groups and lower in the North (ΦST= 0.188) and East (ΦST= 0.196) groups. Mantel tests of IBD revealed a significant correlation between AFLP differentiation among populations (ΦST) and geographical distance throughout the full range of R. simsii in China (r= 0.384, P < 0.001). Similarly, significant IBD was observed among populations of the West (r= 0.648, P= 0.002) and East groups (r= 0.474, P= 0.013); the North group, however, exhibited no significant IBD (r= 0.057, P= 0.459). The South group was not included in the IBD analysis because it comprised only two populations.

Using the AFLP-derived ΦST(n) value of 0.236 across all the populations surveyed (see above), and the corresponding value for cpDNA, ΦST(c)= 0.866, the pollen/seed migration ratio (r) was calculated as 18.9, indicating that pollen flow was substantially higher than seed flow.
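The reported ratio can be reproduced with a short calculation. The sketch assumes the form of the Ennos (1994) ratio without an inbreeding term, r = (A − 2C)/C with A = 1/ΦST(n) − 1 and C = 1/ΦST(c) − 1; this form is an assumption, adopted because it reproduces the value given in the text. The function name is hypothetical.

```python
def pollen_seed_ratio(phi_st_nuclear, phi_st_cytoplasmic):
    """Pollen-to-seed migration ratio from nuclear and cytoplasmic differentiation.

    Assumes no inbreeding term: r = (A - 2C) / C,
    with A = 1/phi_n - 1 and C = 1/phi_c - 1.
    """
    a = 1.0 / phi_st_nuclear - 1.0
    c = 1.0 / phi_st_cytoplasmic - 1.0
    return (a - 2.0 * c) / c


r = pollen_seed_ratio(0.236, 0.866)  # AFLP and cpDNA values from the text
```

With ΦST(n) = 0.236 and ΦST(c) = 0.866, A ≈ 3.237 and C ≈ 0.155, giving r ≈ 18.9, in agreement with the value reported above.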

2.3 Distribution inference by ecological niche modeling

The inferred current and past (LGM) distributions of R. simsii are illustrated in Fig. 5. The AUC values based on both training and test presence data for the present and at the LGM were all higher than expected by chance (0.993 and 0.989, 0.992 and 0.991, respectively), demonstrating good model performance. Notably, the current distribution model indicated that R. simsii is experiencing habitat fragmentation, with areas of suitable habitat isolated by intervening unsuitable habitat (Fig. 5A). Comparison of the two simulated distributions showed that the habitat predicted for the LGM was smaller than the potential current distribution in most of the distribution area, apart from the West group, and that the areas most suitable for habitation (red area) were isolated; the northern edge (Hubei and Shaanxi) and the eastern edge (Anhui, Zhejiang and Fujian) had clearly retreated during the LGM (Fig. 5B).

Figure 5.

Maps showing the potential distribution of Rhododendron simsii in China as a probability of occurrence, estimated by ecological niche models (ENM) using bioclimatic variables. The maps were drawn with the software DIVA-GIS (grey, 0–0.2; green, 0.2–0.4; yellow, 0.4–0.6; orange, 0.6–0.8; red, 0.8–1.0). A, the present. B, at the Last Glacial Maximum (LGM; 21 thousand years before present (ka)) based on the CCSM model.

3 Discussion

3.1 Genetic diversity and spatial population genetic structure

Generally, geographic distribution, breeding system and population size all affect genetic diversity in plant species (Hamrick & Godt, 1989). Rhododendron simsii is an outcrossing, animal-pollinated and geographically widespread species (Ng & Corlett, 2000), and would therefore be expected to have relatively high genetic diversity. Unexpectedly, however, the species-wide level of haplotype diversity in R. simsii (hT= 0.749) was lower than the average for ten other seed plants in China surveyed with maternally inherited markers (see Qiu et al., 2011; mean hT= 0.817). The species-wide level of AFLP-derived genetic diversity (HT= 0.203) was also lower than the average for other plant species assessed with AFLP markers (see Nybom, 2004; mean HT= 0.230). The current distribution predicted by ecological niche modeling suggested that R. simsii is experiencing climate-driven habitat fragmentation. Habitat fragmentation was probably a key factor contributing to the loss of genetic diversity in R. simsii, which would explain why its diversity is below the average for other plant species.

Both chloroplast and AFLP data demonstrated significant population differentiation within R. simsii. Population subdivision based on cpDNA data (ΦST= 0.866) was higher than the average level of ten other seed plants for maternally inherited markers in China (see Qiu et al., 2011; mean ΦST= 0.813). Another striking feature of R. simsii was the marked regional differentiation (ΦST= 0.868). In contrast to the significant phylogeographical structure obtained with cpDNA, only a moderate level of genetic differentiation and phylogeographical structure was suggested by the AFLP variation when examined over all populations (ΦST= 0.236).

Generally, restricted gene flow, drift, and inbreeding tend to increase genetic differentiation among populations (Jacquemyn et al., 2004). The seeds of R. simsii are small, bear wing-like structures and are dispersed by wind. However, their dispersal distance is reportedly only 30–80 m, and most seeds are spread within the same population (Ng & Corlett, 2000). Limited seed-mediated gene flow might therefore be one important factor causing the high genetic differentiation observed among populations of R. simsii. The pollen of Rhododendron is transmitted by bees, and the dispersal distance has been measured as 3–10 km (Ng & Corlett, 2000). The calculated pollen to seed migration ratio (r) for R. simsii (r = 18.9) was slightly higher than the corresponding average value reported for other seed plant species (median r≈ 17, estimated over 93 species, Petit et al., 2005; Hodgins & Barrett, 2007). The AFLP data supported the clustering of populations into three groups, each group including samples from two or more separate cpDNA geographic groups. These results suggest that a high level of pollen-mediated gene flow has weakened the boundaries between the four regions. A significant IBD pattern was indicated by both the AFLP (r = 0.384, P < 0.001) and cpDNA (r = 0.499, P < 0.001) analyses, suggesting restricted gene flow at the whole species level. However, within the isolated sub-regions, the IBD pattern was statistically non-significant according to the cpDNA data. The same was observed in the AFLP data for the North group (r = 0.0568, P = 0.459). Overall, these results suggest that gene flow was enhanced over a narrow range.
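The pollen to seed migration ratio can be illustrated with a short calculation. The sketch below assumes the estimator of Ennos (1994), r = [(1/ΦST(AFLP) − 1)(1 + FIS) − 2(1/ΦST(cpDNA) − 1)] / (1/ΦST(cpDNA) − 1), and further assumes FIS = 0, because dominant AFLP markers do not readily yield an inbreeding estimate. Neither the estimator nor that assumption is stated in this section, and the function name is hypothetical. With the ΦST values reported above (0.236 for AFLP and 0.866 for cpDNA), the calculation gives a value consistent with the reported r = 18.9.

```python
def pollen_seed_ratio(phi_st_nuclear, phi_st_maternal, f_is=0.0):
    """Pollen/seed migration ratio, assuming the Ennos (1994) estimator.

    phi_st_nuclear:  differentiation at biparentally inherited markers (AFLP)
    phi_st_maternal: differentiation at maternally inherited markers (cpDNA)
    f_is:            inbreeding coefficient (assumed 0 here)
    """
    a = (1.0 / phi_st_nuclear - 1.0) * (1.0 + f_is)
    c = 1.0 / phi_st_maternal - 1.0
    return (a - 2.0 * c) / c


# Phi_ST values as reported in the text; F_IS = 0 is an assumption.
r = pollen_seed_ratio(0.236, 0.866)
print(round(r, 1))  # 18.9
```

Because the maternal term enters both the numerator and the denominator, the estimate is sensitive to small changes in the cpDNA ΦST when that value is close to 1.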

3.2 Inference of phylogeographical history in R. simsii

As subtropical China was not directly covered by an ice sheet during the Pleistocene, there are two hypotheses concerning the occurrence of refugia and postglacial expansion in this region. One view is that EBLF migrated to the tropical South (≤25°N) without leaving forest stands further north (An et al., 1990; Yu et al., 2000; Harrison et al., 2001). The second hypothesis is that some cold-adapted species may have persisted in the northern region throughout the LGM and occupied multiple localized glacial refugia, as suggested by some phylogeographical studies in this region on species such as Cathaya argyrophylla (Wang & Ge, 2006), Taxus wallichiana Zucc. (Gao et al., 2007), and Eurycorymbus cavaleriei (Wang et al., 2009).

More comprehensive studies of species with wide distribution ranges are needed to test these two hypotheses. Here, we sequenced two intergenic chloroplast spacers (1466 bp) in populations from across the entire R. simsii distribution area in China. The partitioning of genetic variability had a significant geographic component and, despite some incongruence with AFLP markers, four major groups could be distinguished (Fig. 1). Phylogenetic analyses and SAMOVA of cpDNA haplotypes also supported this subdivision. Three of the groups were broadly distributed geographically (North group, West group and East group), while the South group comprised only four populations. The mismatch distribution and the positive Tajima's D, Fu & Li's D* and Fu's Fs values did not support significant population expansion. The heterogeneous haplotype composition and genetic structure may thus be attributed to range fragmentation. According to our divergence time analysis, the divergence within the species most likely falls into the mid-to-late Pleistocene, about 0.97–0.34 Ma. This coincides with the most recent violent uplifts of much of the Qinghai-Tibet Plateau (ca. 1.1–0.6 Ma; Li et al., 1979), which initiated widespread mountain glaciations (Li et al., 1991; Zheng et al., 1998; Owen, 2009) that likely continued until 0.17 Ma (Zheng et al., 1998; Zhang et al., 2000; Shi, 2002). Molecular clock analysis of chlorotype variation agreed with the phylogenetic analyses in indicating subdivision of the species during the mid-to-late Pleistocene rather than during the LGM. Thus, we presume that the subdivision into four regions resulted from past allopatric fragmentation.
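To make the neutrality-test reasoning concrete, the sketch below computes Tajima's D from the number of sequences n, the number of segregating sites S, and the mean number of pairwise differences π, using the standard formula of Tajima (1989). The input values are hypothetical and not taken from this study, and the function name is an assumption. A positive D (π exceeding S/a1) reflects an excess of intermediate-frequency variants, which is not the signature expected after a recent population expansion.

```python
import math


def tajimas_d(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S and pairwise diversity pi."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two diversity estimators, scaled by its standard deviation.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Hypothetical inputs (illustration only, not from the study).
d = tajimas_d(n=20, seg_sites=12, pi=5.0)
print(d > 0)  # True: pi exceeds S/a1, so D is positive
```

Under population expansion, many low-frequency variants inflate S relative to π, which drives D negative; the positive values reported above point the other way.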

The allopatric fragmentation may be a consequence of isolation that could be either geographic or environmental. However, there was no obvious geographic barrier between the four groups. Therefore, R. simsii does not appear to be geographically isolated, allowing ecological niche modeling to be used in the assessment of species status. Ecological niche models predicted that the habitat of R. simsii was compressed during the LGM and that multiple isolated glacial refugia (red area, Fig. 5) were maintained across the range of R. simsii during the last glacial maximum. The ecological niche modeling thus captured the response of the distribution of R. simsii to cold and warm periods, although we simulated only the LGM and current environments (Fig. 5). Taken together, a scenario of repeated glacial compression followed by local interglacial expansion is the most likely for R. simsii during the Quaternary climatic changes.

Although ecological niche models predicted that most of the habitat of R. simsii was compressed during the LGM, the West group is an exception. The Sichuan Basin, which is adjacent to the West group, was warm and humid and was thus able to provide a much larger refugium for the survival of more plant species during climatic oscillations (e.g., Yan et al., 2005; Gao et al., 2007; Chen et al., 2008). The discordance between the clusters revealed by cpDNA and AFLP data indicates that localized postglacial range expansions may facilitate extensive gene flow between the major glacial refugia. The structure of no admixture detected by STRUCTURE analysis based on AFLP (see Fig. 4) supports this suggestion. As could be expected, the potential contact zone during the LGM was found to exhibit an admixture of haplotypes (P3, P7, P15, P16, and P22) and high genetic diversity (P2, P6, P17, P18 and P24), as revealed by the cpDNA and AFLP data.

In summary, our data suggest that multiple glacial refugia were maintained across the range of Rhododendron simsii during the last glacial maximum, contrary to the previous hypothesis that subtropical broadleaved evergreen forests were forced to retreat southward as far as 25°N.


The authors thank Lin-Feng LI, Qi-Bao SUN, Yan LU, Xin-Xin TAN and Guang ZHOU for their help during the field trips for collecting Rhododendron simsii samples. This work was supported by the National Natural Science Foundation of China (30770168, 31100272) and the Research Fund for the Doctoral Program of Higher Education of China (20060558090).