In situ glacial survival maintains high genetic diversity of Mussaenda kwangtungensis on continental islands in subtropical China

Abstract Generally, island populations are predicted to have less genetic variation than their mainland relatives. However, a growing number of studies have nevertheless reported exceptions, indicating that the relationships were impacted by several factors, for example, historical processes. In the present study, we chose a group of subtropical islands located in South China as the study system, which are quite younger and much closer to the mainland than most of the previous studied island systems, to test the hypothesis that in situ glacial survival contributes to high levels of genetic diversity in island populations. We conducted a comparison of genetic variation between 12 island and 11 nearby mainland populations of Mussaenda kwangtungensis using eleven nuclear microsatellite and three chloroplast markers, evaluated effects of the island area and distance to mainland on genetic diversity of island populations, and simulated the potential distribution over the past by ecological niche modeling, together with the genetic data to detect the role of islands during the glacial periods. The island populations displayed comparable levels of genetic diversity and differentiation with mainland populations, overall high levels of unique polymorphisms, and the greatest values of specific within‐population genetic diversity. No significant correlation was detected between genetic diversity of island populations and distance to mainland, as well as area of islands, except that allelic richness was significantly positively correlated with the area of islands. Nuclear microsatellites revealed two main clusters, largely corresponding to islands and inland populations, which divergence dated to a time of island formation by ABC analysis. Ecological niche modeling predicted a highly climatic suitability on islands during the last glacial maximum (LGM). Our results suggest that the islands have acted as refugia during the LGM and highlight the role of in situ glacial survival in maintaining high levels of genetic diversity of M. kwangtungensis in continental islands of subtropical China.


| INTRODUC TI ON
Islands are often regarded as natural laboratories and play key roles in the study of ecology and evolution (Adsersen, 1995;Crawford & Stuessy, 1997;MacArthur & Wilson, 1967;Warren et al., 2015).
Indeed, since Darwin's time, island ecosystems have received a great deal of attention from ecologists and evolutionary biologists. Islands possess a series of unique characteristics; their small size, distinct boundaries, and simplified biotas mean that it is easier to observe and interpret patterns of evolution and population genetic structure in these settings (Losos & Richlefs, 2009). The natural geographical barriers of oceans also restrict gene flow between islands and the mainland or among islands; this provides opportunities to infer the genetic consequences of fragmentation or isolation as well as population differentiation (Liu, Zhang, Chen, Compton, & Chen, 2013;Wang et al., 2014).
We know that isolated populations of plant species are susceptible to environmental and demographic stochasticity (MacArthur & Wilson, 1967). As a result of this lack of connectivity, isolated island populations suffer from a multitude of demographic and genetic consequences, including a loss of genetic diversity (Frankham, 1997), increased inbreeding and likelihood of extinction (Lande, 1994). The loss of genetic diversity on islands can be primarily attributed to a small number of founders for population establishment at small sizes (Maki, 2001). Thus, assuming that species colonize to islands from the mainland under a stepping-stone model, less isolated, large, nearshore islands may serve as intermediate stepping stones for the colonization of more distant islands. Intense isolation effects can also be expected on distant islands; those that are situated close to a mainland source therefore have a greater potential for repeated colonization (Wallace, Wheeler, McGlaughlin, Bresowar, & Helenurm, 2017). This means that these areas have a chance of greater diversity (Kimura & Weiss, 1964). In general, genetic variation decreases negatively with effective population size (Wright, 1978); genetic diversity can therefore be predicted to fall much more rapidly on smaller islands because of their vulnerability to demographic stochasticity and drift (Wright, 1978). Limited gene flow impeded by geographical isolation in many cases can also lead to lower levels of island genetic variability (Frankham, 1997). Island populations also generally contain less genetic variation than their continental relatives. Although this hypothesis has been confirmed in several island systems (García-Verdugo et al., 2009;Matsumura, Yokoyama, Fukuda, & Maki, 2009), a growing number of studies have nevertheless reported exceptions. In one example, Chen, Shi, Ai, Gu, and Chen (2008) demonstrated similar genetic variation in island and mainland populations of Ficus pumila due to high gene flow. A similar pattern was also revealed in Cryptomeria japonica, attributed to islands acting as refugia for this species during the last glacial maximum (LGM) (Tsumura & Ohba, 1993). Additionally, Fernández-Mazuecos and Vargas (2011) detected more diversity in islands than in mainland populations of Cistus monspeliensis, something which can be explained by genetic bottlenecks on the continent during glaciations in the Quaternary. These case-specific studies indicate that a lower level of genetic variation on islands is not an absolute rule and that life-history traits, island age, isolation, and historical demography may also be significant predictors of this variable.
Historical processes, especially drastic Quaternary climatic fluctuations, have resulted in multiple contraction-expansion processes. These have long been considered to have exerted profound effects on the geographical distribution and current genetic structure of extant species (Hewitt, 2004;Hickerson et al., 2010;Qiu, Fu, & Comes, 2011;Shi, Michalski, Welk, Chen, & Durka, 2014).
Islands are often characterized by topographical and environmental heterogeneities and tend to have more stable climates relative to mainland areas (Weigelt, Jetz, & Kreft, 2013). This means that these regions have great potential to serve as efficient refugia for plant species during glacial periods. Genetic information also provides novel insights on past vegetation history and gives us the chance to address important paleoecological questions (Petit et al., 2003).
Accumulating genetic evidence has implied a role for refugia during glacial periods in different island systems, including the Canary Islands (Condamine, Leslie, & Antonelli, 2017;Fernández-Mazuecos & Vargas, 2011;García-Verdugo et al., 2013;Hutsemékers et al., 2011), Ryukyu Archipelago (Nakamura et al., 2014), and on Australian islands (Gibson et al., 2017). Long-term studies based on these oceanic islands have greatly advanced our understanding of complex biogeography. Understanding, however, remains limited with regard to younger, continental shelf islands as these have remained isolated since the submergence of Pleistocene land bridges and have therefore long been thought to possess relic biotas of limited evolutionary importance. Phylogeographic studies on continental islands seek explanations regarding the demographic history of populations (Nakamura et al., 2010) and therefore provide powerful tools to interpret current genetic structures versus the mainland.
China has more than 10,000 islands, including mostly continental examples (>90%) with climates ranging from tropical to cold temperate (Committee, 2013;Wang et al., 2014). Subtropical China (ca. 34°N to 22°N), characterized by evergreen broad-leaved forests (Wu, 1980), has never been covered by large ice sheets during the LGM. Climate here was cooler by 4-6°C at this time and markedly drier by 400-600 mm/year (Qiu et al., 2011). Sea level fluctuations in the Pleistocene resulted in recurring land bridges connecting the southern China mainland and continental islands, which may have provided suitable habitats for species to migrate and survive (Jiang, Gardner, Meng, Deng, & Xu, 2019). Simulated vegetation maps have indicated that land bridges in South China were occupied by tropical forests and nonforest when climate cooled during the LGM (Harrison, Yu, Takahara, & Prentice, 2001). To date, however, limited phylogeographic evidence reveals the evolution of species with disjunct distributions on islands and the mainland because of less emphasis and inadequate sampling, even though such studies in subtropical China have increased rapidly in recent years (Chen et al., 2020;Gong et al., 2016;Qiu et al., 2011;Shi et al., 2014).
Guangdong Province is located on the southernmost of Mainland China and boasts a 2,414 km coastline that includes the transition from the southern subtropical to tropical zone. This area encompasses ca. 750 continental islands (>500 m 2 ) with an area of about 1,592 km 2 (Li & Zhou, 2010). The bulk of these islands are close to the mainland (<30 km), especially those located off the Pearl River Mouth. The depth of the sea among islands ranges between 20 m and 30 m (Chen, Tong, & Mao, 1993). Indeed, during the LGM (13,700 years ago ± 600 years), sea level dropped to its lowest position, 131 m lower than current, forming a large area of land bridges. Sea level then rose rapidly in the Holocene, from −131 m to −40 m, reaching a postglacial maximum of between 3 m and 5 m higher than today around 6,000-7,000 years ago and fell to its present position until 5,000 years ago (Chen et al., 1993;Chen, Bao, Chen, & Zhao, 1990). Quaternary sporopollen assemblages show that this region was mainly covered by mixed northern subtropical evergreen and deciduous broad-leaved forests (Chen et al., 1993). These indicate that islands off the Pearl River Mouth served as glacial refugia during the LGM and there are chances of in situ glacial survival within extant island populations.
The climbing shrub Mussaenda kwangtungensis H. L. Li (Rubiaceae) is endemic to Guangdong Province and has a very narrow distribution.
This plant occurs only in the southern part of Guangdong Province, commonly on islands and the adjacent mainland. This species is placed within a very young genus which derived in the Miocene (7.82-14.44 Ma) (Duan et al., 2018) but that which then underwent rapid diversification from 4 Ma onwards. The clade containing M. kwangtungensis also has a very short evolutionary history (<1 Ma). We report here a comparison of genetic diversity between 12 island populations and 11 adjacent mainland populations of M. kwangtungensis carried out using 11 nuclear microsatellite and three chloroplast markers. Data enabled us to detect the impacts of isolation and demographic history on the genetic structure of M. kwangtungensis. Thus, by combining approaches from molecular phylogeography and ecological niche modeling, we tested the hypothesis that in situ glacial survival contributes to high levels of genetic diversity in island populations. We addressed the following ques- leaves with 3-9 cm × 1-3 cm in size and lanceolate-elliptic to elliptic-oblong in shape. Its inflorescences are compact cymoseto-subcapitate, and each possesses white, blade oblong-ovate, elliptic, or elliptic-ovate petaloid calycophylls with 3.5-5 cm long and 1.5-2.5 cm wide (Figure 1). The tubular flowers are yellow and 25-35 mm in length. This species flowers between May and September, with a peak between June and July; berry fruits mature between May and November (Luo et al., 2015). M. kwangtungensis is a typical functional dioecious species (Duan et al., 2018;Li, Wu, Zhang, & Barrett, 2010) which is pollinated by butterflies, moths, and bees (Luo et al., 2015). This species is endemic to  (Table 1), including 12 island populations from 12 islands, alongside eight inland and three peninsula populations. The area of selected islands varies from 1.27 to 157 km 2 (Committee, 2013), and the nearest distance between an island and the mainland fell within a range of 3.8 to 77.7 km (Table S1 in Appendix S1). In each population, between eight and 33 individuals (mean = 27) were sampled, a total of 617 individuals. Each sample was separated from others by at least 30 m in order to reflect the real pattern of the population as much as possible. About five healthy leaves were collected from each individual and dried using silica gel. The voucher specimen for each population was deposited in South China Botanical Garden, Chinese Academy of Sciences (Table S2: Appendix S1).

| Genetic diversity and differentiation
To address the effectiveness of the microsatellite loci, we performed three analyses. First, linkage disequilibrium among loci per population was tested using FSTAT 2.9.3 (Goudet, 1995). Second, population-specific tests of deviation from Hardy-Weinberg equilibrium were conducted with GenAlEx 6.5 (Peakall & Smouse, 2006).
Genetic diversity at both species and population levels was characterized by calculating the number of alleles (N A ), allelic richness (A R , correcting for sample size by rarefaction), and observed (H O ) and unbiased expected heterozygosity (H E ). Analyses were performed using FSTAT. Further, in order to investigate differences in genetic diversity between mainland (inland + peninsula) and island populations, we compared A R , H O, and H E between the two groups by permuting populations 1,000 times using FSTAT. In order to check the correlation of genetic diversity with island characteristics, we conducted linear regressions between genetic diversity of island populations and island characteristics (area and the nearest distance to mainland, both of which were transformed by natural logarithm to fit a normal distribution) using R 3.3.1 (R Development Core Team, 2012).
Overall and pairwise population differentiation was assessed by F ST (θ) using FSTAT. Standard error was estimated by Jackknifing for overall F ST , and 1,000 bootstraps were used to estimate the 95% confidence interval. As F ST is likely to underestimate genetic differentiation between populations for markers which show high levels of allelic variability, we also calculated F' ST , a standardized parameter of genetic differentiation as F' ST = F ST /F STmax (Hedrick, 2005). F I G U R E 2 (a) Genetic clusters of 23 Mussaenda kwangtungensis populations assigned by STRUCTURE. Each individual is represented by a vertical bar, partitioned into K = 2 and K = 3 clusters. Map showing the distribution and genetic structure of populations for the most likely K = 2 (b) and K = 3 (c). Colors of dashed ellipses correspond to the three groups classified according to where populations are located and the STRUCTURE clustering results when K = 3. Number in ellipses marks the corresponding gene pool used for ABC analysis F STmax was calculated after recoding the data using RECODEDATA (Meirmans, 2006). To test whether genetic differentiation was higher among island populations than that among mainland populations, differences among mean pairwise F ST values were evaluated using a randomization procedure with 1,000 permutations in FSTAT. To determine the level of genetic differentiation between the mainland and island populations, an analysis of molecular variation (AMOVA) was carried out using ARLEQUIN 3.5.2.2 (Excoffier & Lischer, 2010).
We also assessed patterns of isolation by distance using Mantel tests, based on the regression of pairwise estimates of genetic distances (F' ST /(1 − F' ST )) against corresponding logarithmic (log 10 ) geographical distances for all populations and two groups. These analyses were performed with the R package "ecodist" (Goslee & Urban, 2007) with 10,000 permutations to determine statistical significance.

| Genetic clustering
Population clusters and ancestry were detected by Bayesian cluster- which makes use of Markov chain Monte Carlo (MCMC) simulations to infer the proportion of ancestry of genotypes from K distinct clusters.
An admixture model was run with correlated allele frequencies. Each run was pursued for 500,000 MCMC interactions, with an initial burnin of 100,000. To estimate the K number of ancestral genetic populations and the ancestry membership coefficients of each individual in these clusters, the algorithm was run 10 times for each user-defined K value from 1 to 10. The log-likelihood value Ln P(K) and delta K (ΔK) were calculated to determine the most likely number of clusters, as suggested by Evanno, Regnaut, and Goudet (2005), using a webbased program STRUCTURE HARVESTER (Earl & vonHoldt, 2012).

| Demographic analysis
Demographic history of populations that had experienced a recent (past 2N e -4N e generations) reduction in effective population size was detected using BOTTLENECK based on microsatellite data (Piry, Luikart, & Cornuet, 1999), which assumes that alleles are generally reduced faster than heterozygosity; thus, populations experiencing recent bottlenecks will display an excess of heterozygosity relative to that expected from the equilibrium between mutation and drift (Piry et al., 1999). Two models were selected, a stepwise mutation model (SMM) and a two-phase model (TPM), believed to be appropriate for microsatellite evolution (Luikart & England, 1999). The latter, TPM, was set with 70% SMM characters, while significance was estimated by Wilcoxon sign-rank test based on 1,000 replications. We also tested the presence of mode-shifts from the normal L-shaped distribution of allelic frequencies, typical of an equilibrium situation in stable populations (Luikart, Sherwin, Steele, & Allendorf, 1998 (Cornuet et al., 2014). Since STRUCTURE identified K = 2 as the most appropriate number of clusters, followed by a second peak at K = 3, we defined three gene pools on the basis of these results. Thus, each population, except for the DA population, was assigned to one of the three gene pools corresponding to its largest proportion of ancestry  (Table 2, Fig. S1 in Appendix S2), we tested the following alternative hypotheses for M. kwangtungensis glacial refugia: the presence of just one refugium, located either in inland (scenario 1) or in island (scenario 2); the presence of two refugia located in inland and island, respectively (scenario 3-6); and the presence of three refugia, one located in inland and two in islands (scenario 7). Prior values used for all parameters are listed in Table S3 (Appendix S2).
Summary statistics were calculated for each scenario by selecting the scenario and estimating the posteriors of demographic parameters.
Seven million simulations were run in total, of which the 1% closest to the observed data were used to estimate the relative posterior probabilities of each scenario with a logistic regression approach. The relative likelihoods of the seven scenarios were compared by logistic regression on 1% of the closest simulated data points.

| Contemporary and historical gene flow
To evaluate the migration rates over contemporary and historical timescales, we used the programs BAYESASS (Wilson & Rannala, 2003) and MIGRATE (Beerli, 2008), respectively. Both analyses were conducted at regional scale based on the two clusters identified by STRUCTURE. In BAYESASS analysis, each run performed for 1 × 10 7 iterations with the chain sampled every 2,000 iterations. A burn-in of 10 6 was used, and delta values were adjusted to ensure that 40%-60% of the total changes were accepted. As recommended in a recent evaluation of BAYESASS (Faubet, Waples, & Gaggiotti, 2007), 10 multiple runs were performed using different initial seeds. We presented the results for the run displaying the best model fit as indicated by the Bayesian deviance measure (Spiegelhalter, 2002).
In terms of assessing historical migration rates, MIGRATE was run with subsets of the data (consisting of equal sizes of ~50) using default settings (10 short chains of 10 3 sampled, 500 recorded, and three long chains of 10 4 sampled, 5,000 recorded). Four-chain heating at temperatures of 1.0, 1.5, 3.0 and 100,000 was implemented to increase the efficiency of the MCMC. We repeated the runs three times until posterior probabilities stabilized using prior estimates of where μ is the mean mutation rate of microsatellites. The μ was set to 10 -4 -10 -3 , the same as that used in ABC analysis.

| Diversity and differentiation
Haplotypes were identified using DNASP 5.10 (Librado & Rozas, 2009). The phylogenetic relationships among haplotypes were visualized as a statistical parsimony network computed with TCS (Clement, Posada, & Crandall, 2000). We calculated haplotype diversity (h) and nucleotide diversity (π) per population using ARLEQUIN. The significance of differences in h and π between mainland and island populations was investigated by the t test at the 5% level (R Development Core Team, 2012). Correlations between diversity indices (no. of haplotypes, h and π for island populations) and island characteristics (natural logarithm of island area and the nearest distance to mainland) were tested using linear regression analyses performed with R 3.3.1. To access the genetic differentiation, we computed F ST values for all populations and both groups (mainland and island) identified by cpDNA sequencing using ARLEQUIN.
An analysis of molecular variation (AMOVA) was also carried out to determine the level of genetic differentiation between mainland and island populations in ARLEQUIN. We also calculated pairwise F ST values and tested for isolation by distance by evaluating the significance of the correlation between genetic differentiation and log geographical distances.

| Demographic history
We used values of Tajima's D and Fu's Fs to infer potential population growth and expansion using ARLEQUIN. It is clear that significantly negative D and Fs values often indicate recent population expansion following a sever reduction in population size (Fu, 1997). Mismatch distribution analysis was then also conducted under the model of demographic expansion. The goodness of fit was tested with the sum of squared deviation (SDD) between observed and expected mismatch distribution, and Harpending's raggedness index (H Rag ) using 1,000 parametric bootstrap replicates.
The expansion time (T) for expanding groups was calculated based on the formula T = τ/2μkg (Rogers & Harpending, 1992), where μ is the substitution rate in substitution/site/year (s/s/y), k is the average sequence length used for analysis (3,530 bp in this study), and g is the generation time in years. A value for μ was set as 1.52 × 10 -9 s/s/y, as proposed by Yamane, Yano, and Kawahara (2006) for noncoding chloroplast regions, with a confidence interval of 1.0-3.0 × 10 -9 s/s/y (Wolf, Li, & Sharp, 1987). The value of g was approximated as 10 years in shrubs. We also used the Bayesian skyline plots (BSPs) method as implemented in BEAST v.2.6.2 (Bouckaert et al., 2014) to estimate fluctuations in effective population size using the above substitution rate. MCMC chains were run for 2 × 10 7 under the GTR model selected by ModelFinder (Kalyaanamoorthy, Minh, Wong, von Haeseler, & Jermiin, 2017 Ancestral effective population size varied at t3. Gene pools 1 and 2 diverged at t2, and then gene pool 3 arose via divergence from 2 at t1 One in inland and One in island.

Scenario 4
Ancestral effective population size varied at t3. Gene pools 1 and 3 diverged at t2, and then gene pool 2 arose via divergence from 3 at t1 One in inland and One in island.
Scenario 5 Ancestral effective population size varied at t3. Gene pools 1 and 2 diverged at t2, and then gene pool 3 arose via an admixture event of parental demes 1 and 2 at t1 One in inland and One in island.

Scenario 6
Ancestral effective population size varied at t3. Gene pools 1 and 3 diverged at t2, and then gene pool 2 arose via an admixture event of parental demes 1 and 3 at t1 One in inland and One in island.

Three refugia
Scenario 7 Ancestral effective population size varied at t2, and gene pools 1, 2, and 3 diverged simultaneously at t1 One in inland, and two in islands TA B L E 2 Description of the seven scenarios in ABC analysis for divergence and admixture history of Mussaenda kwangtungensis

| Ecological niche modeling
To identify the potential species distribution during the LGM, we calibrated climatic envelope models using georeferenced native occurrence records for the species and the Maxent algorithm. Current presence records of M. kwangtungensis were collected from the herbarium records and our sampled populations (Table S2)

| Nuclear genetic diversity and differentiation
Comparing with an adjusted P value, no linkage disequilibrium was found in any population among loci ( Figure 4).

| Nuclear genetic structure
Population structure and ancestry were inferred in STRUCTURE.
The estimated logarithm of probability of the data, Ln P(K), increased slightly from K = 2, and the ΔK statistic was greatest for K = 2, with a second peak at K = 3 ( Figure S2). At K = 2, all populations were divided into two clusters, geographically corresponding to inland and island plus peninsular regions, respectively. In terms of K = 3, inland cluster remained consistent, while DX, DG, XC, and SC were further clearly separated from island cluster, and WS, XW, and BL showed a large amount of mixture among gene pools (Figure 2).

| Bottleneck and ABC analysis
Bottleneck analysis did not indicate excess of heterozygosity associated with recent bottlenecks in any mainland and island population, either under the SMM or TPM models (Table S5: Appendix S2).
However, a shift from the normal L-shaped distribution of allelic frequencies was found in the BL case.
A comparison of posterior probabilities of the seven scenarios using a logistic approach on 70,000 data points indicated that   (Table 4).

| Contemporary and historical gene flow
Based on BAYESASS analysis, the contemporary migration rate (

| CpDNA structure
By aligning, M. kwangtungensis sequences yielded lengths of 920 bp, 1,224 bp, and 1,386 bp for rpl32-trnL, psbE-petL, and ndhf-rpl32, respectively. A total of 28 variable sites, comprising 26 point mutations and two indels, were detected, defining 21 haplotypes (Table S6: Appendix S2). The geographical distribution of haplotypes is shown in Figure 5. The most widely distributed haplotype (H1) was found in 12 populations (52.2%) and 36 individuals, followed by H7 occurring in seven populations and 20 individuals. Six haplotypes (H1-3, H7, H8, and H12; 28.6%) were shared among mainland and island populations, while six (H4-6 and H9-11, 28.6%) and nine (H13-21, 42.9%) were privately restricted to mainland and islands, respectively. The parsimony network revealed closely related haplotypes (Figure 5b). The shared ones (H1, H7, and H3) were the most interior nodes and could be considered as ancestral. All the unique haplotypes in mainland and islands differed from them by one to three mutation steps, except for H11 by six steps from H3.

| Demographic history
At the species level, the neutrality test showed that both Tajima's D   (Table 5), and a slightly ascending curve followed by a fall for BSP (Fig. S4). The expansion event was dated at 17,000 (8,640 -25,900) years BP. No signature of demographic expansion was detected in the case of the inland cluster. Both D (1.032, p = .810) and Fs (1.723, p = .837) were positive. The mismatch distribution was bimodal with significant SSD (Table 5). BSP showed a very slightly ascending trend ( Figure S4).

| Ecological niche modeling
The prediction of the bioclimatically suitable areas for M. kwangtungensis during the LIG, the LGM, and current is shown in Figure 6. from the LIG (18,433 km 2 ) was a little larger than that during the LGM based on the CCSM model, but far smaller than that from LGM based on MIROC model. Data show that when climatic suitability was greater than 0.8, CCSM model revealed greater area Note: Goodness of fit of observed to theoretical mismatch distributions under a demographic expansion model is tested with the sum of squared deviations (SSD) and raggedness index (H Rag ). Confidence interval of expansion time T (in years BP) is shown in parentheses. NC, not calculated. in inland than on islands, while in MIROC model, more suitable habitats were located on islands.

| Similar levels of genetic diversity between mainland and island populations
Plastid and nuclear markers reveal high levels of genetic diversity in this species. Indeed, haplotype diversity based on cpDNA (h T = 0.855) was higher than the mean value of angiosperms (h T = 0.68) , while the expected heterozygosity computed on microsatellites (H E = 0.705) was higher than values reported for narrow-distributed/endemic species (H E = 0.56/0.42) (Nybom, 2004). Life-history traits, especially mating systems, may determine genetic structure among populations (Duminil et al., 2007). Species with mixed or outcrossing mating systems was demonstrated to have significantly higher genetic diversity than selfing species (Nybom, 2004). It is clear that M. kwangtungensis is a typical functional dioecious species with an obligate outcrossing mating system , thus considered capable of harboring high genetic diversity (Hamrick & Godt, 1996). This has been evidenced by many previous studies (Shi & Chen, 2012;Shi et al., 2014).
Island populations show comparable genetic variation with mainland populations (Table 3), inconsistent with the general expectation that these tend to have lower genetic variation due to founder effects and geographical isolation (Frankham, 1997). We discuss possible explanations for this as similar, or higher, levels of genetic variation on islands have been variably ascribed to a number of variables including high gene flow. In this first case, we know that M. kwangtungensis is pollinated predominantly by butterflies (Papilionidae: Papilio paris, Figure 1) (Luo et al., 2015), in which long-distance dispersals have been frequently observed (up to 13 km) (Baguette, 2003;Li, Zhang, Settele, Franzén, & Schweiger, 2013). Individuals of P. paris have been estimated to move up to 6 km (personal communication with Dr. X. Li), a distance that ensures pollen can be transferred between most islands and adjacent populations (Table S1). In addition, the color of berry fruits turns to purple black when they mature and attracts foraging birds (personal observation), which may provide opportunities for the long-distance dispersal of seeds. At the same time, however, the high levels of private haplotypes on islands ( Figure 5) also seem to indicate a lower level of occurrence while similarity in contemporary and historical gene flow indicate that isolation does not exert an influence on genetic structure (Chiucchi & Gibbs, 2010). Considering a relatively continuous distribution on the mainland, sufficient dispersal across the ocean barrier could be expected. Secondly, genetic bottlenecks on the mainland might provide another explanation as these have occurred in the recent past due to human disturbance.
The construction of buildings and roads, for example, has been reported in mainland populations, including for Ficus pumila (Chen et al., 2008). Bottleneck analysis here, however, did not show any evidence for a population bottleneck on the mainland, as well as on islands excepting for BL (Table S5), indicating a relatively stable effective population size, as revealed by neutrality tests and mismatch analysis based on cpDNA. Thirdly, multiple mainland-island colonization may have occurred given that species originated on the mainland and subsequently colonized islands. This would have led to a decrease in genetic diversity with increased distance to the mainland due to the founder effect under a stepping-stone colonization process (Yamada & Maki, 2012). However, our results did not indicate any correlation between any parameter of genetic diversity and distance to mainland. The haplotype network also did not reveal a clear divergence between mainland and islands; ancient haplotypes (H1, H3 and H7) were distributed in both regions ( Figure 5) and this does not support mainland-island colonization. Fourthly, consideration of islands as refugia during the LGM (Chen et al., 2008;Fernández-Mazuecos & Vargas, 2011;Tsumura & Ohba, 1993) can be based on relict populations frequently reported to preserve high genetic variations (García-Verdugo et al., 2013;Petit et al., 2003). In one example, Cryptomeria japonica maintained a relatively large genetic variation on Yaku Island compared with that of mainland Japan mainly because this island was refugium during the LGM and had a suitable climate (Tsumura & Ohba, 1993). In the case of the polyploid species Prunus lusitanica, island populations displayed more private alleles and higher genotypic diversity in old volcanic areas, suggesting the idea of oceanic islands as important refugia for biodiversity (García-Verdugo et al., 2013). Our data also supported the hypothesis of islands as refugia in a very young, continental island system (detailed discussion seen below).

| No effect of isolation on genetic diversity and differentiation
As proposed by the equilibrium theory of island biogeography (Losos, Ricklefs, & MacArthur, 2009;MacArthur & Wilson, 1967), larger islands can be expected to hold higher cumulative levels of diversity because of larger numbers and sizes of populations. This means that levels of genetic diversity on islands should correlate with island area (McGlaughlin et al., 2014). However, we found that island area was indicating that the correlation between genetic diversity and island area is not a common pattern. One possible explanation is that although a larger island is capable of sustaining more populations of a species, area should not be taken as an accurate surrogate for effective population size as this will be more significant determinant of levels of genetic diversity than number of populations (García-Verdugo et al., 2013;McGlaughlin et al., 2014). The prediction of a positive relationship between genetic diversity and island area assumes that population densities for a species are kept the same across different islands, something that is actually unlikely to be the case. Indeed, for M. kwangtungensis, XX island with the smallest area was observed to have an obviously higher population density, compared to other larger islands (personal observation).
A negative relationship between distance to mainland and genetic diversity is predicted, mainly due to founder effects and limited gene flow. Although this was also seen in two previous studies on Izu Islands of Japan (Inoue & Kawahara, 1990;Yamada & Maki, 2012), we failed to establish a significant relationship in the case of M. kwangtungensis. The most likely explanation for this result is that island systems had different histories of formation and species colonization. In the case of the Izu Islands, these are distributed as a roughly linear chain extending from north-to-south near Mainland Japan and are thought to be formed southwards during the late Pleistocene through submarine volcanism (Kaneoka, Ishiki, & Zashu, 1970;Yamada & Maki, 2012), probably leading to a southward stepping stone-like colonization from the mainland for Weigela coraeensis (Yamada & Maki, 2012) and Campanula punctate (Inoue & Kawahara, 1990). The islands studied here, in contrast, are distributed along the coast of the South China Mainland, which were connected during the LGM and were formed by rising sea levels around 7,000 years ago (Chen et al., 1993). Indeed, M. kwangtungensis is thought to have survived long term in these islands based on genetic and ecological niche modeling ( Figure 6, see discussion below), rather than as a colonization/recolonization from a mainland source after the ice age.
Restricted land masses of islands with the geographical barrier of the sea are inclined to promote genetic differentiation among islands, and between the mainland and island populations (Frankham, 1997).
Our results revealed statistically nonsignificant differences between genetic differentiations of mainland and islands at microsatellites ( Figure 4, Table 3) as well as low levels of genetic differentiation between them, as indicated by the results of AMOVA for both genetic markers (Table S4). These results may be attributable to potentially high rates of gene flow (see discussion above) as well as the geographical proximity of these areas. The islands assessed here are closely distributed along the coastal line with distances to mainland ranging from 3.8 to 77.7 km and with nearest neighbors from 0.7 to 19.6 km. Thus, SM, DJ, and XX islands are within a range <6 km from the nearest peninsula (Table S1); this close proximity is likely to give rise to a high rate of gene flow, as suggested by patterns of isolation by distance (Figure 4), and the STRUCTURE analyses based on microsatellite data, where these three islands with peninsular populations (QN, YM, and DX) fell into the same cluster ( Figure 2).

| In situ glacial survival on islands
Consistent with simulated vegetation maps (Harrison et al., 2001) and sporopollen evidence (Chen et al., 1993) that a land bridge in South China provided refugia for plant species during the LGM, our findings also suggest in situ glacial survival of M. kwangtungensis on islands. In the first place, ENM indicates a large area with highly climatic suitability was present in regions of this land bridge during the LGM (Figure 6). Although CCSM predicted relatively limited region compared with MIROC, both verified highly suitable habitats (>0.8) on islands, implying that the current populations are partly representative of the past. Secondly, molecular evidence revealed high genetic diversity and, especially, numbers of unique haplotypes within island populations ( Figure 5, Table 1). In one example, BL showed the highest level of expected heterozygosity, with a minimum sample size of eight, while XW possessed the most haplotypes and, XC and WS had the maximum haplotype diversity (Table 1).
Thirdly, STRUCTURE analysis based on microsatellite data revealed two clearly genetic clusters largely separating island populations from their mainland counterparts (Figure 2 It is the case that in situ glacial refugia have been frequently reported for temperate (Parisod & Besnard, 2007;Zeng et al., 2015) and evergreen broad-leaved forests (Chen et al., 2020;Gong et al., 2016;Wang et al., 2015), and mountainous regions have usually been regarded as critical glacial refugia because their high topographical heterogeneity offers scope for shifts in altitudinal range in response to climate changes (Shi et al., 2014). Islands, due to their stable climate and environmental heterogeneities, also have the potentials for in situ glacial survival of species, as evidenced by accumulating studies (García-Verdugo et al., 2013Hutsemékers et al., 2011). In South China, land bridge, connecting between continental islands and mainland during glacial periods, may provide suitable habitats with a minor shift of temperature and humidity Weigelt et al., 2013), and have supported effective population sizes for in situ persistence, as indicated in Hainan Island (Chang et al., 2012) and Taiwan Island (Cheng, Hwang, & Lin, 2005).

| Demographic history
Typical responses of plants to climate changes include geographical migration, resulting in the alteration of distribution range (Etterson & Shaw, 2001). The current distribution of M. kwangtungensis is mainly restricted within a narrow belt in southern Guangdong Province. Based on ENM reconstruction, the possible occurrence of populations during the LGM mainly covered a greater area extending current distribution to southern land bridges ( Figure 6), indicating much more suitable habitats for survival. Thereby, the growth of population size is considered to be expected. Our genetic evidence supported this idea: significantly negative D and Fs, unimodality of mismatch distribution (Table 5), and a slightly ascending curve for BSP ( Figure S4) for all populations and island cluster were evident for population expansion (10,700-32,200 years BP), while inland populations kept a relatively stable population size. It is plausible that when climate cooling, land bridge connected islands to mainland, facilitating island populations expanding in space and population size, which may appear to be an unusual scenario, different from the general pattern of demographic contraction in glacial periods, but population growth in response to glaciation-induced increase in lowland area has recently been suggested for wide variety of species Nakamura et al., 2014).
ABC analysis indicated the highest support for the hypothesis that there was one refugium in islands and inland populations were derived from island at around 3,930 years ago, when considering a generation time of 10 years for M. kwangtungensis (scenario 2) (Table 2, Figure S3). This scenario should be treated with caution given the overlapping confidence intervals with scenario 1 (one refugia in inland and island derived from inland). Since hypothesis of more than two refugia was rejected ( Table 2,

| CON CLUS IONS
The molecular data presented here for M. kwangtungensis, an endemic species to South China with a narrow distribution, revealed a similar level of genetic diversity in islands compared with the mainland, contradictory with the general expectation that insular populations often harbor lower genetic variation due to founder effects and geographical isolation (Frankham, 1997). The genetic diversity pattern of this species may be explained by close proximity and a potentially high rate of gene flow among populations as well as by historical processes. It is possible that current islands acted as refugia during the LGM and that extant island populations experienced in situ glacial survival, confirmed by differentiated lineages and high levels of polymorphism, together with the strong support of a suitable climate. Our results highlight a key role of in situ glacial survival in maintaining high levels of genetic diversity in continental islands across southern subtropical China. This result means that these islands provide a model system for future studies to assess dispersal, divergence, and the ecological adaptation of insular species in a young and adjacent mainland-island system.

ACK N OWLED G M ENTS
We thank Kai Jiang for assistance in data analysis and discussion, and Ziwei Wang and Lianxuan Zhou for assistance in sample

DATA AVA I L A B I L I T Y S TAT E M E N T
Haplotype sequences were deposited in GenBank under the accession numbers MT349425-MT349442, and MT361334-MT361343.