Evolutionary legacy of a forest plantation tree species (Pinus armandii): Implications for widespread afforestation

Abstract Many natural systems are subject to profound and persistent anthropogenic influence. Human‐induced gene movement through afforestation and the selective transportation of genotypes might enhance the potential for intraspecific hybridization, which could lead to outbreeding depression. However, the evolutionary legacy of afforestation on the spatial genetic structure of forest tree species has barely been investigated. To do this properly, the effects of anthropogenic and natural processes must be examined simultaneously. A multidisciplinary approach, integrating phylogeography, population genetics, species distribution modeling, and niche divergence would permit evaluation of potential anthropogenic impacts, such as mass planting near‐native material. Here, these approaches were applied to Pinus armandii, a Chinese endemic coniferous tree species, that has been mass planted across its native range. Population genetic analyses showed that natural populations of P. armandii comprised three lineages that diverged around the late Miocene, during a period of massive uplifts of the Hengduan Mountains, and intensification of Asian Summer Monsoon. Only limited gene flow was detected between lineages, indicating that each largely maintained is genetic integrity. Moreover, most or all planted populations were found to have been sourced within the same region, minimizing disruption of large‐scale spatial genetic structure within P. armandii. This might be because each of the three lineages had a distinct climatic niche, according to ecological niche modeling and niche divergence tests. The current study provides empirical genetic and ecological evidence for the site‐species matching principle in forestry and will be useful to manage restoration efforts by identifying suitable areas and climates for introducing and planting new forests. Our results also highlight the urgent need to evaluate the genetic impacts of large‐scale afforestation in other native tree species.


| INTRODUC TI ON
The rate of deforestation is rapidly increasing all over the world (Hansen et al., 2013) and will result in unprecedented biodiversity losses (Barlow et al., 2016) and consequent reduction of ecosystem functions. As a potential solution, afforestation has been widely adopted by many countries around the world, for climate change mitigation, protection of natural forests, replacement of lost tree cover, facilitation of natural regeneration, and provision of forest products (Bastin et al., 2019;Canadell & Raupach, 2008;Carnevale & Montagnini, 2002). The scale of afforestation has rapidly increased in the past decades, with plantations covering 277.9 million ha, and accounting for 6.95% of global forest area by 2015 (Payn et al., 2015). China has been leading global afforestation efforts, with 78.9 million ha of planted forest area, which is 28% of the global total and by far the most in the world (Payn et al., 2015). This extensive afforestation was supported by several Key Forestry Programs (SFAPRC, 2014). In particular, the Grain for Green Project (GFGP) had re-established 28.20 million ha of forest in 25 of China's 31 mainland provinces by 2013 (SFAPRC, 2014), making it currently the largest revegetation program conducted anywhere (Hua et al., 2016).
The ecological impacts of large-scale afforestation have attracted much attention (Peng et al., 2014). One issue is lack of species diversity in plantings (Hua et al., 2016); for example, in China, most forests planted are monocultures, and just ten species (e.g., Cunninghamia lanceolata, Larix gmelinii, Pinus massoniana, P. tabuleaformis, Cupressus funebris) account for 73% of total plantation area (SFAPRC, 2014). Planting any exotic species entails a small but significant risk that it might become invasive, if they are able to invade habitats and attain higher fitness than native species (Knowler & Barbier, 2005;Schutzenhofer, Valone, & Knight, 2009). This can be avoided by planting within a species' native range, but that creates another concern that has received far less attention: the genetic effects of large-scale plantings on native populations (Laikre, Schwartz, Waples, & Ryman, 2010). If introduced genotypes or alleles have a fitness advantage and/or they outnumber natives, the natural populations may be threatened by genetic swamping (Anttila, King, Ferris, Ayres, & Strong, 2000;Hufford & Mazer, 2003), or if the advantage is large, simple replacement by a more competitive genotype (Bayms, 2008). Furthermore, gene flow into native populations, especially if all plantings come from a common source, could lead to genetic homogenization across large parts of the planted species' natural range (Olden, Poff, Douglas, Douglas, & Fausch, 2004). Nonlocal exotic plantations have significantly affected the genetic composition of the offspring of nearby conspecific populations in several cases (Unger, Heuertz, Vendramin, & Robledo-Arnuncio, 2016), with proportions of introgressed offspring exceeding 40% in some cases (Steinitz, Robledo-Arnuncio, & Nathan, 2012). However, the long-term genetic consequences of large-scale plantings, across the full range of a species, have not yet been properly investigated (Steinitz et al., 2012). The first stage of any such investigation would be to determine whether planted material is sourced from other regions.
If that is the case, it would facilitate between-region gene flow that would not otherwise happen naturally.
Examining the full effect of human-induced intraspecific gene flow requires examination of spatial-temporal genetic variation across a species' range (Avise, 2000) and is best conducted in concert with examination of naturally occurring genetic variation. Historical events such as orogenies and monsoon development, and climatic oscillations such as glacial cycles, affect geographic distribution of genetic variation among populations (Antonelli et al., 2018;Avise, 2000). Sometimes allopatric divergence and speciation may ensue (Avise, 2000;, and speciation events are commonly associated with adaptation to different local or regional environments (Maria et al., 2015). For dominant forest trees, large effective population sizes and long generation times can influence and obscure the genetic and phenotypic signals left during the diversification process Steinitz et al., 2012). Where large areas have been afforested, as noted above for China, this will complicate the signature of past biogeographic events and vice versa; hence, to fully examine any of these processes, all should be studied together (Ortego, Noguerales, Gugger, & Sork, 2015).
Pinus armandii is an evergreen montane coniferous tree species, whose natural range is concentrated in central and southwestern China (Farjon & Filer, 2013;Ma, 1989). Parts of its range have a dynamic geo-climatic history (Clift & Webb, 2019;Wang et al., 2012), leading to a complex biogeographic history, with extreme environmental heterogeneity and diverse historical components that have helped shape one of the richest floras on the planet (Qian & Ricklefs, 2000). The species has a long history of felling for timber and industrial raw materials (Ma, 1992), leading to widespread deforestation and even serious ecological degradation in southwest China.
To alleviate this phenomenon and protect regional water resources, the Chinese government has implemented large-scale afforestation using P. armandii since the late 1950s (Ma, 1992), although success rates before 1980 were limited (Ma, 1992;Wang & Hong, 2004).
A total of area 162,573 ha was planted with P. armandii between 1999 and 2010 under the GFGP, especially in the barren mountains of southwestern China (Yao et al., 2014), using a mixture of aerial seeding, artificial seeding, and seedling transplantation. Since 1980, provenance studies have been used to improve the adaptability and survival rate of planted material in China (Ma, 1989(Ma, , 1992, but little is known about how this rapid anthropogenic increase in its numbers has affected the range-wide genetic structure of P. armandii.
This rapid mass afforestation, often placing planted trees in the vicinity of native populations, forms a large-scale experiment that provides an opportunity for studying the risk of genetic homogenization and other potential effects of planted to native gene flow within this species. Previous molecular analyses detected clear genetic divergence between northern and southern populations of P.
armandii Liu, Jin, Wei, & Wang, 2019), but beyond this, the details of spatial-temporal intraspecific differentiation remain unclear, and the genetic effects of afforestation are therefore unknown. Thus, P. armandii represents an excellent study system with which to simultaneously examine the effects of both natural historical events and recent anthropogenic mass afforestation on a major forest tree species.
In this study, we sought to tease apart the evolutionary legacy of P. armandii from that of mass planting, by contrasting phylogeographical signals from natural and planted populations. To this end, variation was examined across multiple microsatellite markers, plus chloroplast, mitochondrial, and nuclear DNA fragments, backed up by full chloroplast genome sequencing of selected individuals. We analyzed the patterns of genetic differentiation within P. armandii and employed a robust niche dynamics framework to compare climatic niche between intraspecific lineages. From this, we inferred divergence pathways, tested for environmental niche differentiation, and assessed the effects of forest plantations on the genetic composition of P. armandii natural populations. The results will serve as an important basis for future genetic monitoring in large-scale afforestation initiatives and as a case study for the effect of mass afforestation within a species' native range.

| Sampling area and plant material
Material of P. armandii was sampled from 41 natural populations from the species' full range across central and southwestern China, plus 11 planted populations from Yunnan, Shaanxi, Gansu, Hubei, and Zhejiang provinces ( Figure 1; Table S1). Because plantings before 1980 were largely unsuccessful (Wang & Hong, 2004), all planted populations we sampled were planted in the 1980s, on sites that were previously barren or farmland with no P. armandii present, according to local information and references (see Table S1). From all of these, a total of 696 mature trees were sampled that were between 25 and 90 years old, taking pine needles from each for DNA extraction and further molecular analysis ( Figure 1; Table S1), that is, DNA sequencing, microsatellite genotyping, and chloroplast genome sequencing.
Total genomic DNA was isolated using plant genomic DNA kits (Tiangen, Beijing, China). The genetic variation and structure of P.
armandii were estimated based on a large-scale population genetic dataset, comprising markers with both biparental and uniparental inheritance modes. Using twelve pairs of microsatellite primers, twelve nSSR regions were successfully amplified from all 696 sampled individuals (Table S2). Forward microsatellite primers were 5′-end fluorescently labeled using either FAM or TAMRA (Applied Biosystems). PCR fragments were separated on an ABI 3730 xl DNA Sequencer and individually assessed using GENEMAPPER v4.0 (Applied Biosystems). In addition, following the protocols of Jia et al. (2018), one chloroplast (cpDNA) fragment, ycf1, and two nuclear fragments, AGP6 and LFY, were successfully amplified and sequenced for 466, 201, and 44 individuals, respectively. Following on from STRUCTURE results that separated out three geographic lineages (Figure 2), we selected between two and five individuals from each lineage, making 12 in total, for chloroplast genome sequencing; these together covered the full geographic range of P. armandii (Table S3). Additionally, sequences from two mitochondrial DNA (mtDNA) fragments, nad5 intron 1 and nad7 intron 1, were obtained for 193 individuals, from Li et al. (2015). Therefore, we were able to examine variation in P. armandii based on markers from three different genomes and hence with different inheritance modes.
Divergence time was estimated with BEAST v1.7.5 (Drummond, Suchard, Xie, & Rambaut, 2012), using 12 published Pinus chloroplast genomes together with 12 newly sequenced individuals of P. armandii (Table S3). Due to the lack of credible fossils from P. F I G U R E 1 Geographic distribution and network of the chloroplast (cp) DNA haplotypes (H1-H6) detected in P. armandii. The purple and black circles represent plantation and wild populations, respectively armandii, we applied a crown node age of Pinus, that is, the divergence of the two subgenera Pinus and Strobus, of 85 million years. This followed Willyard, Syring, Gernandt, Liston, and Cronn (2007), who applied this date based on silicified fossil wood of subgenus Strobus from Late Cretaceous (85.8-83.5 Mya;Meijer, 2000). Convergence was checked using TRACER v1.5 (Rambaut & Drummond, 2009).

| Microsatellite data analysis
Hardy-Weinberg equilibrium and linkage disequilibrium were tested for using FSTAT v2.9.3 (Goudet, 2001 (Peakall & Smouse, 2006). Significant differences between wild and planted populations were quantified using Kruskal-Wallis tests. We also performed AMOVA with all native P. armandii samples pooled, and three specific analyses considering separately each of the three regional groups that were identified for the analysis in ARLEQUIN v3.5 (Excoffier & Lischer, 2010). Pairwise F ST was used to assess population differentiation within and among regions and populations (Excoffier, Smouse, & Quattro, 1992). F I G U R E 2 Structure analysis results and resultant map of genetic composition of each population in P. armandii. The K = 2 (a) and K = 3 (b) clusters are shown. For each K value, results of the run with the highest value of LnPD were used. The purple and black circles represent plantation and wild populations, respectively The Bayesian model-based clustering software STRUCTURE v2.2.3 (Pritchard, Stephens, & Donnelly, 2000) was used to infer distinct gene pools using combined microsatellite, AGP6 and LFY data, for wild populations alone, planted material alone, and the full dataset. A rarefaction microsatellite dataset which include same number of individuals as nuclear gene dataset was also analyzed in STRUCTURE. The analyses were run using the admixture model with correlated allele frequencies. The optimal number of clusters was determined by calculating DeltaK (ΔK) (Evanno, Regnaut, & Goudet, 2005) using STRUCTURE HARVESTER (Earl & VonHoldt, 2012).

| Genetic migration analyses
In order to examine historical genetic migration between regions, based on the microsatellite datasets, we used the coalescent-based program MIGRATE-N v3.6 (Beerli, 2006) to generate pairwise estimates of migration rates (Nm) between the three identified regional groups. To assess patterns of recent migration, we also estimated interpopulation migration rates (within 2-3 generations) using a Bayesian approach in BAYESASS v3.0 (Wilson & Rannala, 2003).

| Lineage divergence and demographic history
To estimate plausible scenarios of divergence and population dynamics within P. armandii, approximate Bayesian computation (ABC) was performed in DIYABC v2.04 (Cornuet et al., 2014). This analysis treated as separate three regional subgroups or lineages (East Himalaya: EH, South Hengduan Mountains: SH, and Qinling-Daba Mountains: QD) that were clearly identified based on STRUCTURE and phylogenetic results (Figures 2, 4). Four historical population divergence scenarios for these lineages were compared by DIYABC analysis: scenarios 1, 2, and 3 differed in that the first diverging lineage was EH, SH, or QD, respectively, whereas under scenario 4, the three lineages diverged simultaneously. We assumed uniform priors on all parameters and used a goodness-of-fit test to check the priors of all parameters before implementing the simulations ( Figure 3; Table S4). To estimate the divergence times among the three lineages, the average generation time of P. armandii was assumed to be 25 years, following Ma, Szmidt, and Wang (2006).
In addition, DIYABC was used to simulate and examine population demographic changes in the recent past. We separately tested the following four scenarios of demographic changes for each of the three lineages: continuous shrinkage, continuous expansion, shrinkage-expansion, and expansion-shrinkage. DIYABC allows selection F I G U R E 3 (a) The four scenarios for the population history of the three lineages (EH, SH, and QD) in DIYABC. (b) Schematic representation of four demographic models of changes in population size tested within the three lineages (EH, SH, and QD) in P. armandii of the demographic scenario that best fits the data and parameters of interest (Cornuet et al., 2014).

| Ecological niche modeling
Only wild occurrences have been taken into account for this and all subsequent analyses. Ecological niche modeling (ENM) analyses were performed with MAXENT v3.3.3k (Phillips, Anderson, & Schapire, 2006) to assess the ecological niche of each lineage and to predict their potential range based on their georeferenced localities and environmental variables thereof. The occurrence data of P.
Nineteen bioclimatic variables were acquired at 2.5 arc-minute resolutions from WorldClim (www.world clim.org) (Hijmans, Cameron, Parra, Jones, & Jarvis, 2005) for three periods: the present, the last glacial maximum (LGM, 18-21 ka), and the last interglacial period (LIG, 120-140 ka). To avoid model overfitting linked to correlated climatic parameters, only those seven variables that had low correlation coefficients with one another (r < 0.8) were retained for subsequent analysis (Table S5).

| Niche comparison analyses on G-spaces
To capture ecological differences in the niche occupied by each genetic lineage, likely reflecting local adaptation, niches of different lineages of P. armandii were compared in both geographic (G) and the environmental (E) spaces, because these two types of niche space have been shown to complement each other in niche comparison studies . For each lineage, historical niche shifts in geographic distribution between the LIG, LGM, and present day were inferred, based on ENMs in geographic (G) space and using MAXENT with default settings. We limited our model extent to the distributional range of each regional lineage of P. armandii with a 200 km buffered zone, to eliminate the impact of background geographic area of the models on modeling results (Merow, Smith, & Silander, 2013).
Furthermore, to measure niche differences between lineages, we used ENMTOOLS v1.3 (Warren, Glor, & Turelli, 2008) to calculate the niche overlap statistic Schoener's D (Schoener, 1968) and standardized Hellinger distance (calculated as I; Warren et al., 2008), where a value of 0 denotes no overlap and 1 indicates complete overlap. To test the null hypothesis that two lineages have identical ENMs, we used the niche equivalency test initially proposed by Warren et al. (2008) in ENMTOOLS. This test compares the observed scores of niche overlap statistics D and I with their null distribution generated with 100 pseudoreplicates (see Warren et al. (2008) for details).

| Niche comparison analyses on E-spaces
To assess the degree of niche overlap, we first assessed ENMs in spatial environmental (E) space using R packages (ECOSPAT, Di Cola et al., 2017). Following the approach initially proposed by Broennimann et al. (2012), principal components analysis (PCA) was used to translate occurrence and climate data into environmental axes (PCA-env).
Densities of points in multidimensional E space were then used to quantify ENM overlap, using the D and I statistics. The niche equivalency test was employed to test whether the environmental niche space of two lineages is identical using 100 pseudoreplicates.
Niche overlap between lineages can be characterized by niche unfilling, niche stability, and niche expansion Guisan, Petitpierre, Broennimann, Daehler, & Kueffer, 2014;Petitpierre et al., 2012). Lineage A occupies a range termed A, and the assumption is made that lineage B has diverged from it and now occupies a range B. The term "unfilling" describes conditions within range A that do not overlap range B, whereas the term "niche stability" covers any areas of shared range between A and B, and "range expansion" describes those parts of range B that do not overlap with A. This classification provides additional information about the drivers of the niche dynamic between ranges (Di Cola et al., 2017;Petitpierre et al., 2012).

| CpDNA variation
The cpDNA ycf1 fragment was successfully sequenced for all 466 sampled individuals of P. armandii, and a total of six chlorotypes were identified. Of these, three were common, with H1, H2, and H4 dominating populations from East Himalaya (EH), South Hengduan Mountains (SH), and Qinling-Daba Mountains (QD), respectively; hence, these three regions were clearly defined as distinct by chlorotype data (Figure 1). H1 and H2 also occasionally occurred outside their dominant regions, whereas H4 was unique to QD ( Figure 1).
All subsequent statements in this section refer to wild (not planted) material only, unless stated otherwise.

| MtDNA variation
The concatenated mt DNA sequences of nad5 intron 1 and nad7 intron 1 comprised 1412 bp, and from this, four mitotypes were distinguished. The geographic distribution of these was highly structured, with M1, M2, M3, and M4 unique to QD, EH, east SH, and west SH, respectively ( Figure S1).

| Population genetic differentiation and structure
The AMOVAs revealed that 88.90% of overall cpDNA variation was between regions (QD, EH, and SH; Table 1). The coefficient TA B L E 1 Hierarchical analyses of molecular variance (AMOVA) of P. armandii wild populations based on nuclear microsatellite, cpDNA, nDNA, and mtDNA genetic data For the Bayesian analysis of population structure for wild material (AGP6 and LFY genes), ΔK indicated that the optimal value for K was 2 ( Figure S2). At K = 2, one cluster comprised all EH and SH material plus some from QD, while the other comprised only material from QD ( Figure S2). However, hierarchical analyses within these clusters detected no further genetic subdivision ( Figure S3).
Meanwhile, comparing with the full microsatellite dataset, a similar genetic pattern was obtained with rarefaction microsatellite dataset ( Figure S4).

TA B L E 1 (Continued)
(there are no planted populations in EH) ( Figure S6a, d). All analyses revealed considerable rates of genetic admixture within each region, including between wild and planted populations, but not between regions ( Figure 2 and Figures S3, S5, S6).

| Genetic diversity within P. armandii
Overall cpDNA diversity of the wild P. armandii populations was high (H T = 0.614), whereas the same measure within each region was lower (Hs = 0.037, 0.133, and 0.050 for EH, SH, and QD, respectively) (Table S6). For planted populations, the H T and Hs values were 0.610 and 0.119, respectively (Table S6). Meanwhile, the total haplotype diversity (H d ) and nucleotide diversity (π) values for AGP6 and LFY among wild populations were higher than those for planted populations (Table S7).
All microsatellite markers were highly polymorphic within P.
armandii (Table S1) (Table S8), but based on all evidence available, overall variation was probably higher in the wild populations. On average, QD was more polymorphic than the other two regions.

| Genetic migration among groups
The BAYESASS analysis detected recent genetic migration from EH to SH (m = 0.024) and from SH to QD (m = 0.021) (Table S9).
Additionally, Migrate-n identified historical asymmetric gene flow

| Lineage divergence time based on chloroplast genome
A molecular phylogenetic tree was constructed using completed chloroplast genome sequences from the three lineages within P. armandii and outgroup species (Figure 4). The divergence time of P.

| Ecological niche modeling
The predicted model for the present potential range of P. armandii generated with MAXENT was fairly congruent with the current distribution of the species ( Figure S7), but this was more accurately predicted when considering the three lineages separately, with AUC (area under the curve) values ≥0.9 ( Figure S8). Comparing with its current range, the areas of suitable habitat in the EH region, QD, and adjacent areas were much wider during the LGM.
Conversely, the species' range was more restricted during the LIG than at present ( Figure S7). Similar patterns were obtained when each regional lineage was examined separately, with each having the largest range during the LGM and the smallest during the LIG ( Figure S8).
Niche overlap statistics demonstrated that each lineage occupied a distinct niche (Figure 5a) based on G space. ENMTOOLS showed that empirically observed values for I and D were significantly lower than those expected from pseudoreplicated datasets in all paired analyses (EH vs. SH, EH vs. QD, and SH vs. QD, p < .01) ( Figure 5a). However, no such regional difference was significant (p > .05, Figure S9a Concerning climatic niche partitioning across the three comparisons between regional lineages' climatic niches, "unfilling" niches account for 44.0%-80.8%, whereas "stable" niches (shared ranges between lineages) account for 24.2-88.6%, and "range expansion" niches account for 11.4-75.8% (Figure 5c1-c3; Table S11). The proportion of F I G U R E 4 Phylogenetic relationships and divergence times of P. armandii based on BEAST analysis. Blue bars and the numbers below the bars indicate 95% highest posterior densities of divergence times (Ma). Posterior probabilities are labeled on each node. Red, yellow, and blue branches represent EH, SH, and QD lineages, respectively "unfilling" niches was much higher for between EH and SH (80.8%), and between SH and QD (70.7%), than between EH and QD (40%) (Table S11).

| Discussion
In this study, we employed an integrative approach to address the evolutionary legacy of widespread afforestation of Pinus armandii within its native range, in China, using large-scale population genetics and phylogenomic analysis. We identified three intraspecific lineages, each occupying distinct ecological niches ( Figure 5). These diverged around the late Miocene (Figure 4) during a period of massive uplifts of the Hengduan Mountains and intensification of Asian Summer Monsoon (Clift & Webb, 2019;Wang et al., 2012). The oldest successful plantings of P. armandii date from the 1980s, and all plantations examined were probably sourced from the same region, as had been indicated for those planted populations previously examined (Ma, 1989). Therefore, planted material presents little or no probability of allowing between-region gene flow.

| Lineages within P. armandii and their biogeographic history
Despite widespread human influences, native forests might retain genetic signals from their past distribution, due to natural regeneration of local stock in combination with the long life of individual trees (Petit & Hampe, 2006), thereby allowing evolutionary backgrounds of natural ranges to be detected. Our data separated three geographic lineages within P. armandii (Figures 1, 2, 4; Figure S7).
Some admixture, and hence limited gene flow between regions EH and SH, was indicated by the presence in populations p6 (EH), p13 and p14 (SH) of the common haplotype from the other region. At the boundary between these regions, ecological conditions might form a gradient facilitating gene flow, whereas anthropogenic disturbances like deforestation might have promoted it, for example, F I G U R E 5 (a1-a3) Niche overlaps of P. armandii based on pairwise comparisons among the three lineages across climatic space. For each analysis, the lineages in red and green are lineages A and B in the analysis, respectively, with overlapping densities between ranges shown in violet. The solid and dashed contour lines delimit the 100th and 75th quantiles, respectively, of the density at the available climate. (b1-b3) Densities of available climates and P. armandii occurrences based on pairwise comparisons; the horizontal bars show the components of niche dynamics present along the x-axis: unfilling (green), stability (violet), and expansion (red). The solid arrows represent the shift direction of the niche centroid between the designated lineages A and B, and the dashed arrows represent the shift direction of the average available environmental conditions between ranges. (c1-c3) Niche equivalency test for each comparison based on Schoener's D statistic (Schoener, 1968), Warren's I statistic (Warren et al., 2008), and Maxent predictions. Bars indicate the null distributions of D and I where regeneration occurs but some distance away from surviving native populations. Furthermore, population p30, which is a southwestern geographic outlier within QD, contains mainly haplotype H1 from EH. There were also minor discrepancies between groupings based on microsatellite loci and those indicated by STRUCTURE analysis of nuclear genes (AGP6 and LFY) (Figure 2; Figure S2), which may be caused by the two types of genetic markers having different demographic histories and mutation rates (Petit, Duminil, Fineschi, Hampe, & Vendramin, 2005).
These three ecogeographic lineages had been suggested by previous molecular data , but the greatly increased sampling and genomic coverage of the current study clearly resolved their full ranges for the first time, and also their relationships.
Where mountain building and local climate change happen in concert, this will accelerate niche divergence (Antonelli et al., 2018). Niche divergence was proposed as the causes for divergence within Taxus wallichiana  and Roscoea (Zhao, Gugger, Xia, & Li, 2016)  Alpine conifers, that is, Picea likiangensis  and Taxus wallichiana . Meanwhile, the demographic history of P. armandii is complex according to ABC simulations, involving population expansions followed by strong bottlenecks and expansions from the late Miocene to Pleistocene. In particular, the expansion of lineages EH, SH, and QD began at 1.63 Ma,1.98 Ma and 2.42 Ma,respectively,all in the early Pleistocene and well before the LIG (0.14-0.12 Ma). The fact that a moderately cold climate has prevailed on the QTP before the LIG will have provided opportunities for each of the three lineages to have continued its range expansion throughout the LIG, as previously proved in Picea likiangensis .

| Origin of planted material
Although previous studies have carried out provenance trials in some plantations of P. armandii (Ma, 1989(Ma, , 1992, our study is the first to examine the full native and planted range of the species. Throughout its range, we found that at least nine of the 11 planted populations matched nearby wild populations in terms of both the dominating haplotype and genetic similarity for other markers (Figures 2, 3) and hence were sourced within the same region, as tended to be found for individual plantations by previous work (Ma, 1989(Ma, , 1992. Of the other two populations, p23 in QD has the chlorotype H1, which is mainly from the SH region; however, it resembles other QD populations for nuclear data, and moreover, the chlorotype H1 is also fixed in wild population p48, which is by far the closest geographically to p23 (Figures 1, 2; Table S1). Hence, much the likeliest origin for p23 is that it was sourced very locally, from the nearest wild material, implying that planted material, even within a region, does not all originate from a common pool of cultivated stock.
Planted population p8 from EH has a 1:1 ratio of haplotypes H1 and H2, which are, respectively, rare and dominant in EH, whereas H1 is far commoner in SH ( Figure 1). As with p23, however, this population resembles local wild populations for nuclear data ( Figure 2).
Hence, it was probably sourced from within EH, but not from the nearest sampled wild populations, p9 and p10, which do not contain haplotype H1. Instead, it might have been sourced from a more distant EH population such as p13 or p14 (both > 500 km away), or an unsampled or extinct population closer by. The high proportion of H1 here might be a chance outcome of a bottleneck event (see below). Crucially, while p23 indicates that some planted populations were probably sourced very locally even within a region, p8 indicates that this might not apply for all of them.
Because planted populations examined were all sourced from within the region where they occurred, these appear to have little potential for allowing gene flow between regions. This also means that, contrary to general expectations, the inclusion of planted populations did not obscure the genetic evidence of phylogeographic structure within P. armandii. The potential remains for planted material to promote within-region gene flow, especially where plantations are some distance from their source, and hence perhaps also local genetic swamping. However, for genetic swamping to occur, local adaptation of genotypes, and fitness variation between populations, must be small (Anttila et al., 2000;Hufford & Mazer, 2003). To test this possibility, within-region variation in climatic niche needs investigation alongside whether planted material is sometimes nonlocally sourced within a region.
To detect gene flow between planted and wild material, seedlings and saplings need to be sampled as there has not been time for two generations to be completed since planting (at least from plantations sampled). Moreover, markers must be developed that detect within-region geographic structure, through which introgression from elsewhere within the region, via material planted away from its source, can be detected.
Lower levels of diversity in planted than wild accessions (Tables S1, S6, S7, and S8) potentially indicate bottleneck events during or before establishment. However, planted population p31 was the only population from which four haplotypes were detected, so at least some planted populations seem to remain genetically diverse, and there may be variation in how planted seed is sourced, as well as from where.

| Anthropogenic and climatic impacts on the source of planted material
The process of introduction of tree species around the world is generally informed by niche concept and the principle of climatic niche similarity (Li, Zhang, Huang, Wen, & Du, 2018), giving rise to a sitespecies matching principle. Within P. armandii, earlier transplant experiments revealed strong local adaptation apparently driven by regional differences in mean annual temperature and extreme minimum temperature (Tang, 1995), leading to material from different regions differing significantly in cold resistance (Ma, 1989). while provenance studies have likely led to deliberate site-species matching in some instances (Ma, 1989(Ma, , 1992, local planting in some places might have come about through trial and error, or because planted seed was sourced from nearby trees merely for convenience. Attempts at afforestation using P. armandii in central China during the 1960s and 1970s were unsuccessful, due to material of one lineage (SH) being planted within the range of another (QB), leading to the dieback of trees in winter frost (Ma, 1992;Wang & Hong, 2004

| Implications for afforestation policy
To date, there has been little emphasis on assessing the genetic legacy of a widespread afforestation program, despite growing investment in large global afforestation projects (Payn et al., 2015). Our results build upon existing phylogeographic and historical evidence of coniferous tree species, to substantially advance our understanding of the contemporary genetic impact of past widespread afforestation. We also show that ecological niche models can be used to predict areas with suitable habitat and climate for introduction and plantation around the world. Despite concern that translocation of individuals between regions with different lineages might facilitate within-species introgression, we found that instead the consequence of this, at least in P. armandii, is more likely to be failure of plantations due to maladaptation. Earlier work provides examples of this: Southern material could barely survive northern winters, whereas northern material grew very slowly in the south (Ma, 1989(Ma, , 1992. Although the site-species matching principle has clearly worked for P. armandii so far, shifts in potential distribution ranges due to future climate change in P. armandii (Zheng, Gao, & Zhang, 2017) might alter this in the future, and material originally from warmer or wetter regions, such as the SH lineage, might become more suitable for future afforestation programs in the QD and EH lineages range.
Intraspecific genetic variation is essential for species adaptation and survival and can have profound effects on ecological processes, across communities and even ecosystems (Hughes, Inouye, Johnson, Underwood, & Vellend, 2008;Jordan, Breed, Prober, Miller, & Hoffmann, 2019). Neglecting this factor while implementing afforestation policy may create big risks for forest health in the future.
Low plantation diversity can result from bottlenecks, or from planted seed originating from small isolated remnant populations that lack adequate genetic diversity (Durka et al., 2017). Most planted forests are either monocultures or compositionally simple mixed forests, and few comprise much intraspecific genetic diversity, making them susceptible to abiotic and biotic threats exacerbated by global change (Verheyen et al., 2016). Overall, to create forests that form sustainable ecosystems, it is important to know the origin of planted material, and there is an urgent need to comprehensively evaluate the genetic effect of afforestation by other dominant afforestation tree species, based on both genetic and ecological perspectives.

ACK N OWLED G EM ENTS
This work was financially supported by the National Natural

D N A S EQU EN CE A N A LYS I S
Chloroplast and mitochondrial haplotypes were identified using DNASP v5.00.01 (Librado & Rozas, 2009), and a haplotype network