Population genetic structure and adaptive differentiation of iron walnut Juglans regia subsp. sigillata in southwestern China

Abstract Southwestern (SW) China is an area of active tectonism and erosion, yielding a dynamic, deeply eroded landscape that influences the genetic structure of the resident populations of plants and animals. Iron walnut (Juglans regia subsp. sigillata) is a deciduous tree species endemic to this region of China and cultivated there for its edible nuts. We sampled 36 iron walnut populations from locations throughout the species' range in SW China and genotyped a total of 765 individuals at five chloroplast DNA regions and 22 nuclear microsatellite loci. Species distribution models were produced to predict the evolution and historical biogeography of iron walnut and to estimate the impacts of climate oscillations and orographic environments on the species' demography. Our results indicated that J. regia subsp. sigillata had relatively low genetic diversity, high interpopulation genetic differentiation, and asymmetric interpopulation gene flow. Based on DIYABC analysis, we identified two lineages of J. sigillata in southwestern China. The lineages (subpopulations) diverge during the last glacial period (~1.34 Ma). Southwestern China was a glacial refuge during the last glacial period, but increasingly colder and arid climates might have fostered the fragmentation of J. regia subsp. sigillata within this refugium. Finally, we found that recent habitat fragmentation has led to a reduction in population connectivity and increased genetic differentiation by genetic drift in isolated populations. Our results support a conclusion that geological and climatic factors since the Miocene triggered the differentiation, evolutionary origin, and range shifts of J. sigillata in the studied region.


| INTRODUC TI ON
Southwestern (SW) China is a unique alpine area with a range of climates, complex topography, and a high proportion of endemic and relict flora (about 29% of species). It is considered as a hot spot for vascular plant biodiversity (López-Pujol, Zhang, Sun, Ying, & Ge, 2011;Myers, Mittermeier, Mittermeier, Da Fonseca, & Kent, 2000) and the core region of genetic diversity for many species (Brookfield, 2008). Of particular biogeographic interest are the Hengduan Mountains, which are located in the southeastern part of the Qinghai-Tibet Plateau (Du, Hou, Wang, Mao, & Hampe, 2017). This range includes mountains in Sichuan province, Yunnan province, and eastern Tibet Autonomous Region (Zhao, Gugger, Xia, & Li, 2016). The southern extreme of the Hengduan Mountain chain includes the Wuliang Mountains, which are located in northwestern Yunnan, the Yunnan-Guizhou plateau, and the convergence point of the Central and South Asian tropical zones (Myers et al., 2000). To examine the connections among geography and climatic oscillations and the processes that drive genetic differentiation, we examined the biogeography of the perennial, woody species iron walnut (Juglans regia subsp. sigillata).
Juglans sigillata was identified as a new species by Dode, whose taxonomy was based on morphological differences J. sigillata individuals are distinguished from J. regia by their relatively large compound leaves, rugose trunks, thick endocarp, and deeply pitted nut surfaces (Dode, 1909a(Dode, , 1909b. Yang and Xi (1989) also suggested that J. sigillata and J. regia are ecological types of one species based on the isoenzyme peroxidase (Yang & Xi, 1989). On the basis of RAPD markers, Wu et al. (2000) determined J. sigillata to be an independent species, and an analysis of ITS, RFLP, and cpDNA affirmed that J. regia and J. sigillata are distinct (Aradhya et al., 2007). In general, the most recent studies of J. regia and J. sigillata conclude they are a single species (Wang, Pan, Ma, Zhang, & Pei, 2015;Wang et al., 2008).
Transcriptome data generated by high-throughput sequencing have been an excellent resource for SSR marker development (Dang et al., 2016;Hu et al., 2016). EST-SSRs have a variety of uses; some sequences contain polymorphisms that are transferable to related taxa . Analysis of EST-SSRs in populations where J. regia and J. sigillata are sympatric in SW China revealed that the J. sigillata and J. regia populations were divided into two genetic clusters with frequent gene introgression (Yuan, Zhang, Peng, & Ge, 2008), but we found that iron walnut should be considered a subspecies or landrace of J. regia based on genotype by sequencing and EST-SSRs data . Whatever the status of J. sigillata, the evolutionary and ecological changes that led to the current genetic patterns remain understudied.
Biogeographical studies of angiosperms frequently combine nuclear markers, such as SSRs, which are inherited biparentally, with chloroplast markers that are inherited matrilineally. By comparing results from both types of markers across the same sampled populations, complementary views of a species' evolution, genetic structure, differentiation, and gene flow can be obtained (McCauley, 1995;Mohammad-Panah, Shabanian, Khadivi, Rahmani, & Emami, 2017).
Here, we use a multidisciplinary approach including molecular phylogeographic, ecological niche modeling, and phylogenetic approaches to investigate the biogeography, genetic structure, and demographic history of iron walnut in China. Specifically, we used 22 simple sequence repeat (EST-SSR) markers and sequence variation at five chloroplast fragments (cpDNAs) to (a) determine the genetic diversity and population structure of J. sigillata at chloroplast DNA sequences and  Genotypes of all DNA samples were detected using 22 pairs of microsatellite primers developed for Juglans (Dang et al., 2015(Dang et al., , 2016Hu et al., 2015Hu et al., , 2016. Each forward primer was marked with fluorescent dye (FAM, TAMRM, HEX, and ROX). The PCR products were detected using software GeneMarker (Holland & Parson, 2011). A subset of 186 individuals from 31 J. sigillata populations and 12 individuals from two J. regia populations, and three black walnut (J. nigra) individuals (used as an out-group) were sequenced at five chloroplast DNA regions (Table S1). All chloroplast sequences were deposited in GenBank under accession numbers MH606007-MH606019.

| Genetic diversity statistical analyses based on EST-SSR data
A set of 765 individuals from 36 J. sigillata populations plus 12 J. regia populations (269 individuals) were evaluated at 22 EST-SSRs. We calculated F ST to identify outlier EST-SSR loci as potentially under selection (Tsuda, Nakao, Ide, & Tsumura, 2015). Non-neutrality of loci was detected using Arlequin version 3.5 (Excoffier & Lischer, 2010) ( Figure S1). The genetic diversity parameters were computed by the GenALEx 6.5 (Peakall & Smouse, 2012). The Inverse Distance Weighted (IDW) of spatial interpolation analysis in the Geographic Information System (GIS) software ArcGIS 10.0 was implemented to display the geographic patterns of the number of alleles (N A ), expected heterozygosity (H E ), allelic richness (R S ), genetic differentiation (F ST ), and private allele richness (P AR ) for 36 populations; then, we displayed the geographic patterns of the haplotypes based on cpSSR markers .

| Population structure analysis and population differentiation statistics
Genetic structure of 36 iron walnut populations was predicted using the software STRUCTURE version 2.3.4 (Evanno, Regnaut, & Goudet, 2005) based on 22 SSRs (12 neutral and 10 non-neutral) in 36 iron walnut populations sampled across their native range.

| Estimation of divergence time
BEAST v 1.8.0 was used to estimate phylogenetic relationships and divergence times between lineages (Drummond, Suchard, Xie, & Rambaut, 2012). We choose J. nigra as the out-group. Calibration of the tree was based on fossil evidence, which indicates the time of divergence between Juglans sect. Rhysocaryon and sect. Dioscsryon was 45 Ma (Aradhya et al., 2007;Bai, Wang, & Zhang, 2016). We used the GTR + I+G nucleotide substitution model, an uncorrelated log-normal clock, and a Yule process tree prior to estimate the divergence times of the main clades. Effective sample size (ESS) value was > 200. Bayesian skyline plots (BSP) were used to conclude variation in effective population size (Ne) by BEAST v1.8.3 ).

| Inferring demographic history of J. sigillata using approximate Bayesian computations (ABC) analysis
We used DIYABC v. 2.0 software to infer recent colonization history using an approximate Bayesian computation algorithm (ABC) (Cornuet et al., 2014). Based upon previous studies (Tsuda et al., 2015), the generation time of walnuts was assumed to be 50 years, and we pooled subsets of J. sigillata samples into two subpopulations as inferred by STRUCTURE ( Figure S2). All haplotypes of J. sigillata clustered into two clades (clade A and clade B). Gene pool I consisted of 377 individuals from 22 populations, and gene pool II consisted of 388 individuals from 14 populations. The J. regia population was considered gene pool III ( Figure S3). We tested 13 competing broadscale scenarios, and total 13 million simulations were run for all scenarios (13 scenarios; Figure S4).

| Historical gene flow, contemporary gene flow, and genetic barriers
To explore historical gene flow within J. sigillata populations, we employed MIGRATE v 3.6.1.1 (Beerli, 2006). All samples of J. sigillata were divided into two subpopulations and one of J. regia based on STRUCTURE analysis using SSR data. In order to ensure the veracity and consistency of our results, we ran Migrate five independent times. Genetic barriers were investigated using Monmonier's maximum difference algorithm as implemented in the software Barrier v.

| Species distribution modeling
Species distribution modeling for J. sigillata was carried out in MAXENT v. 3.3.3 Phillips & Dudík, 2008). A total of 274 distribution records were retrieved from the Chinese Virtual Herbarium and National Specimen Information Infrastructure. Eight variables were discarded from models because they were highly correlated environmental variables (Pearson's correlation coefficient >0.85), leaving 11 bioclimatic variables. To determine whether the calculated niche similarity metrics were significant, we performed identity tests in ENMTOOLS (Warren, Glor, & Turelli, 2010).

| Distribution of cpDNA haplotypes
We observed 11 chloroplast haplotypes over a combined 1,597 bp based on five cpDNA fragments (Table S1) Table S1). Based on comparison with the out-group J. nigra (H11 and H10), which differed by 236 mutation steps, haplotypes H1, H7, H8, and H9 were more ancient than H3, although H3 was the most widespread and most common. All individuals of J. regia showed haplotype H3 or H9 (Figure 1b).  (Table S2). The gene diversity within populations (Hs) ranged from 0.018 to 0.596 (Table S2) (Figure 3). However, the iron walnut samples separated into two subpopulations based on all EST-SSR markers (cluster A and cluster B in Figure S2). The second highest peak for ΔK was 3.

| Population structure based on EST-SSR data
Applying Bayesian analysis of genetic structure, all 36 sampled J. sigillata locations plus 12 J. regia sites using only the 12 neutral loci, and the most likely number of populations was K = 3 ( Figure S3). The AMOVA on cpDNA and EST-SSR data provided evidence for a relatively high level of genetic differentiation among sampled sites; a considerable percentage (24.72% and 17.42%, respectively) of the genetic variation was attributed to differences among sampled location (Table 1) Table S3), whereas in the opposite direction, it was predicted to be 2.34 (95% CI: 0.025-0.995; Table S3). We located three statistically significant barriers to gene flow when all iron walnut populations were included using Monmonier's maximum difference algorithm ( Figure S5).

| Iron walnut distribution during the Late Quaternary
The Maxent model of J. sigillata was supported by a high predictive  (Qiu, Fu, & Comes, 2011). Juglans sigillata is indigenous to China, distributed mainly in SW China sympatric with J. regia (Wang et al., 2014). Juglans regia and J. sigillata were designated as the sole members of section Juglans (Gunn et al., 2010).

Studies of gene flow and introgression have concluded J. regia
and J. sigillata are particularly closely related, and some have questioned whether they are distinct (Wang et al., , 2008. Aradhya et al. (2007) considered J. sigillata distinct based upon ITS, RFLP, and cpDNA sequence data and morphology. We found that the J. sigillata samples appeared completely embedded within J. regia based on a phylogeny analysis using whole genome data . Our data provide the evidence that J. sigillata is a subspecies or, perhaps, a landrace of J. regia (Figure 4; Figure S3).

Shared chloroplast haplotypes among closely related populations
can be explained by incomplete lineage sorting, intraspecific gene exchanges, or a shared most recent common ancestor (Petit, Bodénès, Ducousso, Roussel, & Kremer, 2003;Yang et al., 2016). In this study, we used nuclear microsatellite and cpDNA markers and ecological niche modeling to reconstruct the phytogeographic history of J. sigillata and identify forces that most influenced the species' genetic structure.

| Patterns of genetic diversity and genetic differentiation
We found the centers of genetic diversity of J. sigillata were located at Hengduan Mountains, Dalou Mountains, and small region of Yunnan province based on IDW analysis (Figure 2). The chloroplast haplotype H3 (which is also found in J. regia) was found in trees across the entire range of J. sigillata [with high SSR genetic diversity that could be gene flow from J. regia (Figure 1). The diversity of chloroplast haplotypes was also high in populations from Hengduan Mountains, Dalou Mountains, Wuling Mountains, and a small region of Yunnan province (Figure 2). These results indicated that hybridization and gene introgression from J. regia may affect the genetic diversity of J. sigillata in these areas, resulting in populations with high diversity (Figures 1 and 2). Each color corresponds to a suggested cluster, and a vertical bar represents a single individual. Population codes are indicated below. (c) Distribution of delta K for K = 3 to 10 to determine the true number of populations (K) as described in Evanno et al. (2005). Mean log likelihood of the data at varying estimates of K

TA B L E 1 Analysis of molecular variance (AMOVA)
northeastern Yunnan-Guizhou plateau, near Tongren. These may be regarded as centers of genetic diversity (Figure 2), and possibly refugia during LGM.
The most important forces affecting the spatial genetic structure of J. sigillata are those related to dispersal and interactions with humans. In natural walnut stands, pollen-mediated gene flow probably rarely disperses alleles across distances >300 m (Pollegioni et al., 2015). Walnut seeds are large and heavy, so their dispersal by granivores (mostly squirrels) likely takes place over even more limited distances. As an economically important tree species, walnut plants were inevitably disturbed by human activities such as deforestation and selection (Lee & Lee, 1997), particularly over the past 100 years. Human movement of germplasm (seeds as well as clones) can mix haplotypes much faster than chloroplast capture in natural populations. Human exploitation of a wild resource often results in a decline in effective population size (Ne), loss of genetic diversity, population fragmentation, and local extirpation (Lefèvre, 2004;Wang et al., 2008). Tree species that grow in close association with humans are subject to unique evolutionary and ecological processes. For instance, artificial selection pressures lead to morphological changes in cultivated populations, dispersal by humans expands the natural range of species, and range expansion can lead to sympatry and hybridization with otherwise allopatric congeneric species like J. cathayensis. Humans likely contributed to both local and long-range dispersal of J. sigillata seeds through trade (Gunn et al., 2010). Aside from human dispersal, differences in the frequency of self-pollination, geographic isolation, local adaptation, phenological differences, and stochastic events including disease or pest outbreaks are all likely to have affected the spatial genetic structure of iron walnut.
Generally, F ST estimates for cytoplasmic markers are higher than for nuclear markers (Chen, Lu, Zhu, Tamaki, & Qiu, 2017;Petit et al., 2005). This was not true, however, for J. sigillata. Iron walnut haplotypes (Printzen, Ekman, & Tønsberg, 2003). In support of the second hypothesis, except for two haplotypes (H2 and H5) specific to population QZ and LM, the remaining haplotypes were widely shared among populations, especially H3, which was found in all populations ( Figure 1a). Therefore, the hypothesis of recent fragmentation of a large population with recurrent gene flow happened historically is more probable. This hypothesis is corroborated by ABC and BEAST analysis (Figure 4). What's more, the overall F ST value F I G U R E 4 (a) BEAST-derived chronograms of 11 haplotypes of Juglans sigillata, Juglans regia, and Juglans nigra based on five chloroplast DNA (cpDNA) fragments. (b) Bayesian skyline plot derived from 186 sequences of Juglans sigillata. The thick solid line represents the median estimate, and the light blue area corresponds to the 95% highest posterior density (HPD). The divergence times estimated using a relaxed molecular clock model with fossil data (red dots). (c) The three scenarios tested in DIYABC. In these scenarios, t# represents timescale in terms of the number of generations and N# represents the effective population size during the time period (e.g., 0-t1, t1-t2). The summary shows scenario 4 is the most likely scenario in DIYABC of 0.266 from nuclear microsatellite data is relatively high as compared with other wind-pollinated temperate trees (Lind & Gailing, 2013;Rusanen, Vakkari, & Blom, 2003 (Printzen et al., 2003).  Table S3). According to the previous studies, natural gene flow across large distances in SW China was unlikely, perhaps indicating the importance of human movement of J. sigillata seeds . An important result from our investigation is that gene flow rate between pairs of populations does not correlate with their geographical distances (r = .011, p = .400), providing a potential explanation for why a Mantel Test comparing the matrix of genetic and geographic distance was not significant (Bai et al., 2016).

| Population structure, barrier, and gene flow
Iron walnut is a wind-pollinated tree, the seeds or pollen dispersed by winds, born at 1,300-3,300 m above sea level on hillsides or valleys. In Yunnan-Guizhou plateau region, especially Yunnan province, most of the areas are characterized with highly dense mountain streams and lush vegetation. It maybe has served as effective barriers to seed and pollen dispersal and reduced the potential for long-distance seed dispersal. The genetic barrier analysis revealed three mains statistically barrier exist among populations, including the barrier separated population DL from four adjacent populations (SG, YB, YP, and ST). The other two barriers also effect gene flow between populations of J. sigillata ( Figure S5). This may indicate that local populations suffer from a deficit in outcross pollen, and inbreeding depression due to selfing or mating between close relatives (Qiu, Luo, Comes, Ouyang, & Fu, 2007). Values of F IS per population ranged from −0.29 to 0.28, with an average of 0.05, indicating an overall slightly excess of homozygotes. In other words, a selfing breeding system might exist in J. sigillata that is known to be self-compatible. IBD was significant for J. sigillata in SW China ( Figure S5), so geographical distance likely restricted gene flow and increased genetic differentiation (reflected in high F ST values).

| Population history and range dynamics
Phylogenetic analyses using both Bayesian and parsimony methods partitioned our samples into two distinct cpDNA clades (Figure 4a).
Differentiation time of J. regia and J. sigillata was 3.13 Ma, and it showed two species separated in the Pliocene-Pleistocene. Using a Bayesian dating method, we estimate the divergence times between clade A and clade B to be 2.65 Ma (Figure 4a). Our molecular dating by cpDNA and SSR data both fell into the early Pliocene-Pleistocene.
The uplift of the Yunnan-Guizhou plateau in SW China likely occurred in the late Miocene-Pliocene (Clark et al., 2004;Favre et al., 2015), a period of global cooling and intensification of Asian monsoons. Habitat fragmentation caused by the Yunnan-Guizhou plateau uplift during Miocene-Pliocene may have fostered intraspecfic divergence in this region He & Chen, 2006).
The current geographical distribution of J. sigillata appears to reflect three scattered refugia in the Yunnan-Guizhou plateau, Hengduan mountain, areas near the Wuliang mountains, and a refugium somewhere in the northeastern portion of the species' distribution (near Dalou mountain and Wuling mountain). These putative refugia have been recognized as centers of plant diversity. However, the refuge of the northeastern species distribution has not appeared in the SDM model, and this might be because populations along this potential route may have gone extinct, but it will occur again in the simulation of future (Figure 5e). Our SDM analyses also suggested that J. sigillata had experienced ranges shrinkage, retreated to the southern region about 25°N from LIG to LGM within refugia, which was similar to Juglans regia . Then, the species underwent a slight northeastward expansion, but the effective population size did not exceed LGM. This phenomenon is the same as the result of the BSP ( Figure 4c); overall, the effective population experienced a reduction at 0.25 Ma in the process. In general, the most reason is that increasingly colder and arid environment has occurred in LGM; afterward, climate is getting warmer.

| CON CLUS IONS
In short, we integrated molecular phylogeography, SDMs, and phylogenetic approaches to understand intraspecific divergence, evolutionary history, and range dynamics of J. sigillata. Using coalescent-based methods and ABC analysis, we determined the diver-