Genetic structuring in a Neotropical palm analyzed through an Andean orogenesis‐scenario

Abstract Andean orogenesis has driven the development of very high plant diversity in the Neotropics through its impact on landscape evolution and climate. The analysis of the intraspecific patterns of genetic structure in plants would permit inferring the effects of Andean uplift on the evolution and diversification of Neotropical flora. In this study, using microsatellite markers and Bayesian clustering analyses, we report the presence of four genetic clusters for the palm Oenocarpus bataua var. bataua which are located within four biogeographic regions in northwestern South America: (a) Chocó rain forest, (b) Amotape‐Huancabamba Zone, (c) northwestern Amazonian rain forest, and (d) southwestern Amazonian rain forest. We hypothesize that these clusters developed following three genetic diversification events mainly promoted by Andean orogenic events. Additionally, the distinct current climate dynamics between northwestern and southwestern Amazonia may maintain the genetic diversification detected in the western Amazon basin. Genetic exchange was identified between the clusters, including across the Andes region, which rules out the possibility that any cluster has diversified into a distinct intraspecific variety. We identified a hot spot of genetic diversity in the northern Peruvian Amazon around the locality of Iquitos. We also detected a decrease in diversity with distance from this area in westward and southward directions within the Amazon basin and the eastern Andean foothills. Additionally, we confirmed the existence and divergence of O. bataua var. bataua from var. oligocarpus in northern South America, possibly expanding the distributional range of the latter variety beyond eastern Venezuela, to the central and eastern Andean cordilleras of Colombia. Based on our results, we suggest that Andean orogenesis is the main driver of genetic structuring and diversification in O. bataua within northwestern South America.

Intense mountain building occurred during the middle Miocene (12 Ma) allowing the parallel formation of a new aquatic system known as "Pebas" in the western Amazon basin (Gregory-Wodzicki, 2000; Hoorn, Wesselingh, Hovikoski, & Guerrero, 2010; Meerow et al., 2015; Mora et al., 2010). At that time, the Andes may have started restricting gene flow between eastern and western rain forests on each side of the region, promoting vicariance processes through genetic divergence (Dick, Roubik, Gruber, & Bermingham, 2004; Trénel, Hansen, Normand, & Borchsenius, 2008). After the Pebas system disappeared and was replaced by the Amazon drainage system (10 Ma) and the establishment of the Amazon River (7 Ma), the Amazon foreland basins became overfilled by the Andean influx of water and sediments over millions of years (Hoorn, Wesselingh, Ter Steege, et al., 2010). This change in the continental hydrological system along with new events of intense tectonism during the Pliocene (5 Ma) allowed for the formation of palaeoarches in the Amazonian lowlands which promoted allopatric diversification between populations located on each side of the arches (Hoorn, Wesselingh, Ter Steege, et al., 2010; Hubert et al., 2007). The basins that formed after the uplift of the arches eventually became overfilled with Andean influx, hiding the arches underground and reconnecting isolated Amazonian biota (Hubert et al., 2007).
Historically, biotic exchange has occurred between the Chocó and the Amazon through low passes in the Andes that have functioned as dispersal corridors. The Amotape-Huancabamba zone in southwestern Ecuador/northwestern Peru is a region where dispersal corridors, such as the Huancabamba depression or the Girón-Paute deflection, have facilitated historical cross-Andean dispersal in low- and mid-elevation lineages in an east-west direction and vice versa (Quintana, Pennington, Ulloa Ulloa, & Balslev, 2017; Weigend, 2002, 2004) during favorable climatic conditions (Haffer, 1967). Another identified dispersal corridor is Las Cruces mountain pass in the eastern cordillera of Colombia (Dick, Bermingham, Lemes, & Gribel, 2007). These dispersal corridors may have hindered diversification processes in the region by maintaining genetic connectivity between cross-Andean regions (Trénel et al., 2008).
Besides Andean uplift, Pleistocene climatic shifts (2.5 Ma), as explained in the theory of refugia (Haffer, 1969), were believed to be the main drivers of diversification in the region through a continuous fragmentation process of Amazonian forest. Nevertheless, their role has lately been given less emphasis, as both highland and lowland organisms had already diversified during Andean orogeny before the Pleistocene (Antonelli & Sanmartín, 2011; Hoorn, Wesselingh, Ter Steege, et al., 2010; Hughes & Eastwood, 2006). Modern rainfall and temperature patterns seem to be related to species richness at large time scales, but their effect on short time scales is often less evident (Antonelli & Sanmartín, 2011; Eiserhardt, Svenning, Kissling, & Balslev, 2011; Field et al., 2009). Nutrient availability also seems to be an abiotic factor that explains biodiversity accumulation (Antonelli & Sanmartín, 2011; Tuomisto, Zuquim, & Cárdenas, 2014).
Oenocarpus bataua is a Neotropical palm that provides a good model for exploring the structuring of genetic diversity at the regional level due to its wide distribution in northern South America (Balick, 1986), growing in several ecoregions such as the Chocó, the Amazon basin, and the Andean slopes. Despite its wide geographical distribution, its intraspecific variability has been poorly studied with only two allopatric varieties described as follows: (a) O. bataua var.
Here, we describe the intraspecific genetic structure of the were studied only in a first step of the analysis in order to explore the genetic relationship between both varieties. We also compared the levels of genetic diversity and the inbreeding coefficients within populations to provide valuable information for its conservation and management. Finally, we propose additional research in order to improve the knowledge about the diversification and genetics of Neotropical plants. To our knowledge, this is the first work exploring the influence of Andean uplift on the regional genetic structure of a wild palm using its intraspecific genetic diversity.

| Statistical analysis
For all the sampled localities and individuals, we performed a linkage disequilibrium test for each pair of loci using the software Genepop v4.2 with default parameters (Raymond & Rousset, 1995) and also estimated the presence of null alleles with the software FreeNA (Chapuis & Estoup, 2007).

| Individual-based analysis
In order to determine a major population structure in O. bataua, a Bayesian clustering method included in the software Structure v2.3 (Pritchard, Stephens, & Donnelly, 2000) was used to assign all 644 individuals to different clusters. To determine the optimal number of clusters (K), Structure was run under the default model of ancestry and population intercorrelation (population admixture and correlated allele frequencies) without prior information about the samples' geographic origin. Five independent Markov chain Monte Carlo (MCMC) runs were performed using 10^4 burn-in generations followed by 10^5 sampling generations, with K ranging from 1 to 10.
From these data, the ΔK statistic, developed by Evanno, Regnaut, and Goudet (2005), was computed to infer the optimal number of clusters (K).
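The ΔK calculation can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes ΔK is computed from the per-K mean and standard deviation of the log-likelihood ln P(X|K) across replicate runs, and the function name and the likelihood values are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: deltaK} from replicate log-likelihoods.

    lnp_by_k maps each K to a list of ln P(X|K) values, one per run.
    deltaK(K) = |L(K+1) - 2*L(K) + L(K-1)| / sd(L(K)), where L is the
    mean over runs. It is undefined for the smallest and largest K.
    """
    avg = {k: mean(values) for k, values in lnp_by_k.items()}
    delta_k = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # needs a tested K on both sides
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # undefined when all replicate runs are identical
        delta_k[k] = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1]) / sd
    return delta_k


# Hypothetical log-likelihoods (two runs per K), for illustration only.
example = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-50.0, -52.0],
    4: [-48.0, -50.0],
    5: [-47.0, -49.0],
}
dk = evanno_delta_k(example)
best_k = max(dk, key=dk.get)  # K with the largest deltaK
```

Because ΔK needs neighbouring values on both sides, it cannot be evaluated at the lowest or highest K tested, so K = 1 can never be selected by this statistic alone.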
To identify areas of genetic discontinuity within the distribution range of O. bataua, a spatial Bayesian clustering analysis was performed using Geneland software (Guillot, Mortier, & Estoup, 2005). Each individual's geo-referenced and genotypic information was used to determine its posterior probability of belonging to a certain cluster. One MCMC run was performed using the resulting K value obtained from the Structure analysis (following Evanno et al., 2005) applying the following parameters: 10^5 iterations, thinning = 1,000, allele frequencies correlated, and with uncertainty in the coordinates. Inbreeding coefficient (F_IS) and differentiation values (F_ST) among pairwise clusters were also obtained from Geneland. Based on the Bayesian clustering analyses, we obtained a hierarchical AMOVA in order to understand how the genetic variation is partitioned between varieties, between populations within varieties, and within populations, using the software Arlequin v3.5.2.2 (Excoffier, Laval, & Schneider, 2005) with 1,000 permutations.
We then repeated the Bayesian clustering analyses using only the samples previously identified as var. bataua (n = 566) in order to infer the population structure within this variety. Next, we used the mean F value computed by Structure to explore the dispersal history among genetic clusters of var. bataua. Under population admixture and correlated allele frequencies, the program returns an F value (F_ST analogue) that describes the degree of genetic differentiation of a certain cluster from a hypothetical ancestral population (Falush, Stephens, & Pritchard, 2003). The dispersal history of the clusters can be inferred as a path from low to high F values, assuming constant rates of genetic drift in all the clusters (Trénel et al., 2008). The mean F value of each cluster identified within var. bataua was obtained by averaging the F values of the five independent MCMC runs at the determined optimal number of clusters (K).
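The averaging and ordering step can be sketched as below. This is only an illustration under stated assumptions: cluster labels are taken to be already aligned across runs (Structure may permute labels between runs), and the function names, cluster labels, and F values are hypothetical rather than taken from the study.

```python
from statistics import mean


def mean_f_per_cluster(runs):
    """Average the per-cluster F values over independent runs.

    runs is a list of dicts, one per MCMC run, each mapping a cluster
    label to that run's F value. Labels are assumed to be aligned.
    """
    clusters = list(runs[0])
    return {c: mean(run[c] for run in runs) for c in clusters}


def dispersal_order(mean_f):
    """Order clusters from low to high mean F.

    Under the constant-drift assumption, low F suggests a cluster close
    to the hypothetical ancestral population, high F a more derived one.
    """
    return sorted(mean_f, key=mean_f.get)


# Hypothetical F values for three clusters over two runs.
runs = [
    {"c1": 0.02, "c2": 0.10, "c3": 0.06},
    {"c1": 0.04, "c2": 0.08, "c3": 0.06},
]
mean_f = mean_f_per_cluster(runs)
order = dispersal_order(mean_f)
```

The resulting order is a hypothesis about the direction of dispersal, not a dated sequence, and it holds only to the extent that drift rates are comparable among clusters.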
Additionally, phylogenetic relationships between clusters were depicted to determine the sequence of divergence between the clusters. A neighbor-joining (NJ) tree was constructed in MEGA7 (Kumar, Stecher, & Tamura, 2016) using a mean matrix of allele frequency divergence among clusters (net nucleotide distance) that resulted from the analysis in Structure. The robustness of the NJ branches was evaluated using PHYLIP v3.6 (Felsenstein, 2005) through 1,000 bootstrap replications.

| Population-based analysis
At this level, we worked with 18 localities (n > 15) previously identified as var. bataua by the first Bayesian clustering analyses, and each locality was treated as a population. In order to determine how genetic diversity was spatially distributed, allelic richness (A) was calculated using the rarefaction procedure implemented in FSTAT v2.9.3 (Goudet, 2001).
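Rarefaction makes allelic richness comparable among populations with different sample sizes. The sketch below uses the standard hypergeometric rarefaction formula for a single locus; that it matches the FSTAT implementation exactly is an assumption, and the function name and allele counts are hypothetical.

```python
from math import comb


def allelic_richness(allele_counts, n):
    """Expected number of distinct alleles in a subsample of n gene copies.

    allele_counts holds the number of copies of each allele observed at
    one locus in one population; n is the common subsample size (gene
    copies, i.e. twice the number of diploid individuals).
    A = sum_i [1 - C(N - N_i, n) / C(N, n)], with N the total copies.
    """
    total = sum(allele_counts)
    if n > total:
        raise ValueError("subsample size exceeds the number of gene copies")
    denominator = comb(total, n)
    # comb(total - c, n) is 0 when an allele is too common to be missed.
    return sum(1 - comb(total - c, n) / denominator for c in allele_counts)


# Two alleles with two copies each, rarefied to two gene copies.
a_small = allelic_richness([2, 2], 2)
# Rarefying to the full sample returns the observed allele count.
a_full = allelic_richness([3, 1], 4)
```

In practice n is set from the smallest population sample with complete genotypes, and the per-locus values are averaged over loci.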

| RESULTS
Seven of the 21 pairs of loci were detected in linkage disequilibrium (p < 0.05; Supporting Information Table S2). All loci showed low average estimates of null allele frequency (AC5-3#4 = 0.024, AG5-5#1 = 0.013, AG1 = 0.022, Ob02 = 0.059, Ob06 = 0.057, Ob08 = 0.071, Ob16 = 0.068). The presence of null alleles did not affect our data, as the F_ST automatically generated for all loci by the software FreeNA was similar before (0.179) and after (0.175) correction for null alleles.
FIGURE 2 Genetic clusters identified within Oenocarpus bataua (var. bataua and var. oligocarpus) using 644 samples. These were identified with a spatial Bayesian clustering analysis conducted in Geneland (Guillot et al., 2005) using posterior probabilities to belong to one of K = 2 clusters as identified in Structure (Pritchard et al., 2000) with the statistical analysis developed by Evanno et al. (2005). Each point represents a sampled locality, while the lines represent the probability of membership to a determined cluster.

| Individual-based analysis
Based on the ΔK statistic (Evanno et al., 2005), we determined the best K at K = 2 (Supporting Information Figure S1) with a clear separation between eastern (French Guiana, var. oligocarpus) and western (Chocó, Andean, and western Amazonian forests, var. bataua) populations. An exception to this pattern was the relatedness of the  was harbored within populations, as is expected for an allogamous and long-lived perennial species (Hamrick & Godt, 1990). Genetic variation between varieties was 13.65% and between populations within varieties was 13.12% (all p < 0.01**).
Within var. bataua, the Structure analysis identified a peak at K = 4 (Supporting Information Figure S1). The bar plot generated by Structure (Supporting Information Figure S3) and the Geneland analysis (Figure 3)
FIGURE 3 Four genetic clusters identified within Oenocarpus bataua var. bataua (AMO: Amotape-Huancabamba zone; CHO: Chocó rain forests; NWA: northwestern Amazonia rain forests + northwestern Bolivia; SWA: southwestern Amazonia rain forests) using 566 samples. These were identified with a spatial Bayesian clustering analysis conducted in Geneland (Guillot et al., 2005) using posterior probabilities to belong to one of K = 4 clusters as identified in Structure (Pritchard et al., 2000) with the statistical analysis developed by Evanno et al. (2005). Each point represents a sampled locality, while the lines represent the probability of membership to a determined cluster.

| Population-based analysis
The lowland Amazonian populations of Intuto and Jenaro Herrera, located near Iquitos in Peru within the NWA cluster, had the highest genetic diversity (Table 3) Figure S4), we repeated the procedures obviating these populations. Then, a significant correlation (p = 0.045**) was detected between A and the distance from each population to Intuto, whereas no significant correlation was found between A and altitude (p = 0.054).
The inbreeding coefficient tended to be low with few exceptions (

| Andean uplift as possible driver of divergence
The four genetic clusters identified in the Bayesian analysis within var. bataua (Figure 3) corresponded to major ecoregions recognized within northwestern South America (Dinerstein et al., 1995; Olson & Dinerstein, 2002; Weigend, 2002). We hypothesize that Andean uplift promoted three events of diversification that shaped the genetic structure of var. bataua into four clusters. Although we cannot prove this hypothesis with our data, we explore possible orogenic scenarios that could explain the observed divergence.
We suggest that a first diversification event in var. bataua occurred between the Chocó region and the Amazon basin. Cross-Andean divergence has also been reported for several rain forest trees in the Neotropics (Dick & Heuertz, 2008; Dick et al., 2003, 2007; Hardesty et al., 2010; Motamayor et al., 2008; Rymer et al., 2012).
Although this seems like a logical explanation, we cannot discard the possibility that the cross-Andean distribution of O. bataua may be due to long-distance dispersal processes after the Andes reached their current height just 2.7 Ma (Gregory-Wodzicki, 2000; Mora et al., 2010). In this sense, a dated phylogeny of O. bataua populations would help to elucidate whether this diversification event, and the other two detected, shares a time frame with the Andean orogenic
TABLE 2 Inbreeding coefficients (F_IS) and pairwise fixation values (F_ST) obtained from Geneland (Guillot et al., 2005) for the four genetic clusters identified in Structure (Pritchard et al., 2000) within Oenocarpus bataua var. bataua. Divergence estimates from a hypothetical ancestral population (F) were obtained from Structure (Pritchard et al., 2000)
(Kumar et al., 2016) using a mean matrix of allele frequency divergence among clusters (net nucleotide distance) that resulted from the analysis in Structure (Pritchard et al., 2000) that determined genetic clusters within Oenocarpus bataua var. bataua populations. The robustness of the neighbor-joining branches was evaluated using PHYLIP (Felsenstein, 2005)
however, the intraspecific diversification between both clusters has been maintained to the present. The locations of these two ancient basins currently present different climatic conditions, with northwestern Amazonia being more humid and less seasonal than southwestern Amazonia (Silman, 2007). Therefore, current climatic dynamics may be helping to maintain intraspecific genetic diversification in O. bataua, as palms are highly sensitive to climatic conditions. Variation in climate can influence flowering phenology among populations (Welt, Litt, & Franks, 2015), which may alter their gene flow patterns (Franks & Weis, 2009) and even promote reproductive isolation (Martin, Bouck, & Arnold, 2007).
It is worth mentioning that the location of the clusters NWA and SWA partially correlates with the location of two Pleistocene forest refuges (Napo and East Peruvian) proposed by Haffer (1969); however, as the theory of refugia was shown to be based on sampling artifacts (Nelson,
TABLE 3 Diversity values and inbreeding coefficients (F_IS) for the 18 Oenocarpus bataua var. bataua populations with n > 15. Allelic richness (A) was calculated using the rarefaction procedure implemented in FSTAT (Goudet, 2001), whereas expected (H_e) and observed (H_o) heterozygosity, and the inbreeding coefficient (F_IS) were obtained from Arlequin (Excoffier et al., 2005)
hypothesis that these clusters developed after geographical isolation and subsequent reproductive isolation. Therefore, the deter-
The apparently disjunct distribution of the NWA cluster may be a sampling artifact caused by lack of sampling in more eastern localities. It is possible that the two sections of this cluster are linked by unsampled areas in western Brazil. A similar pattern of disjunct distribution was reported for T. cacao (Motamayor et al., 2008), where Amazonian populations in northern Peru were genetically related to populations in southwestern Brazil.

| Is var. oligocarpus distributed beyond eastern Venezuela?
The Amazonia (Gibbs & Barron, 1993). Therefore, the divergence between var. bataua and var. oligocarpus was not related to the uplift of the Andes. The molecular differentiation found between these two varieties agreed with previous studies (Montúfar, 2007; Montúfar & Pintaud, 2008). Despite the strong differentiation between the two varieties, our data were insufficient to support the hypothesis of the botanists A. Grisebach and H. Wendland, who originally described J. oligocarpus (var. oligocarpus) in 1864 as a species distinct from O. bataua (Balick, 1986). The study of gene flow between the two varieties in sympatric zones (probably within the Brazilian and Venezuelan Amazon) would enhance our knowledge about the genetic patterns between them.
Furthermore, the implementation of studies that determine the presence or absence of reproductive isolation (floral morphology, phenology) between the two varieties would help to cast light on their biological divergence.
The high genetic diversity of O. bataua was also consistent with the high species diversity of the genus Oenocarpus in this region.
The locality of La Pedrera in southeastern Colombia (~350 km from Iquitos) harbors the highest Oenocarpus species diversity, with six described species (Bernal, Galeano, & Henderson, 1991; Galeano & Bernal, 2010). The high diversity harbored within northwestern Amazonia may be related to high resource availability expressed as annual rainfall and soil cation concentration (Antonelli & Sanmartín, 2011; Tuomisto et al., 2014). This region presents high rates of water availability and climate stability due to convective rain caused by the Andes even during glacial periods (Kristiansen et al., 2011; Pitman, 2000; Tuomisto et al., 2014). It also harbors more nutrient-rich soils than central and eastern Amazonia due to the deposition and accumulation of material eroded during Andean orogeny (Higgins et al., 2011; Hoorn, Wesselingh, Ter Steege, et al., 2010; Tuomisto et al., 2014). It is possible that diversity depends on resource availability; however, the specific mechanisms remain unknown.
Oenocarpus bataua maintained medium levels of genetic diver-

| Conclusions and future perspectives
We detected three events of genetic diversification within corrected the manuscript.

DATA ACCESSIBILITY
Microsatellite and geographic data are available at https://doi.org/10.5061/dryad.1r4p8.