The DNA history of a lonely oak: Quercus humboldtii phylogeography in the Colombian Andes

Abstract The climatic and geological changes that occurred during the Quaternary, particularly the fluctuations during the glacial and interglacial periods of the Pleistocene, shaped the population demography and geographic distribution of many species. These processes have been studied in several groups of organisms in the Northern Hemisphere, but their influence on the evolution of Neotropical montane species and ecosystems remains unclear. This study contributes to the understanding of the effect of climatic fluctuations during the late Pleistocene on the evolution of Andean mountain forests. First, we describe the nuclear and plastidic DNA patterns of genetic diversity, structure, historical demography, and landscape connectivity of Quercus humboldtii, which is a typical species in northern Andean montane forests. Then, these patterns were compared with the palynological and evolutionary hypotheses postulated for montane forests of the Colombian Andes under climatic fluctuation scenarios during the Quaternary. Our results indicated that populations of Q. humboldtii have high genetic diversity and a lack of genetic structure and that they have experienced a historical increase in connectivity from the last glacial maximum (LGM) to the present. Furthermore, our results showed a dramatic reduction in the effective population size followed by an expansion before the LGM, which is consistent with the results found by palynological studies, suggesting a change in dominance in Andean forests that may be related to ecological factors rather than climate change.

mountainous ranges of the region such as the Andes are apparently less studied.
Tree species play an important role among the variety of biological groups used as models for the study of biogeographic processes.
From the Nearctic to the Neotropics, genera such as Quercus have been intensively studied from macro-and microevolutionary perspectives, allowing us to understand not only how Nearctic lineages have diversified and radiated into the Neotropics (Hipp et al., 2014(Hipp et al., , 2018 but also how climatic and geological features have shaped species distributions (Peñaloza-Ramírez et al., 2020 and references therein). However, there are still few phylogeographic studies of trees distributed in the Neotropical montane forests.
The Andean region, particularly the northern Andes, is a global biodiversity hotspot which harbors some of the highest numbers of endemic plant and vertebrate species in the world (Myers et al., 2000). Despite considerable research effort in scientific areas such as paleoecology and palynology, the evolutionary processes that led to this diversity and its current distribution patterns are not completely understood (Hazzi et al., 2018). In the northern Andes, Quercus humboldtii Bonpl. (1805) is a characteristic element of montane ecosystems and the single representative of the genus.
Interestingly, numerous paleoecological studies of montane forest evolution during the Quaternary, including the role of Quercus in forest dynamics, have been conducted in this region (Hooghiemstra & van der Hammen, 2004 and references therein).
Recently, Hooghiemstra and Flantua (2019) reviewed the studies of northern Andean fossil pollen records for the upper and the lower montane forests (UMF and LMF, respectively), which are the biomes where Q. humboldtii is distributed (Hooghiemstra, 2006;Rangel-Ch & Avella, 2011). They describe that during the last glacial maximum (21 ka BP to ~14 ka BP), a period characterized as extremely cold and dry (van der Hammen, 1974), the upper forest limit reached an elevation of ~2,000 m (van der Hammen & Cleef, 1986), and the UMF was displaced to a lower elevation and compressed by ~400 m compared to today's altitudinal range (Figure 1a,b). However, because of the Andean topography, the available surface area for the UMF varies only slightly (2.4%) between the LGM and the present . The LMF was also displaced altitudinally to a range of less than half the vertical span of the current altitudinal distribution. But, in contrast with the UMF, this vertical compression (~700 m) resulted in a reduction of ~42% of the superficial area   (Figure 1c,d). Following ~14 ka BP, the global temperatures became less cold, and the upper forest limit migrated upslope to eventually reach its current position (3,200-3,400 m) (Hooghiemstra & van der Hammen, 2004) However, distribution patterns can be more complex, and some species have crossed these barriers (Cadena et al., 2007;Muñoz-Ortiz et al., 2014), suggesting long-distance dispersal or the transience of barriers (Sanín et al., 2017). Specifically, for Q. humboldtii, previous palynological and floristic studies have proposed migration paths that allowed this species to colonize different mountain ranges (van der Hammen, 2008;Rangel-Ch & Avella, 2011). Although these routes were proposed in the context of colonization, they can be seen as possible ways in which the oak forests remained connected over time; therefore, populations of Q. humboldtii are expected to have maintained gene flow.

F I G U R E 1
Graphical representation of the historical changes in altitudinal ranges for the upper montane forest (UMF; a and b) and lower montane forest (LMF; c and d) . Columns represent altitudinal ranges for the last glacial maximum (LGM; a and c) and present day (PD; b and d). Next to each map, in the middle columns, is shown three-dimensional solid of revolution of the variation in superficial area (x-and y-axis) along elevation (z-axis) for each of the main Colombian Andes cordilleras. Green color represents the geographic altitudinal range occupied by vegetation belts (dark green UMF; light green LMF) during each period. Superficial area was calculated for elevation bins of 100 m A few models have been recently proposed to describe the vegetation history within Neotropical mountainous regions. Each focuses on different factors that might have shaped recent evolutionary processes, such as the joint effect of precipitation and temperature (Ramírez-Barahona & Eguiarte, 2013), the interaction of climate, topography, and volcanism at a microevolutionary scale (Mastretta-Yanes et al., 2015), and the effect of the interaction of fluctuating climate and topography on the degree of historical connectivity from a macroevolutionary perspective (Flantua & Hooghiemstra, 2018;Hazzi et al., 2018). These models are hypotheses against which phylogeographic data can be compared to improve the knowledge of general evolutionary processes in these landscapes.
As far as we know, there are no phylogeographic studies that evaluate the effects of these range shifts on historical population sizes and connectivity for UMF and LMF tree species, which contrasts with the growing number of studies evaluating phylogenetics and phylogeography of species of the Andean paramo . Moreover, the abundance of palynological data makes the Andes an optimal scenario for testing historical demography hypotheses about the effect of historical climatic changes on the elements of the montane forest in the Neotropics. Thus, this study aims to evaluate the effects of the climatic fluctuations that have occurred since the last glacial maximum on the population history of Q. humboldtii, a typical element of the northern Andean Montane Forest. Specifically, we described the nuclear and plastidic patterns of genetic diversity and genetic structure. We analyzed the historical demography and landscape connectivity of this species. Finally, we compared our results with the palynological and evolutionary hypotheses postulated for montane forest dynamics in the Colombian Andes under climatic fluctuation scenarios during the Quaternary.

| Study species
Quercus humboldtii (Bonpl. 1805) is a large tree that produces midsized acorns (2-3 cm long) annually and has evergreen lanceolate leaves (Müller, 1942). It is distributed in northern South America F I G U R E 2 Quercus humboldtii distribution is represented by occurrence records (gray triangles) and sampling locations included in this study (yellow circles). In the background, elevation is displayed in a grayscale where light colors correspond to lowlands and darker colors to highlands. COc, Cordillera Occidental; CC, Cordillera Central; COr, Cordillera Oriental; CM, Macizo Colombiano; SD, Serranía del Darién; SSL, Serranía de San Lucas; NP, Nudo de los Pastos ( Figure 2), from the Colombian Darién Gap south to the Nariño region in southern Colombia through the Andean Mountains, in an altitudinal range from 774 to 3,200 m (Pulido et al., 2006;Rangel-Ch & Avella, 2011). Generally, oak forests in Colombia occur in areas with a mean annual temperature range of 10-13°C, a thermal limit at 24°C, and precipitation up to 2,800 mm (Avella et al., 2017).
Andean and sub-Andean oak forests are formed by multiple phytosociological associations. However, they are generally characterized by the dominance of tree strata, particularly Q. humboldtii (Pulido et al., 2006).
Polymerase chain reactions (PCRs) were performed using the QIAGEN multiplex PCR kit with a final volume of 5 μl containing 1X multiplex PCR master mix, 0.25 mM of each primer, 20 ng of DNA, and dH 2 O. Amplification was performed by multiplexing groups of cpSSRs and nSSRs as follows: (i) two primer groups for cpSSRs (the first containing cmcs3, cmcs4, cmcs5, and cmcs6 and the second containing cmcs2, cmcs7, cmcs10, cmcs12, and cmcs14) and (

| Molecular data analysis
Each size variant for the evaluated microsatellite loci for both cpSSR and nSSR was defined and manually checked using GeneMarker v2.4.6 (Hulce et al., 2011). Binning was performed using the TANDEM (Matschiner & Salzburger, 2009) and Flexibin (Amos et al., 2007) programs for the nSSR and the cpSSR database, respectively. Nuclear microsatellite loci were tested for the presence of null alleles. First, all individuals were grouped as a single population using Micro-Checker 2.2.3 (Van Oosterhout et al., 2004) and then for each sampling location separately using FreeNa (Chapuis & Estoup, 2007). Because the estimation of null allele frequency can be influenced by inbreeding and vice versa, the INEST 2.2 (Chybicki & Burczyk, 2009) program was used to estimate inbreeding coefficients and null allele frequency simultaneously. All the possible models were tested using 500,000 cycles, out of which 50,000 were used as burn-in and 1,000 were kept for calculation of the results. Deviance information criterion (DIC) values for all the models were compared to choose the best model. To evaluate the influence of null alleles on the estimation of the genetic structure, FreeNa (Chapuis & Estoup, 2007) was further used to calculate F ST with and without the ENA correction.

| Genetic diversity
For the purpose of describing the genetic variation within populations, haplotype richness, rarefied haplotype richness, genetic diversity with unordered alleles (h sensu Pons & Petit, 1996), and nonstandardized genetic diversity with ordered alleles (v sensu Pons & Petit, 1996) were calculated for the 22 sampling locations (226 individuals) using cpSSR loci and the programs SPAGeDi 1.1 (Hardy & Vekemans, 2002) and Arlequin 3.5.2.2 (Excoffier & Lischer, 2010).  (Excoffier & Lischer, 2010) and GenAlEx 6.5 (Peakall & Smouse, 2006, and the diveRsity (Keenan et al., 2013) package in R (R Core Team, 2019). The previously mentioned genetic diversity estimators were calculated at the population and regional levels (Cordillera Occidental, Cordillera Central and Cordillera Oriental; Figure 2). The differences in the number of genotyped individuals between the cpSSR and nSSR analyses were due to failure in the amplification of several microsatellite loci after different trials in a few specific individuals. Only those with a maximum of one locus of missing data were retained in the nSSR database.

| Genetic structure
Analyses of molecular variance (AMOVAs) were performed to describe the partitioning of the genetic variation between groups, between sampling locations within groups, and within sampling locations for both cpSSR and nSSR datasets. The group categories were defined as the regional levels described above, which correspond to the three main cordilleras of the Colombian Andes. The AMOVA was calculated considering both F ST (based on an infinite allele mutation model, IAM) and R ST (based on a stepwise mutation model, SMM) using 10,000 permutations in Arlequin 3.5.2.2 (Excoffier & Lischer, 2010).
The grouping of individuals based on their shared genetic information (i.e., genetic structure) was further evaluated considering the geographic location. For this purpose, the package tess3r (François et al., 2006) was implemented in R 3.6.1 (R Core Team, 2019) for the analysis of the nSSR dataset. This algorithm is based on the optimization of least squares and the factorization of a genetic distance matrix constrained by geography (Deng et al., 2011;François et al., 2006;Frichot et al., 2014). The algorithm searches for ancestry coefficients of the K genetic groups among individuals, based on the concept that individuals within a close geographic proximity are more likely to share ancestral genotypes (François et al., 2006).
The algorithm was run using the function's default parameters for a range of genetic groups (K) from 1 to 22, with 100 replicates for each K. In each run, 5% of the data was masked as missing data and was used to test the predicted results through a cross-entropy criterion (Frichot et al., 2014). The optimal value of K was determined through the lowest value of the cross-entropy criterion (RMSE) before it plateaued or increased again (Frichot & François, 2015).
To test for the presence of phylogeographic structure, in the case of cpSSRs, the observed and permuted values of N ST (N ST and pN ST , respectively) were compared. N ST is an F ST analog that takes into account the genetic distances between haplotypes. The pN ST value should approximate the value of G ST , which considers only the probability of identity by descent among individuals among populations. Therefore, significant differences between N ST and pN ST are indicative of phylogeographic structure, meaning that related haplotypes are located in the same populations (Pons & Petit, 1996).
In a similar way, phylogeographic structure can be tested by comparing R ST and permuted R ST (pR ST ) for nSSRs. R ST is a SMM-based measure of differentiation, which is calculated from differences in allele sizes. Permuted R ST is expected to approximate the value of F ST ; therefore, these values can be used to infer the relative importance of stepwise mutation versus drift in population differentiation (Hardy et al., 2003). All of the abovementioned values were calculated using SPAGeDi 1.5 (Hardy & Vekemans, 2002). In addition, a minimum spanning network was computed using Network 5.0.1.1 to infer cpSSR haplotype relationships with a maximum parsimony search (Polzin & Daneshmand, 2003). Finally, a total of 3 million datasets were drawn from the priors described in Table S2 to compare with the summary statistics of the observed data, although only a subset of the closest 1% of the datasets was used in the calculation of the posterior probabilities. Loci were grouped and defined by a stepwise mutation model (SMM) with a mutation rate ranging from 1 × 10 -3 to 1 × 10 -4 (Cavender-Bares et al., 2011;Dow & Ashley, 1998;Muir et al., 2004). All the summary statistics available in the program were considered when calculating the posterior probabilities. The scenario that best described the data was chosen based on the frequency of each scenario in the simulated datasets that most closely matched the observed data and a logistic regression approach (Beaumont et al., 2002;Cornuet et al., 2008).

| Historical demography
The parameter results related to time were translated into years considering a generation time of 100 years (Gugger et al., 2013).
Confidence in DIYABC scenario choice was evaluated in three ways. All the methods are based on simulating test datasets (pseudoobserved datasets) and estimating their respective posterior probabilities. The first method checks the posterior error rate. It selects the 500 closest datasets to the observed data and uses them as test datasets. Then, it calculates the proportion of times the wrong scenario had the highest posterior probability. The second method uses 1,000 test datasets randomly chosen from the whole prior space of the hypothesized best scenario (X) and calculates the proportion of times in which other scenarios have the highest posterior probability given that scenario X is true (type I error). Similarly, the third method calculates the proportion of times in which scenario X has the highest posterior probability given that a different scenario is true. This proportion is calculated for all the other scenarios (type II error).

| Connectivity analysis
For the purpose of evaluating the geographic connectivity dynamics of Q. humboldtii over time, the following general workflow was followed: (i) First, we used ecological niche modeling to build a cli-  Giorgetta et al., 2013). The climate layers that were used had spatial resolutions of 30 arc seconds (PD and MHol) and 2.5 arc minutes (LGM). To avoid redundancy among the climatic variables during the ENM, a subset of variables was defined using a pairwise correlation test between the variables using R 3.6.1 (R Core Team, 2019). When the values of the correlation were higher than 0.7, the more specific variable of each pair was discarded (e.g., maximum temperature of the warmest month vs. annual mean temperature).
Ecological niche modeling was performed using the maximum entropy algorithm executed in MaxEnt version 3.3.3a (Phillips et al., 2006). The MaxEnt model was implemented using true presence records along the Colombian Andes and five independent climatic variables that remained after the pairwise correlation tests (annual mean temperature, temperature seasonality, temperature annual range, annual precipitation, and precipitation seasonality).
Paleodistribution models of Q. humboldtii in the Colombian Andes during the MHol and LGM were obtained by projecting the present distribution model into six MHol and LGM scenarios (CCSM, MIROC, and MPI for each period). The models were evaluated using the average AUC (area under the curve) as an independent threshold after a 100-replication bootstrap procedure. A subset formed by 30% of the total records was used for training the AUC, and the remaining 70% of the data were used to test the AUC.
The connectivity among habitat patches during each period (PD, MHol, and LGM) was evaluated using the graph-based equivalent connectivity area (ECA) index (Saura et al., 2011). This metric postulates that connectivity occurs inside each habitat patch and among patches and therefore takes into account habitat availability (patch area) and the patch's topological position in a network (i.e., the connections each patch has with the rest of the patches) to obtain the size of a single patch with maximum connectivity that would reflect the same connectivity value as the observed landscape (Saura et al., 2011;Saura & Pascual-Hortal, 2007).
The calculation of ECA requires two inputs: (i) a collection of suitable habitat patches and (ii) the least-cost distances among them. To obtain the first input, a resistance layer was constructed using the average MaxEnt output model for each period and the present-day topographic ruggedness index (TRI), which is the sum of the changes in elevation among neighboring cells (Riley et al., 1999). The niche model for each period was rescaled using a maximum sensitivity test plus specificity logistic threshold to identify the cells with the lowest resistance value, and all the cell values below this threshold were recalculated using the function (100 − (threshold − "value") * 100/threshold) (Zhang et al., 2019).
The topographic ruggedness index was classified into 10 natural breaks using ArcGIS 10.3 (ESRI, 2014) to assign a habitat suitability value to each category.
These two layers were combined into a habitat suitability layer using the Gnarly Landscape Utilities: Resistance and Habitat calculator tool (McRae et al., 2013). Then, the resulting habitat layer was F I G U R E 3 (a) Scheme of three demographic scenarios tested with DIYABC representing the following hypothesis. Scenario 1: constant effective population size, scenario 2: progressive and constant population expansion, and scenario 3: a bottleneck event followed by a recent population expansion (see Table S2 for parameter prior values). Effective population sizes are represented by N, the first subscript represents order in time, and the second represents the scenario in which the parameter was used (i.e., N 11 corresponds to the most recent effective population in scenario 1). Time was represented as t with higher values corresponding to events further in the past. (b) Scaled representation of the results from the best scenario (3)  used to obtain the collection of suitable habitat patches with the Core Area Mapper toolkit in the Gnarly Landscape Utilities toolbox , with an assigned minimum habitat threshold per pixel of 0.5 and a moving window equal to the size of a single cell.
The least-cost distance among the core areas (input ii) was calculated using Graphab 2.4 (Foltête et al., 2012). A complete graph, which calculates all the connections among nodes, was calculated with a cost threshold of 100,000 and a probability of dispersion among patches of 0.5 when the dispersion distance was 1 km.

| RE SULTS
The test for Hardy-Weinberg equilibrium showed that all loci but one (-OM07) showed significant deviations from equilibrium in at least one population. However, deviations were not consistently observed in the same loci across populations (Table S3). The analysis in Micro-Checker suggested the presence of null alleles at all the loci except for quru-GAOC11 and -0E09. Four loci (-IF02, -2F05, IF07, and -OM07) showed a null allele frequency below 8%, and -OI01, -IC08, -OC19, and -OA01 showed null allele frequencies of 0.10, 0.11, 0.21, and 0.22, respectively. The estimations performed separately by sampling location using FreeNa showed, in general, low null allele frequencies (Table S4) Therefore, we concluded that the influence of null alleles on the estimation of genetic structure was low and decided to continue the analysis using all the analyzed loci. Finally, the linkage disequilibrium test detected significant linkage among every pair of loci in at least one sampling location. However, there was no consistent pattern across the sampling locations.

| Genetic diversity
For the 22 sampling locations evaluated, 52 different cpSSR haplotypes with 38 unique haplotypes (present in a single population) and three widely distributed haplotypes were identified (Figure 4).
The number of haplotypes per population ranged from two to eight (Table S6). At the regional level, 14 (8 unique) haplotypes were observed on Cordillera Oriental, 32 (21 unique) on Cordillera Central and 23 (11 unique) on Cordillera Occidental. The within-population genetic diversity (h) varied between 0.154 (sampling location nine; A mean allelic richness per population of 6.095 (0.047) and a mean effective allelic richness of 3.93 (0.0343) were found for the nuclear microsatellites (Table S6). The number of private alleles per population ranged from zero to five. Regionally, the mean allelic richness was 15.6 (36 private) for Cordillera Central, 10.6 (14 private) for Cordillera Oriental, and 12.2 (10 private) for Cordillera Occidental.

| Genetic structure
The AMOVA for the cpSSR haplotypes suggested that considering

| Phylogeographic structure
The genetic differentiation among the sampling locations for the

| Historical population demography
The ABC analyses showed that scenario three was the best model for explaining Q. humboldtii demographic changes, according to the direct estimates (1.0, 95% CI: 1.0-1.0) and logistic regression tests (1.0, 95% CI: 1.0-1.0) ( Table 1). The posterior error rate was 0.000 for both logistic and direct approaches. The type I error rate, meaning the proportion of times scenario three was rejected even though it was true, was 0.099. The overall type II error rate, meaning the proportion of times scenario three had the highest posterior probability even though it was false, was 0.008 and 0.0135 for direct and logistic approaches (Table 1). Scenario three tested the hypothesis of a bottleneck event followed by a recent demographic expansion. The average posterior distribution suggested that the species suffered the bottleneck event 47,500 years BP (q05: 24,000 years; q95: 67,700 years) and a subsequent demographic expansion approximately 28,400 years ago (q05: 9,690 years; q95: 46,100 years) (Table S7; Figure 3b).

| Connectivity
Our estimations of habitat connectivity among the three time periods (LGM, MHol, and PD) showed an increase in the ECA from F I G U R E 4 Distribution of 52 haplotypes identified from cpSSR in sampling locations (pie charts) and haplotype network inferred using neighbor joining in Q. humboldtii. Gray color represents haplotypes that occur in a single sampling location. Single mutational steps are indicated by a gray bar perpendicular to the network linkages, when the number of mutational is higher than one it is indicated by a gray number. Elevation of the Colombian Andes is represented in a grayscale from lowlands (white) to highlands (black) the past to the present ( Figure 6). There were varying results depending on the model used for the calculations. The results based on the CCSM and MIROC models were similar, in contrast with the results based on the MPI model. During the LGM, the lowest ECA value was obtained with the MPI model (2,300 km 2 ), followed by the CCSM (3,600 km 2 ) and MIROC (4,200 km 2 ) models. During MHol, the CCSM and the MIROC models resulted in a slight decrease in connectivity (3,506 km 2 and 4,020 km 2 , respectively), while the MPI model predicted a substantial increase (5,320 km 2 ), and the ECA for Note: Logistic regression posterior probability and 95% confidence intervals (CI) of three demographic scenarios. In the last two columns, it is shown the probability of scenarios one, two, or three of having the highest posterior probability given that scenario three was true (p(Scenario X| Scenario 3)); and the probability of scenario three of having the highest posterior probability even though it was false (p(Scenario 3| Scenario X)). In the inferior part of the table, type I and type II error rates are estimated for scenario 3.

| D ISCUSS I ON
In recent years, complex phylogeographic patterns have been Unfortunately, few studies have aimed to describe the effects of the interplay between topography and historical climate changes on the biological communities in most of the montane ecosystems of the northern Andes, such as montane and submontane forests.
Therefore, our study was directed at increasing the understanding of the population history of trees species from this area. To do so, we developed analyses of genetic diversity, phylogeographic structure, and historical demography and compared the results to the topographical and paleoclimatic history of the region for the widely distributed Q. humboldtii, a lonely oak species that has reached and successfully colonized the Colombian Andes.

| Quercus humboldtii genetic diversity
The nuclear genetic diversity of Q. humboldtii in the Colombian Andes fell within the range of values observed in other oak species (Appendix S2 for methods and Table S8). Surprisingly, our results differed from previous genetic studies on Q. humboldtii (Fernández-M & Sork, 2005, although the comparison should be taken with caution since there is a large difference in the number of loci and the spatial scale with respect to our study. In contrast with the findings of the nuclear diversity, Q. humboldtii's chloroplast genetic diversity was higher than the values found in the oak species of Europe and North America, but was variable with regard to the values found in oak species in Mexico, Central America, and Asia (Table S9).
It has been previously proposed that the high genetic diversity that characterizes several oak species can be explained by intrinsic life-history traits that allow large effective populations to be maintained over time . We observed high nuclear genetic diversity, suggesting that Q. humboldtii was able to sustain sufficiently large populations and accumulate genetic diversity through the climatic oscillations of the Pleistocene. Furthermore, it suggests that a relatively more stable climate may have allowed the accumulation of genetic variation in contrast with the more difficult conditions that temperate species had to endure during glacial periods.

| Quercus humboldtii genetic structure
Genetic structure analyses (AMOVA and tess3r) indicated a low ge-

| Quercus humboldtii responses to climatic fluctuations during the Pleistocene
There are three indications that Q. humboldtii went through a bottleneck in the recent past: the higher expected than observed heterozygosity of the nuclear microsatellites, the star-shaped haplotype network observed in the chloroplast data, and the ABC demographic Quercus gradually decreased (down to 20%) (Hooghiemstra, 2006).
At the highest peak in Polylepis representation, the upper forest limit occurred at 2,600-2,800 m, which indicates the prevalence of cold climatic conditions (van der Hammen & Cleef, 1986;Hooghiemstra, 2006). There is no clear explanation for this change in composition, although it is hypothesized that it was not driven by temperature changes (Hooghiemstra, 2006). Therefore, it is possible that both the observed fluctuation in the pollen record abundance and the demographic changes could have resulted not only from climatic fluctuations but also from competition between Quercus and other dominant genera in high-altitude forest belts.

| Quercus humboldtii historical connectivity
Our results suggest an increase in connectivity (ECA) and an altitudinal displacement of the highly suitable areas for Q. humboldtii from the past to the present (see the results and description of the ENM results in Figure S1and Appendix S3). A similar pattern was found in the ECA of the available surface area for the altitudinal ranges suggested by paleoecological evidence for Andean (UMF) and sub-Andean (LMF) forests during the LGM and the present (Flantua & Hooghiemstra, 2017; It has been previously proposed that distinct phylogeographic patterns might be shown by species at different elevations (Kropf et al., 2003). Therefore, to further test whether the results shown here are a common pattern among mid-elevation species, we suggest that genetic data should be tested against geographically explicit predictions, that is, by calculating the amount of available surface area and the connectivity among the altitudinal bands encompassed by a species' range.

| CON CLUS IONS
Our study on the phylogeography of Q. humboldtii contributes to the understanding of the effect of climatic fluctuations in the late Pleistocene on the evolution of species found in Andean mountain forests. We found a high population genetic diversity and lack of genetic structure, which were explained by the increase in historical connectivity, supporting the prediction of Elsen and Tingley (2015) about the effect of different topographical contexts for shifts in altitudinal ranges. Furthermore, there is evidence of a recent demographic expansion that predates the LGM, which may reflect a change in the dominance among floristic elements of the Andean forests, as has been reported in paleoecological studies for the region (Hooghiemstra, 2006).

CO N FLI C T O F I NTE R E S T
The authors have no competing interests to declare.

DATA AVA I L A B I L I T Y S TAT E M E N T
All genetic and geographic data are available in Dryad (https://doi. org/10.5061/dryad.08kpr r528).