Genetic diversity of Cedrela fissilis (Meliaceae) in the Brazilian Atlantic Forest reveals a complex phylogeographic history driven by Quaternary climatic fluctuations

Quaternary climatic fluctuations have shaped the geographic distribution of lineages, potentially affecting the demography, genetic structure, and patterns of genetic diversity of extant species. Different phylogeographic scenarios have been proposed for plants in neotropical cloud forests during the Last Glacial Maximum based on paleoecological data: the dry refugia hypothesis (DRH) and the moist forest hypothesis. We specifically focus on the Brazilian Atlantic Forest (BAF) range of Cedrela fissilis (Meliaceae), sampling 410 specimens from 50 localities. Our study combines analyses of the genetic diversity, phylogeographic patterns, and past geographic distributions with a particular focus on highland populations. We identified 283 alleles across the 11 microsatellite loci, ranging from 18 to 33 alleles per locus, distributed across five genetic groups. Most populations of C. fissilis from the BAF exhibited a diffuse genetic structure, reflected in low pairwise FST values, which could be the consequence of high gene flow. In addition, the plastid data showed a connection between the western, southern, and eastern populations in the North‐East of Brazil, but no association between genetic data and elevation was observed. Habitat suitability projections over the past 140 000 years showed less fragmentation relative to the present, indicating a higher connectivity and gene flow. Our results provide support for both the moist forest as well as the DRH, suggesting that most likely, a mixture of these processes has acted through space and time.


Introduction
Mountain regions around the world are often exceptionally rich in biodiversity and endemism (Kier et al., 2009;Muellner-Riehl, 2019). In South America, the most important orographic features of the Atlantic coast are the mountains of the Serra do Mar and the Serra da Mantiqueira (Almeida & Carneiro, 1998), which have a strong influence on the climatic conditions in the South and Southeast of Brazil. Previous studies reported floristic similarities between these highlands (Bertoncello et al., 2011;Meireles & Shepherd, 2015), high tree species richness (Pompeu et al., 2014), and a high proportion of endemic species (Meireles et al., 2014). The mountains harbor part of the Brazilian Atlantic Forest (BAF) (Oliveira-Filho & Fontes, 2000), one of the 36 currently recognized global biodiversity hotspots with an exceptionally high number of endemic species (about 8000 are vascular plants species and almost 600 are vertebrate species, Myers et al., 2000) and a high level of habitat loss (Myers et al., 2000;Conservation International, 2018). Today, only about 11.4%-16% of the natural vegetation remains (Tabarelli et al., 2005;Ribeiro et al., 2009).
Climatic conditions shape the geographic distribution of species at medium to large spatial scales through time (e.g., latitudinal gradient of species richness, Mannion et al., 2014), with climatic changes potentially affecting their genetic diversity (Stewart et al., 2010;Homburg et al., 2013;Ramírez-Barahona & This is an open access article under the terms of the Creative Commons Attribution NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.  Muellner-Riehl, 2019). Over time, climatic fluctuations can lead to a decline in genetic diversity and ultimately species extinction, or to divergence within species and, potentially, speciation (Hewitt, 2000). The Last Glacial Maximum (LGM, about 28 000 to 24 000 years ago; Hughes & Gibbard, 2015) was one of the largest manifestations of natural climate fluctuations that remains relatively well preserved in the geological record (Mix et al., 2001;Hughes & Gibbard, 2015). During the LGM, global climates were dramatically different from earlier times and from the climate of today (Mix et al., 2001), probably best depicted by the large extent of terrestrial glaciers. Although not representing a typical scenario during the Quaternary climate fluctuations, the LGM may likely have had a major impact on the geographic distribution and/or genetic diversity of species (Silva et al., 2012;Ramírez-Barahona & Eguiarte, 2013).
In South America, early studies suggested that cold and dry conditions in the Amazon region during the LGM would have led to forest fragments that served as refugia isolated by open nonforest vegetation (Haffer, 1969;Hammen & Hooghiemstra, 2000). In the south of the BAF, palynological data show that grasslands dominated the landscape during the LGM where forest ecosystems exist today, indicating a colder and drier climate (Behling, 2002). In addition, Carnaval & Moritz (2008) used palaeoclimatic models and predicted historical forest refugia within the BAF. They found that climatically more stable areas matched the current centers of endemism in five vertebrate species. Furthermore, studies on frog (Carnaval et al., 2009) and bird species (Amaral et al., 2013) suggested a dry refuge area in the BAF that could explain the high presentday biodiversity of this area. Contrary to these studies, some pollen data did not indicate a replacement of forest by grassland vegetation in Amazonia, suggesting that forest cover may have remained stable and not fragmented (Colinvaux et al., 2000;Rull, 2008). In addition, Leite et al. (2016) modeled an expansion of suitable climatic conditions onto the emerged continental shelf during the LGM for five species of small mammals, and suggested that forest refugia played only a minor role, if any, in this region during glacial periods. Also, a study on toads endemic to the BAF suggested the persistence of forested habitats in the south (Thomé et al., 2010) instead of a colonization from northern refugia after the LGM (Carnaval & Moritz, 2008).
Ramírez-Barahona & Eguiarte (2013) reviewed paleoecological records from the Neotropics and proposed two phylogeographic scenarios for cloud forests during the LGM: the dry refugia hypothesis (DRH) and the moist forest hypothesis (MFH). Under the DRH, geographic ranges would have contracted and become fragmented during the cooling phases, resulting in a loss of genetic diversity. Subsequent recolonization of uninhabited areas from small effective population sizes would cause a marked genetic structure. In addition, populations would show genetic differentiation, if they stemmed from different refugial lineages (Ramírez-Barahona & Eguiarte, 2013). In contrast, under the MFH, populations would move downslope to mid or low elevations during cooling phases, while increasing their connectivity. Afterward, during a warming phase, they would move upslope again and experience range fragmentation and little or no demographic expansion. The main consequence of this would be the increase in genetic diversity resulting from spatial heterogeneity and diffuse genetic structuring of populations (see Ramírez-Barahona & Eguiarte (2013) for more details). It should be noted, however, that these scenarios are based on the assumption that the mountain area size generally decreases with elevation (see their figure 2), but because of the heterogeneous topography of mountains, this is not always the case (Elsen & Tingley, 2015). More specifically, Elsen & Tingley (2015; their figure 1) showed that some mountains will have larger areas available at higher elevations, allowing for the possibility of population expansion (and connection) while moving upslope. Given that the mountains of both the Serra do Mar and the Serra da Mantiqueira are hourglass-shaped (sensu hypsographic classification of Elsen & Tingley, 2015), one might expect no general clear-cut genetic pattern, as described by the MFH across different elevational zones in this region. In addition, given the uncertainty concerning the increased aridity during times of lower temperature, it remains unsubstantiated whether cloud forests were contracted into forest refugia during the LGM or not (Ramírez-Barahona & Eguiarte, 2013). In the BAF's mountains, paleoecological and phylogeographic studies have shown agreement with the DRH (Behling, 2002), while others have not accepted the DRH (Thomé et al., 2010;Amaro et al., 2012).
Here, we evaluate the DRH and MFH hypotheses for the Brazilian Atlantic cloud forest using the neotropical tree species Cedrela fissilis Vell. (Meliaceae). Commonly called "cedro branco" or "cedro rosa" (Carvalho, 1994), C. fissilis is a deciduous tree that belongs to a genus with 19 hardwood species Koecke et al., 2015;Huamán-Mera et al., unpublished data) and is widely distributed in seasonal forests and adjacent ecotones as well as gallery forests (Pennington et al., 1981;Carvalho, 1994;. Garcia et al. (2011) explored phylogenetic relationships and the phylogeography of C. fissilis in Brazil and the eastern part of Bolivia. They found that the species consists of two distinct genealogical lineages: one centered in the Chiquitano range and the other one in the Atlantic range, separated by the Cerrado (Brazillian savanna). These patterns were confirmed using 10 microsatellite markers by Mangaravite et al. (2016). In the Atlantic range, populations occur from 20 m to more than 1600 m above sea level (Mangaravite et al., 2016). We specifically focus on the Atlantic range of C. fissilis with a substantially extended sampling and combine analyses of the genetic diversity, phylogeographic patterns, and past geographic distributions to evaluate the DRH and MFH hypotheses.

Sampling strategy
We evaluated 410 specimens from 50 localities of Cedrela fissilis throughout the Atlantic range (Fig. 1). For the population genetic analysis, we genotyped 344 specimens from 21 populations, while we investigated 148 specimens from all 50 localities for the phylogeographic analysis, comprising both new collections and published data (Garcia et al., 2011;Mangaravite et al., 2016;Huamán-Mera et al., unpublished data). The population codes, localities, coordinates, number of specimens, and the references are shown in Table 1 and Table S1. The populations MDP, PEU, VNI, CAP, PSB, PSO, and PNI occur in elevations higher than 600 m a.s.l., hereafter referred to as highland populations (Fig. S1).

Species distribution modeling
Georeferenced occurrence localities of C. fissilis were compiled from field records and published data (Garcia et al., 2011;Mangaravite et al., 2016;Huamán-Mera et al., unpublished data). In total, we used 50 occurrences to model habitat suitability. To alleviate the impact of spatial autocorrelation, we only included records with a minimum pairwise distance of 10 km. The models were built using the maximum entropy algorithm as implemented in MAXENT (Phillips et al., 2006). In particular, we used a 10-fold cross validation approach with 80% of the occurrences for training and 20% for testing. To assess model performance, we used the area under the receiver operating characteristic (ROC) curve (AUC; Hanley & McNeil, 1982). The AUC determines the model's predictive ability compared to a random prediction. In addition, models were validated by assessing the sensitivity, specificity, and accuracy values averaged across the replicates. To estimate habitat suitability, we used bioclimatic variables from WorldClim (Hijmans et al., 2005) at 2.5 arcmin resolution as explanatory variables. From the 19 bioclimatic variables, we selected those with a Pearson correlation of less than 0.71 (Table S2) and as well as a high contribution and permutation importance in the models (Tables S3 and S4). We retained a subset of seven predictors: mean diurnal range (bio02), temperature seasonality (bio04), max temperature of warmest month (bio05), mean temperature of wettest quarter (bio08), precipitation of wettest quarter (bio16), precipitation of warmest quarter (bio18), and precipitation of coldest quarter (bio19).
For modeling, we used the group discrimination technique, which requires both presence and absence data (background) and has been shown to be more reliable than the profile technique (Mateo et al., 2010), which requires presence-only data. Mateo et al. (2010) showed that "target-group absences," instead of the use of random pseudo-absences, is more appropriate. Thus, instead of random, we calculated target-group absences using the three-step (TS) selection procedure as described in Iturbide et al. (2015). Also, we generated twice the number of presences as pseudo-absences for improving model reliability. The first step defines the environmentally unsuitable areas using a presence-only profiling algorithm; the second calculates different maximum distance thresholds to each presence location; and the last step selects the optimum background extent and the corresponding fitted model from all possible pseudo-absence configurations generated in step 2 (for more details see Iturbide et al., 2015). In addition, we included a 10 km of buffer around the collection localities to  (Olson et al., 2001;refer to Table S1 for population codes). avoid grid cells with both presence and pseudo-absence data (Mateo et al., 2010;Iturbide et al., 2015).
The present-day model predictions were projected onto three past climatic layers: the Middle Holocene (MidHol) period about 6000 years ago, the LGM period about 28 000-24 000 years ago, and the last interglacial (LIG) period 120 000 to 140 000 years ago. For the MidHol and LGM, we created an average of three circulation models: CCSM4 (Gent & Danabasoglu, 2011), MIROC-ESM (Watanabe et al., 2011), and the MPI-ESM-P (Earth System Model of Max Planck Institute for Meteorology, Hamburg, Germany; Giorgetta et al., 2013). For the LIG, we employed the data from Otto-Bliesner et al. (2006). We calculated range fragmentation of C. fissilis for all four predictions in R (v.3.3.1, R Core Team 2016), using the package "SDMTools" (VanDerWal et al., 2014). First, we converted the habitat suitability scores (ranging from 0 to 1) into binary data according to different cutoffs (0.4, 0.5, 0.6, 0.7, and 0.8). At each level, we then measured the ratio between the patch perimeter (m) and the area (m 2 ) using the "PatchStat" function.
2.3 DNA extraction, microsatellite genotyping, and gene sequencing Leaf samples were dried using silica gel and kept at room temperature until the DNA was extracted. Total genomic DNA was extracted according to Cota-Sánchez et al. (2006) with modifications (Riahi et al., 2010). Genotyping was carried out using 11 microsatellite loci: 8 markers (Ced2, Ced18, Ced41, Ced44, Ced54, Ced65, Ced95, and Ced1; Table 2) developed by Hernández et al. (2008) for Cedrela odorata, and three markers (CF26 and CF66) developed by Gandara et al. (2014) for C. fissilis (Table 2). Primer pair CF66 amplified two loci-CF66A and CF66B, respectively (Gandara et al., 2014); distinct ranges of allele sizes (113-167 bp for CF66A; 199-253 bp for CF66B) allowed us to clearly differentiate the genotypes of each of the two loci ( Table 2). The polymerase chain reaction (PCR) was performed in two multiplexes-when we amplified more than one marker in the same PCR, M1 (Ced54-Ced41) and M2 (Ced2-Ced65), then two other primers were amplified separately (Ced95 and Ced131) and combined with the M1 and to M2, respectively, to be genotyped together to reduce costs. Also, two primers were amplified separately, but combined later for The trnT-trnL spacer from the chloroplast genome was amplified using standard PCR protocols (Muellner et al., 2009;Oliveira et al., 2010). The spacer was amplified in two independent reactions. The first reaction was performed using the primers "a" and "b", which amplify the intergenic spacer between trnT (UGU) and the 5′ exon of trnL (UAA) (Taberlet et al., 1991). The second reaction was performed with primers "c" and "d" and amplified the intron of trnL (UAA) (Taberlet et al., 1991). The PCRs were carried out with 25 µL of final volume, with about 100 ng of DNA, 1× of the special buffer IVB (Phoneutria Biotecnologia, Belo Horizonte, Brazil), 0.5 μmol/L of each primer (forward and reverse), 2 mmol/L of MgCl 2 , 17.5 μg/mL of BSA (Invitrogen, São Paulo, Brazil), 0.2 mmol/L of dNTPs, and 1.25 U of GoTaq DNA Polymerase (Promega, Belo Horizonte, Brazil). Amplifications were carried out on a Vapo Protect Eppendorf Thermal Cycler (Eppendorf, São Paulo, Brazil) following the amplification program with a preheating step of 94°C for 2 min, 36 cycles of 94°C for 1 min, 58°C for 1 min, and 72°C for 1 min; in the end, the extension step was at 72°C for 5 min. All the PCR products were cleaned using 0.75 U of both enzyme EXO (Exonuclease I, Affimetrix, São Paulo, Brazil) and SAP (Shrimp Alkaline Phosphatase, Affimetrix, São Paulo, Brazil) in the proportion of 15 μL of PCR to 3 μL of the enzymes (diluted in 0.05 mol/L Tris, pH 8.0). Each reaction was incubated at 37°C for 45 min to degrade the remaining primers and nucleotides, followed by one step at 80°C for 20 min to inactivate the ExoSAP. Sequencing was performed by Macrogen Inc. (South Korea), using the same primers as in the PCR amplifications. Sequences were edited and aligned using the program Sequencher 4.8 (Gene Codes Corporation, http://www.genecodes.com), using the default settings. In addition, the ReAligner (Anson & Myers, 1997) option was used to optimize gap placement. Gaps longer than three sites in the alignments were coded as insertions/deletions (indel), and different length indels were coded as different evolutionary events, following previous methodologies (e.g., Garcia et al., 2011;Zelener et al., 2016). We did not use indels that were bordered by mononucleotide repeats (e.g., polyA) as a source of information in subsequent statistical analyses to avoid effects that can arise from experimental errors or evolutionary lability associated with this type of indel (Mast et al., 2001).

Data analyses
For the microsatellite data, the populations with small sample sizes (n < 9; populations MDP, CON, SOO, PRD, FR2, CAM, and SMA) were included only in analyses that did not require an a priori identification of populations (as in Structure-see below). Initially, we estimated the null allele frequencies for each marker with the program Micro-Checker, version 2.2.3 (Van Oosterhout et al., 2004). Further, we used the software FSTAT, version 2.9.3.2 (Goudet, 2002) to calculate the statistical significance of deviation from the Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium between loci in all populations. In order to describe the populations, we ran the GDA software (Lewis & Zaykin, 2002) and presented, for each population, the number of specimens (n), number of alleles per locus (A/locus), number of private alleles (A PRIV ), number of private alleles per specimens (A PRIV /n), expected heterozygosity (H E ), observed heterozygosity (H O ), and inbreeding coefficient (F IS ). Concerning the diversity among populations, we estimated two indexes: F ST (Weir & Cockerham, 1984) and R ST (Slatkin, 1995;Goodman, 1997). The FSTAT software (Goudet, 2002) estimated both F ST and R ST over all samples. To test the significance of each estimate, we used 1000 permutations, with 99% confidence intervals. The F ST index assumes the infinite alleles model, while the R ST is based on allele size (stepwise mutation model, SMM). Estimates of R ST may be useful when mutations contribute substantially to allelic differences among populations, resulting in greater insights into the patterns of relationships among populations than provided by F ST (Holsinger & Weir, 2009). We also computed the pairwise F ST of populations using ARLEQUIN (Excoffier et al., 2005). The P-value of the test is the proportion of permutations leading to an F ST value larger or equal to the observed one (Manual of ARLEQUIN), we used 1000 permutations, with 95% confidence intervals. In Arlequin, we selected "Slantkin's distance" and instead of "Number of different alleles (F ST )" we activated the option "Sum of squared size difference (R ST )," which allowed us to use more information for each locus. In order to test whether gene flow or ancestral polymorphisms is the most important evolutionary force, we performed a Mantel test (Mantel, 1967) using the R package "ade4" (Jombart, 2008). If gene flow is an important evolutionary force, then the genetic differentiation of close population pairs should be lower than for those population pairs at larger geographic distances (Muir & Schloetterer, 2005). Thus, the R ST would be correlated to the geographic distance between localities.
For the next genetic structure analysis, we included all 21 populations. The Bayesian clustering approach of software Structure version 2.3.4 (Pritchard et al., 2000;Hubisz et al., 2009) inferred the number of genetic groups of C. fissilis using the Monte Carlo Markov Chain (MCMC) approach. We set runs with a burn-in period of 250 000 steps followed by 750 000 steps, with 20 independent replications. The structure assumes a model in which there are K genetic groups, each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to one or more genetic groups, if their genotypes indicate that they are an admixture (Pritchard et al., 2010). As Evanno et al. (2005) suggested, we set the K from 1 to 24, which is from 1 to the number of sampling sites (21) plus 3. To find the best K, we used the ΔK method (Evanno et al., 2005), as implemented in Structure Harvester (Earl & VonHoldt, 2011). The software CLUMPP (Jakobsson & Rosenberg, 2007) permuted the clusters output by independent runs of STRUCTURE and matched up the data of the 20 iterations into one: the best K. Later, we graphically displayed the results using the software DISTRUCT (Rosenberg, 2004). For each population, we summed the membership coefficients for all samples to obtain a diagram depicting the relative contribution of each lineage to the contemporary gene pool of that population.
In order to assess the diversity of the sequences obtained, the program DnaSP 5:10 (Librado & Rozas, 2009) determined the haplotypes in the data set. Gene genealogies based on the coalescent theory were inferred using a Median Joining haplotype network (Bandelt et al., 1999) as implemented in NETWORK 4.5.1.6 (www.fluxus-technology.com). Changes in the effective population size through time were inferred using a Bayesian Skyline approach (Ho & Shapiro, 2011) as implemented in BEAST v.2.4.4 (Drummond et al., 2012) using the cpDNA data. The analysis was run for 100 generations (to ensure sufficient sampling [ESS > 200]) using the HKY nucleotide substitution model and a strict molecular clock (5E-9 s/s/y). The first 10% of samples were discarded as burn-in prior to the Skyline reconstruction in Tracer (v.1.7).

Habitat suitability modeling
Climate-based distribution models of Cedrela fissilis, built with current conditions, exhibited the best distance threshold of 860 km (AUC = 0.808). The models indicated the presence of fragmented highly suitable areas in the south of Brazil, as well as throughout the Atlantic coast. Habitat suitability varied substantially over the past 140 000 years (Fig. 2). High suitability during the LIG was predicted in a large area in southern Brazil ( Fig. 2A, area 1) and in two smaller areas in the Paraná ( Fig. 2A, area 2) and Amazon basin ( Fig. 2A, area 3). Lower suitability was found in the Northeast during the LIG (Fig. 2A). During the LGM, the highest suitability shifted north and westward relative to the LIG projection (Fig. 2B). Also, high suitability on the then-exposed continental shelf was observed, especially at the Abrolhos Bank and the coast of the northeast (Fig. 2B). At the MidHol, similar to the present (Figs. 2C, 2D, respectively), high suitability regions were scattered south and eastward in Brazil, and also a small area in the north-east. These four time periods exhibited increased fragmentation through time toward the present day. For every cutoff (0.4, 0.5, 0.6, 0.7, and 0.8), the habitat fragmentation was higher in the present, followed by MidHol, LGM, and the LIG (Fig. 3).

Genetic diversity and structure
We identified 283 alleles across the 11 microsatellite loci; the number of alleles per locus ranged from 18 (Ced54) to 33 (Ced44). None of the 11 loci exhibited high null allele frequency (Table 2). Also, the 11 loci were at the Hardy-Weinberg equilibrium and linkage equilibrium. Fourteen populations (with population size ≥9) exhibited 61 private alleles (A priv ), ranging from 0 (SER) to 9 (PSO). When we divided A priv by the number of samples, we picked up the A priv regardless of the different population sizes; the highest A priv /n occurred in the populations UBE (A priv /n = 0.37) and PSO (A priv /n = 0.36 The genetic structure of the 21 populations (also including populations n < 9) was placed along the BAF (Fig. 4A) and assigned to five genetic groups (K = 5, Figs. 4B, 4C). Populations with a majority of genetic group 1 (PSO, PNI, CAM, and BLU) were mainly distributed along the coast of the southeast ("Serra do Mar" in dark blue, Fig. 4A) with one exception (CON), which can be found further north in close proximity to populations from other genetic groups (SOO and PRD). Populations CAM, BLU, and CON belong to low elevations, while the populations PSO and PNI are from highlands. Populations for which the majority genetic group is group 2 ("Central-West" in light blue, Fig. 4A) are widely spread in central Brazil, occurring both in lowlands (ITA, FR2, UBE, DIA, and PRD) and highlands (PSB and PEU). Three populations from the lowland (PLO, TBA, and SMA) exhibited major assignment to genetic group 3 ("South" in yellow, Fig. 4A). The highland population PNI was also associated with group 3. Genetic group 4 ("Serra do Caparaó" in green, Fig. 4A) occurred especially in highland populations, with a high proportion in CAP and VNI, and also in the populations PRD and MDP. The latter population was the one with the highest level of admixture. Finally, the genetic group 5, ("Northeast" in turquoise, Fig. 4A), occurred in three lowland populations (JAN, SER, and SOO).

Phylogeographic patterns
The alignment of 1370 bp revealed the presence of 20 polymorphic sites, and indels or substitutions in 148 individuals. We observed 19 substitutions, 5 transitions, and 15 transversions. The remaining polymorphic sites were represented by an indel of seven base pairs. The analysis of 148 specimens of Cedrela fissilis revealed 16 haplotypes (Fig. S2).  Every haplotype was placed on a map (Fig. 5A), and the seven genealogical lineages (hereafter haplogroups) were depicted in a haplotype network (Fig. 5B). The central haplogroup (H1, in blue) was the most frequent. This haplogroup was represented by 66 specimens from 22 populations widely scattered in the Atlantic range. Haplogroups H4 (yellow) and H2, H3, and H6 (orange) exhibited an overlapping distribution in the south-west of Brazil, with the exception of population SOO, which occurred at the east coast. Haplogroups H7 and H14 (in lilac) and H5 (red) showed geographic overlap in the north-east of Brazil. The red haplogroup was connected to the central one (blue) by haplotypes from population VNI (H16). The lilac haplogroup were connected to the central one by haplotypes from population PSB (H13). The green haplogroup (H8) was the farthest from the central haplogroup, separated by six mutational steps. This haplogroup showed evidence for a connection between the western and the eastern populations through the north-east. Five populations were very close geographically (in the western area), although each one belonged to a different haplogroup (CEC, FR2, OUR, CED, and SRP). Similar to the genetic structure of the populations, the haplogroups were not associated with elevation. The population demographic analysis indicated little changes of the effective population size (Bayesian Skyline plot, Fig. S3). This result corroborates findings of Mangaravite et al. (2016), who tested for heterozygosity excess in microsatellite data of C. fisslis and found no significant heterozygosity excess (which would be indicative of a genetic bottleneck) in any of the populations.

Discussion
In this study, we used species distribution modeling and two molecular markers (microsatellites and cpDNA) to assess the late-Quaternary evolutionary dynamics of the tree species Cedrela fissilis (Meliaceae) in the BAF and beyond. Our results provide support for both the moist forest as well as the DRH, suggesting that most likely a mixture of these processes have acted through space and time.

Moist forest hypothesis and connectivity during the LGM
The molecular data were largely congruent (between microsatellites and cpDNA), and minor differences may be A B C Fig. 4. Cluster analysis and geographic distribution of the five genetic groups of Cedrela fissilis. A, Geographic distribution and lineage assignments of all populations. Pie charts represent the mean of each membership coefficient for all samples from that population. B, The ΔK statistic, based on the second order rate of change of the likelihood function, identified K = 5 as the best-fit number of genetic groups. C, Plot of the cluster analysis in STRUCTURE. Each horizontal bar represents an individual, colors indicate a membership coefficient that represents the fraction of the genome that has ancestry in the respective genetic group (refer to Table S1 for population codes).
due to higher mutation rates and high levels of polymorphism in the microsatellites (Schlötterer, 2000;Zane et al., 2002). Most of the populations of C. fissilis from the BAF exhibited a diffuse genetic structure, which could be the consequence of a range expansion and higher connectivity in the past, as suggested by the MFH (Ramírez- Barahona & Eguiarte, 2013). Some populations in the Atlantic coast close to the Abrolhos Bank (CON, SOO, and PRD) were highly admixed and seem to be a relic of an expansion onto the continental shelf during the LGM. Indeed, our predictions for the LGM show less fragmentation relative to the present, indicating a higher connectivity and gene flow, which has also been shown for marsupials (Leite et al., 2016). A low spatial genetic structure due to past connectivity was further reported in populations of the shrub Myrtus nivellei (Myrtaceae) in mountains in the Sahara Desert (Migliore et al., 2013) and in the montane salamander Pseudoeurycea leprosa in the Trans-Mexican Volcanic Belt (Velo-Antón et al., 2013). In the BAF, the weak genetic structure is not exclusive to C. fissilis. A similarly weak genetic structure was reported in birds and explained by the occurrence of bottleneck events (Batalha-Filho et al., 2012;Cabanne et al., 2013). However, for C. fissilis, bottlenecks are unlikely to have occurred given the homozygote excess observed here and in Mangaravite et al. (2016). Moreover, C. fissilis exhibits high variation within populations, even higher than the variation among the ranges of two distinct genealogical lineages (AMOVA analysis, Mangaravite et al., 2016). The diffuse genetic structure of C. fissilis populations is reflected in low pairwise F ST values (with many values <0.05). Even though STRUCTURE correctly infers the genetic assignment when genetic differentiation among groups is low, only levels of genetic differentiation higher than 0.05 attain an accuracy rate of over 97% (Latch et al., 2006). This may explain the fact that populations VNI and SER (both with higher F ST values) were the two populations with the least admixture in our STRUCTURE analysis. The BAF biome harbors such a complex topography (Ab'Saber, 2005) that some studies reported, in contrast to our findings, the presence of populations with deep genetic structure (e.g., in toads; Thomé et al., 2010). In our study, some populations of Cedrela fissilis exhibited only little admixture (SER, JAN, UBE, PEU, CAP, and PSB), thus being genetically more structured than those mentioned above. Except for three populations (SER, JAN, and UBE), this strong genetic structure was found in highland populations, suggesting isolation by elevation, as found in a dominant tree, Castanopsis eyrei (Fagaceae), in the Nanling Mountains, China (Shi et al., 2011). However, in contrast to the populations of the C. eyrei, where gene flow among populations of similar elevations exceeded gene flow along an elevational gradient on individual mountains (Shi et al., 2011), highland populations of C. fissilis belonged to different genetic groups even in geographic proximity (e.g., PSB and A B Fig. 5. Median-joining network and the geographic distribution of the 16 haplotypes of Cedrela fissilis. A, The geographic occurrences of the populations and their lineage assignment. Each pie chart represents one population, the colors depict the haplogroup which the population belongs to (see Table S1 for population codes). The size of the pie charts indicates population sizes. B, Median-joining network using trnT-trnL haplotypes. The network shows the relationships among different genealogical lineages of C. fissilis. Each circle represents one haplogroup (coded from H1 to H16), sizes are proportional to their relative frequencies. The number of mutational steps is indicated by small bars in case more than one step occurred.
CAP). This distinct genetic structure observed in C. fissilis could be the result of three processes: (i) differentiation by genetic drift associated with upslope migration (with little to no evidence of present-day connection), (ii) long-distance dispersal onto the highlands from distinct source areas, or (iii) recolonization from nearby refugia. While some of these processes may have acted together, our data suggest that dispersal from different sources is most likely. Dispersal ability is a key factor that influences gene flow (Freeland, 2005), and the level of admixture among populations of C. fissilis reflects gene flow likely by dispersal across the entire Atlantic range. Indeed, long-distance dispersal has been an important process that shaped not only the current distribution of C. fissilis (Garcia et al., 2011;Mangaravite et al., 2016), but the entire genus from deep time to the present Koecke et al., 2013). The low habitat suitability in the north-east of the Brazilian Atlantic coast during the LIG together with the strong genetic structure (genetic group 5 and haplotypes H5, H7, and H14) and high genetic divergence (see F ST values of SER population) of the northeastern populations suggest that long-distance dispersal to this region might have occurred following the late-Quaternary climatic fluctuations. Similar patterns of low admixture in the north-eastern area are known in frog species (Carnaval et al., 2009), birds (D'Horta et al., 2011Maldonado-Coelho, 2012), spiders (Peres et al., 2017), and in wasp populations (Menezes et al., 2017). D'Horta et al. (2011) and Menezes et al. (2017) explained these patterns by a recent bottleneck effect, but our data suggest that the north-eastern populations harbor a combination of lineages that probably arrived during or after the LGM through different dispersal routes: one from inland (Amazonia), and another one from the coastal south-east. These two geographical routes were referred to as the "Peri-Amazonian" and the "Parana Atlantic" vegetation gradient, respectively, based on the analysis of the spatial distribution of 32 common tree species in South America (Spichiger et al., 2004). The phylogeographic connection between the FR2 population and the northeastern populations could constitute a relic of such a "Peri-Amazonian" dispersal. Indeed, FR2 and populations in the center of Brazil exhibit a high admixture pattern from either the Atlantic or Chiquitano range (Mangaravite et al., 2016;Diaz-Soto et al., 2018;Huamán-Mera, unpublished data). The "Parana Atlantic" route may have a high impact on the genetic diversity found in the northern C. fissilis populations, because the network analysis indicated two independent connections between the north-eastern and the Atlantic coast, suggesting independent dispersals. Moreover, the SER population (Fig. 4) and TOB and POV populations (Fig.  5) belong to the Atlantic range (rather than the Chiquitano range; Mangaravite et al., 2016). Finally, SER is the only population with no private alleles, which suggests that this population might not have been isolated long enough for its divergence. Others have argued that northern populations persisted in a refugium that allowed for their diversification, such as the "Pernambuco refugium" in frogs (Carnaval et al., 2009) or the "Caatinga" ancient refugium observed in palms (Souza et al., 2018). In contrast, we argue that in C. fissilis longdistance dispersal is more important for the genetic structure of the north-east region than a local refugium and its distinctiveness is likely due to a unique combination of alleles that exist at the intersection of two different dispersal routes.

Dry refugia hypothesis and the refugia concept
The ecotonal region of the BAF and the Cerrado in the Paraná basin is another highly diverse area for Cedrela fissilis (Figs. 4,5) that might be the result of long-distance dispersal. This appears to be another intersection between the Central-West and South of Brazil due to the likely expansion during the LGM, which resulted in the advance of dry vegetation restricted to ecotonal conditions (Arruda et al., 2017). At that time, the Paraná basin was a humid region (Spichiger et al., 2004) and was suggested as having acted as a refugium due to climate stability (Carnaval & Moritz, 2008;Carnaval et al., 2009), leading to higher genetic diversity within and among populations compared to less stable areas (Carnaval et al., 2009). It should be noted that the term "refugium" usually refers to contractions of the geographic range in conjunction with reductions in abundance (Bennett & Provan, 2008;Stewart et al., 2010;Keppel et al., 2012). This reduction results in a loss of genetic diversity and a marked genetic structure as suggested by the DRH (Ramírez- Barahona & Eguiarte, 2013). In this sense, we would not apply the term refugium to the Paraná basin area, while not excluding the stability attribute. Gallery forests in the Paraná basin might have acted as a buffer and sustained forest-adapted lineages in dry regions (Fine & Lohmann, 2018), promoting the survival of drought-tolerant and generalist species, such as C. fissilis (Spichiger et al., 2004). However, instead of contraction, we argue that these populations experienced increased connectivity allowing an admixture genetic structure during the LGM. This does not reject the drier and cooler climate reported for the Southeast based on palynological data (Behling, 2002), as C. fissilis shifted its distribution toward central Brazil, and returning southward after the LGM. This is further corroborated by three additional findings: (i) high levels of admixture between the two genealogical lineages (Mangaravite et al., 2016), (ii) the presence of a new lineage in another ecotone, also between the Cerrado and the BAF (Diaz-Soto et al., 2018), and (iii) the discovery of a new species of Cedrela from the center of Brazil (Huamán-Mera et al., unpublished data). Additional sampling in the center of Brazil is required to increase the genetic structure resolution and to obtain a thorough history of the Paraná basin and its importance for the diversity in the BAF.

Conservation and future perspectives
Global warming impacts animal and plant populations, and will continue to do so in the future (Root et al., 2003). One potential future impact on Cedrela fissilis is increased fragmentation of suitable habitats. This fragmentation may lead to a substantial loss of genetic diversity, and put populations at the risk of extinction (Bálint et al., 2011), especially those occurring in highlands (La Sorte & Jetz, 2010;McCain & Colwell, 2011). Cedrela fissilis is classified as "vulnerable" in the Red List of the International Union for Conservation of Nature (IUCN), and its populations are declining (Barstow, 2018). Both these facts make C. fissilis an important focal species for studying the impact of climate change on tropical trees. Modeling the future distribution for C. fissilis will help to better forecast the climatic impact on this species, and hopefully help guide conservation measures.

Supplementary Material
The following supplementary material is available online for this article at http://onlinelibrary.wiley.com/doi/10.1111/jse. 12545/suppinfo: Table S1. Populations of Cedrela fissilis sampled for phylogeographic analyses (n = number of individuals). Table S2. Pairwise Pearson correlation coefficients among all 19 bioclimatic variables. Labels and correlation coefficients of the seven selected predictor variables are in bold (see Table  S3 for the full name of each variable). Table S3. Percentage contribution and permutation importance in the Maxent models for all 19 variables (the subset of seven predictors are highlighted in bold). Table S4. Percentage contribution and permutation importance for the retained subset of seven predictors in the final Maxent model. Fig. S1. Digital elevation model of eastern South America with the locations of sampled populations of Cedrela fissilis (see Supporting Table S1 for population codes). Fig. S2. trnT-trnL haplotypes. Out of 1370 sites, 20 were polymorphic in the two cpDNA fragments (fragment AB and fragment CD) of the 16 haplotypes (H1 to H16) from 148 specimens of Cedrela fissilis. Dots indicate similarity to haplotype 1, hyphens indicate gaps. Numbers on top indicate the nucleotide position with haplotype 1, acting as the reference sequence during alignment. The number (#) indicates the frequency of the haplotypes. Fig. S3. Demographic history of C. fissilis estimated using Bayesian skyline plots from cpDNA data. The black line represents the mean estimate of the effective population size (Ne, log-scale) through time; the blue area indicates the upper and lower 95% highest posterior density (HPD) limits of Ne.