Tropical forest loss and geographic location drive the functional genomic diversity of an endangered palm tree

Abstract Human activity has diminished forests in different terrestrial ecosystems. This is well illustrated in the Brazilian Atlantic Forest, which still hosts high levels of species richness and endemism, even with only 28% of its original extent remaining. The consequences of such forest loss in remaining populations can be investigated with several approaches, including the genomic perspective, which allows a broader understanding of how human disturbance influences the genetic variability in natural populations. In this context, our study investigated the genomic responses of Euterpe edulis Martius, an endangered palm tree, in forest remnants located in landscapes presenting different forest cover amount and composed by distinct bird assemblage that disperse its seeds. We sampled 22 areas of the Brazilian Atlantic Forest in four regions using SNP markers inserted into transcribed regions of the genome of E. edulis, distinguishing neutral loci from those putatively under natural selection (outlier). We demonstrate that populations show patterns of structure and genetic variability that differ between regions, as a possible reflection of deforestation and biogeographic histories. Deforested landscapes still maintain high neutral genetic diversity due to gene flow over short distances. Overall, we not only support previous evidence with microsatellite markers, but also show that deforestation can influence the genetic variability outlier, in the scenario of selective pressures imposed by these stressful environments. Based on our findings, we suggest that, to protect genetic diversity in the long term, it is necessary to reforest and enrich deforested areas, using seeds from populations in the same management target region.


| INTRODUC TI ON
To meet the demands of an ever-increasing population, humans have drastically altered the environment, converting natural landscapes into human-modified landscapes (Curtis et al., 2018).
These anthropogenic actions cause a rapid decrease and fragmentation of native forests, leading to changes in the natural dynamics of these ecosystems and posing a major threat to biodiversity (Arroyo-Rodríguez et al., 2020;Curtis et al., 2018). In particular, habitat loss has affected tropical forests, threatening their rich associated biota (Arroyo-Rodríguez et al., 2020;Morante-Filho et al., 2015;Rocha-Santos et al., 2016). Several studies have reported that this disturbance in tropical forests cause the loss of species (Lima & Mariano-Neto, 2014), ecological processes (Moran & Catterall, 2014) and genetic variability (González et al., 2020).
The assessment of the genetic variability in natural populations is extremely important from ecological, evolutionary and conservation perspectives in the face of constant anthropogenic disturbances (Brancalion et al., 2018;Carvalho et al., 2017;Lanes et al., 2018;Santos et al., 2015), since this variability is the raw material on which natural selection can act, allowing species to adapt to these disturbances over time (Orr, 2005;Wright, 1931). Genetic investigations using genomic approaches allow a broader understanding of how such disturbances influence the genetic variability in populations in different landscapes. Besides, they are essential for proposing conservation strategies (Ćalić et al., 2016;Fischer et al., 2013;Rellstab et al., 2015;Schoville et al., 2012). These approaches have great inferential power, because they are based on the evaluation of thousands of sites in the genome at once in a target species, while other techniques use only a limited number of markers (Bragg et al., 2015;Ćalić et al., 2016;Manel et al., 2010). Genomics also can focus on genetic variation that is either neutral (i.e., it does not influence the individual fitness) or adaptive (i.e., it does influence the individual fitness) Manel et al., 2010). In the first case, studies are mainly interested in gene flow and genetic drift in landscapes Storfer et al., 2010). On the other hand, the studies of adaptive genetic variation associate loci under selection (i.e., outlier loci) with environmental variables (Ahrens et al., 2018;Parisod & Holderegger, 2012;Steane et al., 2014), for example, describing genetic variation as a response to specific environmental features. Therefore, population genomics may contribute to the understanding of the genetic adaptation and species conservation in a rapid changing scenario induced by human activities (Benestan et al., 2016;Fitzpatrick & Keller, 2015;Forester et al., 2016;Sork, 2018).
In the context of tropical forests, population genomics approaches are especially relevant for the Brazilian Atlantic Forest.
This forest domain has high species richness and endemism (Martini et al., 2007;Myers et al., 2000;Thomas et al., 1998) but is currently reduced to less than 28% of its original extent, with highly fragmented and disturbed remaining forests (Rezende et al., 2018). Therefore, understanding the consequences of anthropic disturbances on the genomic variability in species found in this biome is important to design and implement efficient forest conservation strategies (Santos & Gaiotto, 2020).
We selected the native palm Euterpe edulis Martius (Arecaceae) as a biological model to understand the effects of landscape-scale forest loss and fragmentation on the genomic variability. This species occurs throughout the Brazilian Atlantic Forest and has great ecological importance since its fruits are consumed by approximately 58 species of birds and 20 species of mammals. It has being considered a key species for maintaining fauna in periods of resource scarcity (Galetti et al., 2013). In economic terms, E. edulis has one of the most widely used (i.e., overexploited) nontimber products in the Brazilian Atlantic Forest. Its apical meristem (heart of palm), usually obtained by illegal extraction, is very desired for human consumption (Brancalion et al., 2012;Muler et al., 2014).
Heart of palm extraction causes death to adult individuals and, consequently, a reduction in their populations, which together with the loss and fragmentation of their habitats, has included E. edulis on the endangered Brazilian flora list (Martinelli & Moraes, 2013).
In fact, demographic studies demonstrate local extinction of this species in severely deforested landscapes (Leal et al., 2021) and low growth and survival potential of young individuals in deforested landscapes under high light penetration (Cerqueira et al., 2021).
From a molecular point of view, studies using neutral microsatellite markers (i.e., with no influence on individual fitness) have shown that the loss and fragmentation of forests can lead to the collapse of gene flow between populations and increase the inbreeding coefficient (Carvalho et al., 2016(Carvalho et al., , 2017Pereira et al., 2022;Santos et al., 2016). Even located in selectively neutral locations, these markers are considered representative of the whole genome. Therefore, they indicate what is happening with the population in terms of genetics. In addition, forest loss can have indirect effects on this species, reducing the richness and/or abundance of birds that disperse its seeds leading to microevolutionary and phenotypic changes (Carvalho et al., 2016;Galetti et al., 2013). Synergistically, this dispersers loss can reduce the genetic variability in remaining populations limiting gene flow by seeds . In addition, populations inserted in different biogeographic regions, such as semideciduous forest vegetation and rainforest, have different gene pools, due to the distinct selective regimes imposed by regions (Carvalho et al., 2016).
In the present study, we aimed to investigate the patterns of variability and genetic structure in E. edulis populations, using SNP markers inserted in transcribed regions of genome, which may help to understand the local adaptation of this important species. Using this approach, we predicted that (i) genetic structure and variability of populations reflect regional characteristics, due to the different selection regimes imposed by the biogeographic regions and/ or different deforestation histories; (ii) landscape forest loss as well as reduction in the richness and abundance of seed-dispersing birds reduce the neutral genetic variability in populations due to genetic drift and a decrease in gene flow; and (iii) changes in environmental or ecological conditions (e.g., a reduction in the richness and abundance of seed dispersers or loss of forest) may impose selective pressures, favoring the detection of genes under selection (i.e., outliers) in E. edulis populations.

| Study area
The study was performed in remnants at the Atlantic Forest located in the states of Bahia (BA) and São Paulo (SP), Brazil (Figure 1).
In Bahia, we sampled in 16 forest fragments located in the southern of this state, comprising one of the largest remnants of the Atlantic Forest in northeastern Brazil (Rezende et al., 2018;Thomas et al., 1998). This area is considered a high priority for conservation due to the high richness and endemism, mainly of tree species (Martini et al., 2007). These 16 fragments are distributed in two regions with similarities in flora, topography, and soil type (Thomas et al., 1998), but differing in the amount of remaining forest cover ( Figure 1; for more details, see Morante-Filho et al., 2016). In the first region (BA_NR), 50% of the native forest remains, comprising large and continuous forest fragments, such as the Biological Reserve and the Una Wildlife Refuge (together representing 34.804 ha of protected area), and a matrix dominated by cocoa agroforestry. The second region (BA_SR), contrasted drastically with BA_NR, with only 30% of the native forest remaining and consisting of a matrix dominated by open areas . So, although the deforestation process in Bahia started in the mid-1980s but was intensified in the 1990s (Rocha, 2006), deforestation started in BA_SR and then moved toward BA_NR, that is, in BA_SR deforestation is older.
The Atlantic Forest in SP has a history of deforestation for planting monocultures, such as coffee, that dates back to the 19th century F I G U R E 1 Geographic distribution of the 22 populations of Euterpe edulis sampled in fragments of the Atlantic Forest located in the states of Bahia and São Paulo. (a) Southern Bahia, northern region (BA_NR region; red sites); (b) southern Bahia, southern region (BA_SR region; blue sites; (c) São Paulo, rainforest (SP_RA region; yellow sites) and semideciduous forest (SP_SE region; purple sites). The numbers correspond to the identification of each population within their respective region (see Table S1 for more details). (Dean, 1996). In this state, we sampled six fragments of Atlantic Forest characterized by semideciduous forest vegetation (SP_SE region) and rainforest (SP_RA region). These regions differ in the plant species composition as well as in the prevailing environmental conditions (Eisenlohr & de Oliveira-Filho, 2015). For example, while the SP_SE region is in the interior of the state and has a sharp dry season and nutrient-rich soils, the SP_RA region is at the coast, does not experience rainfall limitations throughout the year, and has poor nutrient soils (Eisenlohr & de Oliveira-Filho, 2015).

| Sampling design
For the southern region of BA, we analyzed a land use map based on high-resolution satellite images (Quick Bird and World View, both from 2011, and Rapid Eye from 2009 to 2010) that covered 3470 km 2 at a scale of 1: 10,000. We identified 58 potential sampling sites located in landscapes that showed great variation in the amount of remaining forest. Then, we selected 16 forest sites with a minimum distance of 1 km from each other ( Table S1); nine of these sites are located in BA_NR region and other seven sites are in BA_SR region.
Based on previous studies Soares et al., 2019), all site present occurrence data of E. edulis. Based on this mapping approach, native forest cover (%) within a landscape radius of 2 km from the center of each selected forest location was calculated using the Quantum GIS 3.20.3 program (QGIS Development Team, 2021).
Each landscape was defined by a radius of 2 km, considering the foraging distance of the large dispersers of E. edulis seeds less than 600 m and that the foraging distance of important pollinators of this palm tree presents a distance of less than 1 km (Holbrook, 2011;Zurbuchen et al., 2010). Furthermore, several genetic studies have successfully used the same landscape sizes to assess the effects of forest loss in this palm species (Carvalho et al., 2015(Carvalho et al., , 2017Santos et al., 2015Santos et al., , 2016. For the areas sampled in SP we used information from previous studies to identify areas with E. edulis (Carvalho et al., 2015), and six sampling points were selected. There was great variation in the amount of forest at these points and a minimum distance of 1 km between the points. Subsequently, we used high-resolution images from Google Earth version 7.3.2 through the Open Layers plugin in Quantum GIS 3.20.3 program (QGIS Development Team, 2021) to visually classify the forested and nonforested parts of the images.
Thus, the percentage of forest in each area was quantified, and the same procedures were followed in the sampled locations in BA.

| Focal species and sampling
Euterpe edulis is a monoecious species, with a preferentially allogamous reproductive system (i.e., fertilization occurs when pollen from one plant fertilizes the flower stigma of another plant), although it has self-compatibility (Gaiotto et al., 2003). Its flowers are visited by several small insects, with Meliponini and Euglossini bees being the main pollinators (Reis et al., 2000;Santos et al., 2018). This palm bears annual fruiting, with seeds dispersed mainly by large frugivorous birds such as toucans and cotingas and by small birds such as thrushes (Carvalho, Garcia, et al., 2021;Galetti et al., 2013). Euterpe edulis seeds are recalcitrant and moisture dependent for germination, with evidence suggesting that deforested landscapes under high light penetration have low potential for growth and survival of young individuals (Cerqueira et al., 2021;De Andrade, 2001). The population density of E. edulis varies significantly along its distribution area, with the state of BA presenting approximately 1305 ind./ ha, while in SP it can register populations with about 17,000 ind./ha (Leal et al., 2021;Reis et al., 2000), being the generation time of the species approximately 18 years (Carvalho, Lucas, & Côrtes, 2021;Franco & Silvertown, 2004;Galetti et al., 2013).
In the state of São Paulo, individuals were randomly collected within each fragment. In Bahia state, we allocated three 50 × 10 m plots, randomly located within each forest site, but maintaining a minimum interplot space distance of 50 m and of the nearest edge, for sampling E. edulis. In each forest site, we collected leaf material and georeferenced all the juvenile individuals of E. edulis with leaf insertion heights between 0.15 and 1.00 m and belonging to young class II, according to the classification of Silva et al. (2009). This ontogenetic stage was chosen because the genetic parameters of the early developmental stages better represent the effects of recent environmental disturbances (Vranckx et al., 2012). Also, in forest sites in which more than 50 individuals were collected, we randomly selected 50 individuals for genomic analysis. The number of individuals sampled per forest site was established to maximize the accuracy power of the detection of SNPs using sequencing of pooled DNA samples (Schlötterer et al., 2014). However, in eight sites, fewer than 50 individuals were found (Table S1).

| Laboratory and bioinformatics procedures
Genomic DNA was extracted from the leaf tissues of the 1063 individuals sampled in the 22 study areas (Table S1) using the protocol described by Doyle and Doyle (1990). Equimolar amounts of DNA from individuals from the same collection area were used to form 22 pools, one for each study area. We opted for the strategy of sequencing pooled DNA samples (pool-seq) due to its efficiency and reduced costs for obtaining SNP data in landscape genomics studies, which generally require a large number of samples (Santos & Gaiotto, 2020). This methodology has been reported as the best option for reducing costs related to next-generation sequencing in the analysis of no-model species (Schlötterer et al., 2014), as is the case for E. edulis.
To perform sequencing, libraries with barcodes were prepared for each of the 22 pooled samples, and 20,000 probes were used.
These probes were previously developed to capture specific transcribed regions of the Euterpe genome (Sales, 2022). The DNA captured by these probes was sequenced using an Illumina HiSeq 2000 platform (Illumina). After sequencing, the 3′ end was cut to remove low-quality bases (reading quality index < 20), and the reads were filtered to remove those with quality index values smaller than 10%.
Then, the reads selected after the trimming and filtering steps were aligned using MOSAIK 2.2 (Lee et al., 2014).
After these two steps, 323,839 SNPs were retained. However, as there was still a large variation in the quality of mapping (QUAL) and sequencing depth (DP), we performed an additional filtering step, retaining only the markers with QUAL ≥ 100 and DP ≥ 100 (Schlötterer et al., 2014). Subsequently, only the markers with QUAL and DP between the 5th and 95th quantiles were retained (i.e., 90% retention of the markers based on the QUAL and DP distributions). Of these, we retained only the markers with the highest DPs for each of the contigs, seeking to minimize the explicit linkage disequilibrium, resulting in 8296 SNPs. However, as the computational packages consider only biallelic SNPs, a total of 7632 markers were used for genetic analysis.

| Characterization of the global genetic structure
To assess the organization of genetic diversity and to define groups that reflect biologically plausible scenarios for conducting the tests to detect outlier markers, we performed a principal component analysis (PCA) and estimated the genetic divergence among populations based on all SNP loci. The PCA was carried out with the LEA package (Frichot & François, 2015) on the R platform (R Core Team, 2021). Genetic divergence between population pairs was estimated using the F ST described for pools of individuals by Hivert et al. (2018), which is derived from Cockerham and Weir's (1987) F ST , using the poolfstat package (Hivert et al., 2018) on the R platform (R Core Team, 2021). The F ST values obtained were represented on a heatmap produced with the heatmap3 package (Zhao et al., 2014) and associated with the number of groups estimated through a hierarchical clustering algorithm with the Euclidean distance and the average method. Because these analyses revealed a large genetic divergence between populations from the states of BA and SP ( Figure S1A,B) and it is known that there are climatic and biological differences between them, and that the history of anthropogenic disturbances differs between states, the tests to detect outlier markers, and the other genetic analyses were applied to the populations of BA separately from the populations of SP.

| Detection of markers potentially under selection
Among the most common tests to identify hypothetical selection signals in SNP markers (outliers) are those based on significant deviations from F ST (or related) estimates between populations (Fariello et al., 2013). Furthermore, environmental association analyses (EAAs) can be used to identify associations between genotypes or allelic frequencies with environmental variables (Ahrens et al., 2018). In this case, the SNPs significantly associated with environmental variables can be interpreted as a subset of markers related to selective pressures (Manel et al., 2018).
Although there are several methods for identifying selection signals with SNP markers, these tests generally reveal false positives, given their different assumptions and genetic models, which do not necessarily reflect the sampling design, the genetic structure of the population in question, and other covariates (Ahrens et al., 2018;Lotterhos & Whitlock, 2015). In this context, in an attempt to minimize the impact of false positives, and taking advantage of different assumptions, two tests were used to identify outlier SNP markers (pcadapt and X T X statistic) and two tests of environmental association analyses (BayPass -Bayesian population association analysis and LFMM -latent factor mixed models). Based on the results of PCA and F ST between E. edulis populations ( Figure S1A,B), the tests were applied to the 16 populations from Bahia separately from the six populations of São Paulo.
Unlike most methods, pcadapt (Luu et al., 2017) does not assume an explicit genetic model and does not require predefined groups of individuals. It performs genomic scans through principal component analysis (PCA) and assumes that outlier markers will be related to the population structure suggested by PCA. Each marker is assigned to a z-score value that quantifies it the relationship with the first K major components that best explain the observed genetic structure.
Based on this z-score, a Mahalanobis distance is calculated for each marker in relation to the other markers. Thus, outlier SNPs are those that present significant deviations from these distances in relation to the mean of the values of the other markers. This analysis was performed with the pcadapt package (Luu et al., 2017) on the R environment (R Core Team, 2021). For the analysis, K = 3 (Bahia) and K = 2 (São Paulo) were considered as principal components (the point at which the inclusion of more components does not cause a large increase in the cumulative proportion of variation explained by the PCA). We considered SNPs with q-values ≤0.1 as outliers, assuming that 10% of outlier SNPs can be false positives.
For ByPass and LFMM analysis, we used as covariates the percentage of forest calculated in each landscape, the richness and abundance of seed dispersers (birds that defecate or regurgitate palm seeds). For the two regions in BA, we used richness and abundance data for these birds sampled in each study area (for more information, see Morante-Filho et al., 2015), and richness data for the two regions in SP (Carvalho, Garcia, et al., 2021).
The BayPass (Gautier, 2015) presents methods both for identifying outlier markers and for identifying markers hypothetically associated with specific covariates. It explicitly considers the covariance structure between allelic frequencies of populations (Ω), resulting from their shared evolutionary history. To identify outlier SNPs, the X T X statistic (Günther & Coop, 2013) for each locus is used. Based on simulated data using initial estimates of Ω, new estimates of X T X are calculated to define threshold values for identifying outlier loci. For analysis of the association between genetic and environmental variation, BayPass implements a covariation model that assesses how population covariates are associated with each marker, considering linear regression coefficients (βik), using Bayesian statistics. For the analyses using the X T X statistic, allelic frequency covariance estimates between populations for each locus, and regression coefficients with environmental variables were estimated from 20 pilot analyzes with 2000 iterations of MCMC, followed by a burn-in period of 50,000 and 500,000 iterations of MCMC (Günther & Coop, 2013). After obtaining the allelic frequency covariance matrix between populations, the calibration of X T X values was performed with the same settings considering 5000 markers simulated with the simulate. baypass function. From this analysis, markers with X T X estimates above the 95% percentile of calibration estimates were considered as outlier loci. For the analysis of association between SNPs markers and environmental variables, Bayes Factor values (BF.dB) were used to compare two alternative models for each SNP (and each environmental covariate): model with association (βik ≠ 0) and model without association (βik = 0). We considered SNPs significantly associated with environmental variables those with BF.dB above the 95% percentile of the calibration estimates.
In the analysis with latent factor mixed models (LFMM), allelic frequencies are considered as response variables to environmental predictors. The LFMM considers that unobserved latent factors, such as demographic factors, are modeled as confounding effects.
After that, associations between environmental variables and markers can be interpreted as hypothetical signatures of natural selection (Frichot et al., 2013). The LFMM analyses were performed in the LEA package (Frichot & François, 2015) on the R environment (R Core Team, 2021). Based on PCA and F ST between pairs, we considered two population groups (K = 2) for each state ( Figure S1A,B), and 10 repetitions of the algorithm were performed with burnin of 50,000 and 100,000 iterations. We considered SNPs significantly associated with environmental variables those with a false discovery rate (FDR) ≤ 0.05.
In this study, we considered markers putatively under selection (outlier SNPs) if they were identified as outliers by at least two of these four tests simultaneously ( Figure S2A,B). In total, 1484 outlier SNPs in BA and 1053 outlier SNPs in SP were identified (Table S1), with an overlap of only 187 SNPs between BA and SP ( Figure S2C). The number and overlap of the outlier SNPs identified by these methods were represented using a Venn diagram using the VennDiagram in R platform (Chen & Boutros, 2011).
There was a very large variation in the number of SNPs identified by these analyses, demonstrating the importance of simultaneously using more than one method to minimize possible false positives (Ahrens et al., 2018;Lotterhos & Whitlock, 2015). Thus, ensuring greater reliability in distinguishing neutral SNPs from those putatively under natural selection allowed us to proceed with the analysis of genetic variability and statistical analysis with the two data sets separately for each state (BA and SP).

| Genetic variability analysis
We considered as neutral SNPs those makers that were not identified as outliers by at least two of the methods described above. The  (Table S1). Based on the allele frequencies of the neutral and outlier loci separately, Nei's (1978) gene diversity (H E ) was estimated and was corrected for finite population sizes (each locus and each population) according to the formula: where H E is gene diversity, n is the number of individuals per population, and p i is the frequency of the I the allele.
We also used principal component analysis (PCA) of the LEA package (Frichot & François, 2015) and F ST of the poolfstat package (Hivert et al., 2018) to assess the structure between pairs of populations. The test (Mantel, 1967) using the ecodist package (Goslee & Urban, 2007).
Next, to test whether the observed pattern of genetic differentiation (F ST ) was due to the isolation by distance (IBD) process, we performed F ST /(1 − F ST ) regression against the logarithm of geographic distances in E. edulis populations within each state (Rousset, 1997). All the analyses were performed in the R platform (R Core Team, 2021).

| Statistical analysis
We performed an unbalanced ANOVA followed by a Scott-Knott means test to assess whether the number of alleles or gene diversity of the neutral or outlier loci differed by sampled region. which could bias the models. To carry out this investigation, we used data from birds that regurgitate or defecate Euterpe seeds (Galetti et al., 2013) in the BA_SR and BA_NR regions (For sampling details, see Morante-Filho et al., 2015). In particular, we used three models: (1) a null model, (2) a univariate linear model, and (3) an ANCOVA model. In the linear model, we create different simple linear regression between the response variable (gene diversity or number of alleles) and the predictor variable (bird richness, abundance or forest cover). In addition, as the response variables could be influenced simultaneously by more than one predictor variable, we create models with interactions between different predictor variables. We also create several ANCOVA models, always using the study region as a categorical factor interacting with the different variables (bird richness, abundance or forest cover) and genetic parameters as the response variables. Then, we used the corrected Akaike information criterion for small samples (AIC C ) and Akaike weights to select the most plausible model (Anderson, 2008). The model presenting the smallest AIC C , with at least two units of difference from those of the other models, and with the largest Akaike weights was considered the most plausible. The models were create using the bbmle package (Bolker, 2012) and plotted with ggplot2 (Wickham, 2016). All the analyses were performed in R platform (R Core Team, 2021).
Our results also highlight that the average number of alleles per locus and neutral gene diversity differ only between the regions of the state of Bahia (BA_NR, BA_SR) and São Paulo (SP_RA, SP_SE), with the highest averages observed in SP (Figure 2a,b, respectively).
For the genetic variability estimated with the outlier SNPs, all regions differed significantly from each other, with the SP_SE region showing the highest average number of alleles per locus and gene diversity, followed by the BA_SR, SP_RA and BA_NR regions, respectively (Figure 2c,d).
In the analysis of the genetic structure based on the neutral loci in BA, the first two PCA axes explained 30.16% of the variation, clearly separating the populations of BA_NR from BA_SR ( Figure 3a). For the analysis of F ST with neutral loci, we recorded the formation of three groups, separating BA_NR from the other two groups formed by BA_SR populations, just as observed in PCA (Figure 3a,b). This same pattern of regional genetic structure was also registered in the analyses with the outlier loci in BA, in which the first two PCA axes explained 54.28% of the variation SNPs or outlier, as would be expected by the IBD process ( Figure S3).

| Influence of forest cover and avian seed dispersers on genetic variability
In the model selection process, the ANCOVA composed of the percentage of forest cover interacting with the region was the most plausible model to explain the genetic estimators assessed with the outlier SNPs. On the other hand, the null model was the most plausible to explain the genetic estimators accessed with the neutral SNPs (Table 1) the BA_SR region, while no pattern was evidenced in the BA_NR region (Figure 5a,b). It is also important to note that although the percentage of forest cover interacting with the region is the most plausible model, models with richness and abundance of birds interacting with the region factor are more plausible than the null model to explain the genetic estimators accessed with the set of outlier SNPs ( Table 1).

| DISCUSS ION
In the present study, we used for the first time, SNP markers in- in more deforested Atlantic Forest landscapes (Leal et al., 2021).
With this action, it would be possible to minimize the effects of deforestation on the species and increase the chances of small and isolated populations maintain genetic variability in a long term. As good sampling practices for conservation, we also suggest some reintroduction or enrichment in areas where the species has become extinct, using exclusively seeds from populations of the same target region of conservation management (Pereira et al., 2022).

| Genetic diversity and structure
In our study, we observed high neutral genetic variability among the 22 populations of E. edulis, as already reported for this species using SNP markers (Brancalion et al., 2018) and microsatellites (Carvalho et al., 2017;Pereira et al., 2022;Santos et al., 2015;Soares et al., 2019). This pattern may be related to the biology and range of occurrence of E. edulis given that allogeneic plants pollinated and dispersed by animals and with wide geographic distribution usually present high genetic variability (Lowe et al., 2018). Conversely, the high genetic variability observed with outlier SNPs may reflect the intensity of natural selection in these populations, indicating the distinct environmental conditions to which these populations are exposed (Bragg et al., 2015;Gao et al., 2021;Rellstab et al., 2015;Schoville et al., 2012) and, certainly, adapted.
When comparing the neutral genetic variability of E. edulis, we For the outlier loci, we showed that the mean number of alleles per locus and gene diversity differ significantly between the four studied regions, evidencing a pattern of natural selection possibly modulated by environmental characteristics (Bragg et al., 2015;Gao et al., 2021;Rellstab et al., 2015;Schoville et al., 2012). As this genetic variability is located in functional regions of the genome, which theoretically can influence the survival of individuals, it is assumed that the greater genetic diversity would be needed to adapt (i.e., heterozygote advantage) the several environmental adversities faced by individuals (Scott et al., 2020;Teixeira & Huber, 2021;Wernberg et al., 2018). Thus, we believe that there is greater genetic variability in E. edulis populations located in environments with more environmental adversities, whether natural or anthropogenic. In particular, there are many environmental or ecological factors that exert selective pressure, driving the adaptation of organisms, such as factors related to temperature, precipitation and soil properties (Bragg et al., 2015;Gao et al., 2021;Rellstab et al., 2017). In our study, regions more deforested, such SP_SE and BA_SR region (see  Table 1). In the ANCOVA model, we showed the regression lines by region, with the blue rectangles indicating the southern region (BA_SR) and the red dots indicating the northern region (BA_NR). The shaded areas correspond to the 95% confidence intervals of the models.
in addition to the existing natural variation, we believe that anthropogenic disturbances also exert important selection pressures on E. edulis populations (Carvalho et al., 2016;Cerqueira et al., 2021;Galetti et al., 2013;Leal et al., 2021).
When comparing geographic regions within their respective states, we found greater outlier genetic variability in BA_SR, suggesting that natural selection is more intense in this location highly deforested than in BA_NR. With the reduction of forest cover, a series of changes occur in the landscape, for example, an greater penetration of light, loss of seed-dispersing fauna and increase in seed predation by invertebrates, which negatively impact E. edulis populations (Carvalho et al., 2016;Cerqueira et al., 2021;Soares et al., 2015). In fact, studies carried out in the BA_SR and BA_NR region indicate that intense deforestation caused the local extinction of E. edulis, while microclimatic changes in deforested landscapes led to physiological changes, with a potential negative impact on the growth and survivability of young individuals (Cerqueira et al., 2021;Leal et al., 2021). For these reasons, we believe that deforestation can impose selective pressures, favoring the detection of genes under selection (i.e, outliers) and greater genetic diversity, modulating the local adaptation of E. edulis (Bragg et al., 2015;Feng et al., 2015). Similarly, when comparing the regions in SP, it is possible to suppose that in addition to the visible difference in the remaining forest quantity among SP_SE and SP_RA (Figure 1c), other environmental variables probably modulate the genetic variability (Brancalion et al., 2018;Carvalho et al., 2016). For example, while SP_SE is located in the interior of the state, with a marked dry season, SP_RA is located in the wet climate zone of the coast and does not have rainfall limitations throughout the year (Eisenlohr & de Oliveira-Filho, 2015). This difference climate can directly influence the germination of E. edulis seeds, which are recalcitrant and highly sensitive to desiccation (Brancalion et al., 2011). Thus, populations of E. edulis inserted in SP_SE could suffer greater stress compared with populations in SP_RA, which could explain our findings (Ahrens et al., 2019;Bragg et al., 2015;Feng et al., 2015). However, although our results show a very evident pattern, but due to the relatively small number of populations per region, it is important to highlight that our inferences can be influenced by noise from the analyses. Furthermore, considering the complexity of the mechanisms involved in adaptive responses in natural populations, it would be expected that multiple factors would simultaneously influence our results.
The characterization of the global genetic structure for the 22 populations revealed a great genetic divergence between the states of the BA and SP, with a subsequent subdivision within of each state ( Figure S1A,B). This clustering pattern may reflect several factors (e.g., biogeographic, climatic, ecological or environmental) that characterize and differentiate the gene pool in each state (Dan et al., 2020;von Takach et al., 2021). In particular, the differences among populations located in each state are more apparent from selection tests that identified 1484 outlier loci in BA and 1053 in SP, being only 187 SNPs shared between the states ( Figure S2C). These results further reinforce the importance of future study with the environmental genomic association approach, combined with the functional annotation of the genes where these SNPs are located, for genomic conservation of trees in the Atlantic Forest (Sales, 2022;Santos & Gaiotto, 2020).
When estimating the genetic structure of E. edulis populations in the Bahia state, a clear pattern of separation among the populations inserted in distinct regions (BA_NR and BA_SR) was evidenced, regardless of the sets of markers used in the analysis (Figure 3a-d).
This result can be partially driven by geographic distance, as demon-  . In contrast, the BA_NR region is more forested and composed by fragments more connected (See Figure 1a), which may facilitate gene flow between different populations of E. edulis, leading thus the genetic homogeneity .
In this context, the differences observed in the gene pool among regions can be a reflect of distinct evolutionary forces resulting from pressures imposed by divergent patterns of land-use change.
In the state of SP, we also observed a separation of regions into two groups, possibly due to the geographic distance between populations, as evidenced in our results, associated with biogeographic influences on composition of the gene pool of the species as reported in previous studies (Brancalion et al., 2018;Carvalho et al., 2016).
These results indicated that these biogeographically divergent ecosystems, with their own biotic and abiotic characteristics, present divergent adaptations, promoting differentiation in the E. edulis gene pool (Brancalion et al., 2018). Among such characteristics, we can highlight the differences in water availability and soil composition between SP regions, which are known to influence seed germination (Brancalion et al., 2018). Indeed, these abiotic factors can exert strong selection pressure, resulting in the local adaptation of plant species populations (Bragg et al., 2015;Gao et al., 2021;Rellstab et al., 2017;von Takach et al., 2021). In this context, future genomic studies associating environmental and climatic variables, as well as the biological functions related to these loci are needed to unravel the causes of the observed differences, which is of fundamental importance to predict the ability of E. edulis to respond to anthropogenic environmental changes that threaten this palm species (Santos & Gaiotto, 2020).  Galetti et al., 2013). Therefore, we believe that populations inserted in landscapes with a low percentage of forest may suffer a loss of genetic diversity in the short -medium term, if these populations remain small and isolated (Borges et al., 2020;Lowe et al., 2015;Santos et al., 2015;Soares et al., 2019). On the other hand, we did not find evidence that landscape-scale forest cover has influenced patterns of neutral or outlier genetic variability in the BA_NR region. It is important to highlight in this context that the BA_NR region has large fragments that are highly connected, indicating that there was no large targeted deforestation associated with the great similarity between populations, as evidenced by its genetic structure (Figure 3a-d). Therefore, we believe that this region behaves as a metapopulation, supporting the gene flow and generating genetic homogenization independent of forest cover in the landscape .  Markert et al., 2010;Reed & Frankham, 2003;Scott et al., 2020;Wernberg et al., 2018). Another aspect that deserves attention is the fact that forest loss influences only the SNPs identified as outliers, as demonstrated by our selection of models (Table 1). A plausible hypothesis for this pattern would be a rapid adaptive response of the population to the physiological stress that is known to occur in deforested landscapes in BA_SR (Cerqueira et al., 2021).

| Influence of forest cover and seed dispersers on genetic variability
Thus, considering that these SNPs are inserted in transcribed regions of the genome that theoretically can influence the survival of individuals, it would be expected that alleles of genes associated with responses to physiological stresses would increase in frequency in the population (i.e, outlier SNPs). On the other hand, if neutral SNPs are located in gene regions that are not associated with responses to the stresses that populations are currently exposed to, changes in genetic variability due to deforestation could be delayed by gene flow over short distances. This difference in the response of neutral and outlier SNPs to deforestation suggests that caution is needed in conservationist inferences when no influence of anthropic actions is detected under the neutral genetic variability of small and isolated populations (Teixeira & Huber, 2021). However, it is important to note that these results are based on 4252 neutral SNPs and 1484 outliers, evaluating only seven populations in BA_SR and nine in BA_NR, which could generate noise in our results, in addition to the need to experimentally validate this hypothesis.
In general, the landscape genomics approach used in the present study has advanced the understanding of genetic responses of perennial species dispersed and pollinated by animals in a scenario of rapid anthropogenic environmental changes. Thus, our results can assist in decision-making aimed at efficient strategies for the conservation of a target species as well as in predicting the responses of other perennial species with similar biology facing anthropogenic disturbances. In addition, our findings reveal the genomic effects caused by forest loss in tropical environments. They make it possible to predict the consequences of disturbances on genomic variability and, consequently, on the adaptive potential of species.

ACK N OWLED G M ENTS
This study was financed in part by the Coordenação de Lab at UESC for the scientific discussions of the manuscript.

CO N FLI C T O F I NTE R E S T
The authors have no conflicts of interest to declare.

DATA AVA I L A B I L I T Y S TAT E M E N T
The files containing the sampling locations, the 7632 SNP markers identified and used in the study, the subset containing the neutral SNP markers and the subset with the SNP markers identified as outliers will be deposited in the Dryad database.