Population expansion and genomic adaptation to agricultural environments of the soybean looper, Chrysodeixis includens

Abstract Evolutionary studies of insect pests improve our ability to anticipate problems in agricultural ecosystems, such as pest outbreaks, control failures, or expansions of the host range. Here, we investigated the mechanisms underlying the evolutionary processes behind the recent census size expansion and local adaptation of Chrysodeixis includens. First, we sequenced mitochondrial markers to conduct a phylogeographic investigation of C. includens historical processes. Then, we combined a de novo genotyping‐by‐sequencing approach with a study of agricultural landscapes to uncover recent processes of adaptation. Primarily, we found low genetic diversity across all markers and clear indications of a recent demographic expansion. We also found a lack of significant isolation by distance (IBD), and weak or absent genetic structure considering geographic locations. However, we did find initial signs of population differentiation that were associated with host plant types (i.e., soybean and cotton). Agricultural landscape attributes, including soybean crops, were significantly associated with putative markers under positive selection. Moreover, positive selection associated with host differentiation was putatively linked to digestive enzymes. This study showed how landscape composition and host plants can affect the evolutionary process of agricultural pest insects such as C. includens.

not always stem from landscape changes, and a population's history and dynamics (i.e., recent speciation events, gene flow, genetic drift, and population demography) can also make it difficult to detect early signs of local adaptation.
Agroecosystem configuration and composition may be responsible for sudden changes in ecological and evolutionary dynamics, imposing strong selective pressure and leading to new forms of pests (Altieri, 1999;Alvarado-Serrano, Van Etten, Chang, & Baucom, 2019). This complex scenario can be accentuated in countries such as Brazil, where agriculture has rapidly intensified and expanded geographically to the Cerrado region (Brazilian savannas) in the past 50 years, compared to places where agriculture is long established (Arvor, Meirelles, Dubreuil, Bégué, & Shimabukuro, 2012;DeFries, Herold, Verchot, Macedo, & Shimabukuro, 2013).
However, the cost of this profound economic transformation is the loss and fragmentation of large portions of the native vegetation (Landis, 2017;Ribeiro, Metzger, Martensen, Ponzoni, & Hirota, 2009). The result is extensive uniform cultivation areas where only one or a few crop species dominate the landscape. Beginning in the 1970s, the development of new farming technologies, including new crop varieties and better soil management, allowed farming in new areas (Arvor et al., 2012;Müller, Rufin, Griffiths, Barros Siqueira, & Hostert, 2015). In addition to leading to large areas with only one or a few crops, reducing landscape heterogeneity, the use of crop varieties with early and extra-early cycles made possible two or three crop seasons annually, increasing the challenges for insect management. The succession of suitable hosts cultivated on such large scales can lead to the rapid evolution of insect pest traits such as expansion of the number of host species, insecticide/Bt resistance, and specialization (Andrade et al.., 2016;Santos, Specht, Carneiro, Paula-Moraes, & Casagrande, 2017). Furthermore, a continuous agriculture landscape may allow increased gene flow among meta-populations.
In recent decades, the soybean looper, Chrysodeixis includens (Walker, [1858]) (Lepidoptera: Noctuidae: Plusiinae), a native to the Americas, has emerged as a serious pest of soybean and cotton (Panizzi, 2013;Santos et al., 2017). This species is polyphagous and shows high fecundity and rapid development (egg-adult period of 30 days in tropical regions). Little is known about its dispersal abilities. Until recently, C. includens was described as a minor pest that rarely reached levels requiring control (Kogan, Turnipseed, Shepard, Oliveira, & Borgo, 1977;Panizzi, 2013). Because of its higher population levels during the last decade, C. includens has attracted more attention and become a major concern for growers in many parts of Brazil.
Recent studies using inter-simple sequence repeat (ISSR) markers in Brazilian populations of C. includens have found low genetic diversity and no genetic structure across the area investigated (Palma, Maebe, Guedes, & Smagghe, 2015). However, not much is known about the adaptive potential of the looper in response to the recent expansion of farming to central and northern Brazil.
Here, we used sequence fragments of the genes cytochrome c oxidase subunit I (COI) and cytochrome c oxidase subunit II (COII), in addition to genotyping by sequencing (GBS), to investigate first, if populations are locally adapted, and second, the mechanisms underlying the evolutionary processes behind local adaptation in croplands in Brazil. Our explicit goals were (a) to use mitochondrial markers to reveal details of the C. includens phylogeographic history in the Americas, (b) to measure the levels of population genetic structure in important agricultural areas in Brazil, and (c) to test whether positive selection is taking place in different geographic regions and cultivated crops; and if confirmed, then to investigate which environmental features are associated with the ongoing adaptive processes. In a broader context, our goal was to shed more light on the adaptive processes that allow previously unimportant insects to become significant pests. TA B L E 1 Information about sampled locations of 12 study areas in Brazil for SNP marker sequencing. Two locations (BACO and MTCV) were sampled during both the soybean-and cotton-growing seasons. N GEN refers to the number of insects successfully sequencing using SNP markers Considering the geographic range of soybean and cotton cultivation, we directed our sampling efforts to cover the most ecologically or climatically diverse combination of natural areas (the Atlantic Forest, Brazilian Cerrado, and Caatinga domains) and crop types that could provide a realistic view of the Brazilian agriculture mosaic (Table 1 and Appendix S1).
DNA was extracted from the insects by using an adapted CTAB (2%) extraction protocol and resuspended in 70 µl of Milli-Q water (Doyle & Doyle, 1987). DNA extraction quality control was first performed by visual inspection in agarose gel (0.8% w/v), followed by spectrophotometric and fluorometric methods. Samples were tested in a NanoDrop spectrophotometer and agarose gel to check for contaminants and DNA integrity, respectively, followed by a Qubit assay to estimate the DNA concentration in each sample. Only samples with a concentration above 15 ng/µl were included in the subsequent steps.

| Landscape and landscape genetic analyses
We used only satellite images with a maximum of two months' difference between the sampling time and the satellite image capture. We used the LecoS plug-in, which is based on metrics taken from FRAGSTATS v4, to estimate the landscape metrics based on the composition of each attribute in the landscape (Jung, 2016;McGarigal, Cushman, Neel, & Ene, 2012). For "patch-based metrics" (e.g., bare soil, native forest, soybean), land cover, landscape proportion, edge length, edge density [ED = (Total length (m) of the edge in landscape/ Total landscape areas m 2 ) *10,000], number of patches, and mean patch area were collected to describe landscape structure and composition. For the "landscape-level metrics," the Shannon diversity index (SHDI) and Shannon evenness index (SHEI) were estimated (Shannon & Weaver, 1949).
The SHDI is a measure of diversity that is commonly applied in community-ecology studies but can also be used to measure diversity in landscapes. When the SHDI is zero, the landscape contains only one patch, and therefore, no diversity is observed at that par- Principal component analysis (PCA) was used to reduce the dimensionality and to simplify pattern recognition of the landscape and climate variables. We selected land cover as our patch-based metric to study the response to the landscape composition at the genomic level. The 14 land-cover attributes were transformed with PCA to reduce the dimensionality, and the first PCA coordinate was used as a predictor. We selected land cover as our primary landscape variable based on our a priori knowledge that C. includens strongly responds to host availability in the field (Andrade et al., 2016;Santos et al., 2017). The same approach was also used for the climate variables; 19 WorldClim variables (www.world clim.org/bioclim) had their dimensional space reduced by PCA transformation. This approach allowed us to assess the position of each putative population in environmental space (climate + hosts). We used the coordinates of the projections on the first PC axis as our independent variables.
The importance of each variable in the PCA was presented as the variable contribution to the first PC axis, which was given by squaring the product of the loadings by the component standard deviation and dividing it by the square of the total PCA loadings multiplied by the component standard deviation (Kassambara, 2017).

| GBS library preparation and sequencing
Two libraries containing 75 samples each (150 insects in total), from 12 sample sites (Table 1), were processed using the GBS protocol (Elshire et al., 2011). Samples from three locations did not meet the quality standards and were therefore not included in the GBS library preparation. Only one enzyme (PstI, CTGCAG) was used during the genome digestion step to reduce genome complexity. After library preparation, DNA samples were quantified by qPCR, and the size distribution checked using an Agilent BioAnalyzer. Libraries were sequenced on an Illumina HiSeq 2500 (100-bp single-end reads) at the Animal Genome Centre at USP/ESALQ (see Appendix S3 for details).

| mtDNA sequence analysis
For COI-COII concatenated sequences, we calculated the haplotype diversity and nucleotide diversity parameters using DnaSP ver. 5 (Librado & Rozas, 2009). The analysis of molecular variance (AMOVA), using the samples collected as a putative population, was also carried out using Arlequin with parametric bootstrapping (1,000 replicates) and significance at the 5% level (Excoffier, Smouse, & Quattro, 1992). A network of median-joining haplotypes was generated using the PopART software program to reconstruct the genealogical relationships among the COI-COII haplotypes. The hypothesis of demographic expansion was tested using Tajima's D and Fu's Fs tests of selective neutrality in Arlequin 3.1 (Excoffier, Laval, & Schneider, 2006). Both tests used 1,000 random samples, using coalescent simulations and the "Infer from distance matrix" option for "Haplotype definition." We also tested the goodness of fit of the observed mismatch distribution to that expected under the spatial expansion model. The sum of squared deviations (SSD) and Harpending's Raggedness index statistics and their associated p-values were calculated using Arlequin 3.1 (Excoffier et al., 2006) with 1,000 replicates. Finally, we used additional COI sequences downloaded from BOLD Systems to estimate C. includens COI haplotype relationship and genetic diversity throughout the Americas. Details of the method are shown in Appendix S2.

| GBS data analysis
Raw sequences were demultiplexed, trimmed to 90 bp, and filtered according to the Phred scores using process-radtags in stacks 2.2, using default settings (Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013). We proceeded with the analysis using the five major modules available in STACKS in a de novo approach. During the first step, the reads were used to build RAD loci (restriction site-associated DNA marker loci) using ustacks to form putative loci present in each sample file. At this stage, two critical parameters are the minimum number of raw reads to form a putative allele (m = 3) and the number of mismatches allowed between stacks to merge them into a single allele (M = 4). Setting m to 3 has shown good performance for a broad range of data and represents a good trade-off between coverage and polymorphs (Paris, Stevens, & Catchen, 2017;Rochette & Catchen, 2017). The criterion to merge stacks into a single locus (M) is highly dependent on the data and required a balance on the homolog and paralogue loci detection. In general, the presence of a higher degree of polymorphs requires higher values for M, but here we tested a range of values (M1-M8) to identify where the parameter plateaus (Paris et al., 2017), which proved to be ~4 ( Figure   S3). Next, STACKS creates a catalog containing consensus loci using cstacks, using 4 as the number of mismatches allowed between sample loci (n = 4). We used the general criteria of M = n. In the following step, STACKS matches loci found in each individual against the catalog, using default settings. Using this approach, STACKS genotyped a total of 340,089 loci with mean effective per-sample coverage of 97.7× (stdev = 27.9×), minimum coverage 10.1×, and maximum cover- Last, we used the population module to filter missing data, retaining only loci present in 80% or more individuals within a sampling location (r = .8) and ~80% or more across sampling locations (p = 8). Minor allele frequency cutoff was set at 0.05, and maximum observed heterozygosity limited to 50%. The resulting dataset was saved as VCF, genepop, and structure files and converted to other formats using PGDSPIDER 2.0 when necessary (Lischer & Excoffier, 2012). The percentage of missing data allowed and the number of markers implemented to estimate parameters can significantly affect estimates of genetic diversity; therefore, we calculated genetic diversity parameters varying both the overall percentage of missing genotypes and the number of SNPs to find the most stable values. In our dataset, the most stable estimates (i.e., that stop varying as we increase the number of SNPs used) were acquired with more than 5,000 SNPs, while no more than 80% of overall missing genotypes were allowed (see Appendix S3).

| Demographic history
We reconstructed the demographic history of C. includens using the multi-locus method Extended Bayesian Skyline Plot (EBSP) implemented in BEAST version 1.8.2 (Bouckaert et al., 2014;Drummond, Suchard, Xie, & Rambaut, 2012). Because of the lack of genetic structure found in the F ST and structure analysis, we randomly selected 50 neutral sequences (not flagged in our analysis as putative candidates under selection) from a simple pooled population to perform the demographic reconstruction analysis (Calderón et al., 2016). Site model and clock model partitions were linked, whereas the tree partitions remained unlinked. The HKY substitution model was implemented because it allowed us to work with fewer parameters. A strict clock model was implemented using a fixed mutation rate across the genome of 2.9 × 10 −9 mutations/site/generation. The mutation rate was selected based on similar organisms, as no value is available for this species (El-Shehawi & Elseehy, 2016;Keightley, Ness, Halligan, & Haddrill, 2014;Oppold & Pfenninger, 2017). We assumed that the species has 10 generations per year. The chain length of 100 million generations was sampled every 4,000 states, and a 20% burn-in was implemented in three independent runs. The results were inspected in Tracer, examining the posterior ESS (effective sample size) and the convergence among the runs.

| F ST , population structure, and gene flow
We assessed genetic partitioning by means of the software STRUCTURE, using a single SNP per locus (i.e., --write_single_snp was used in population module in STACKS) (Pritchard, Stephens, & Donnelly, 2000). An initial burn-in of 1.5 × 10 5 was used, followed by 5 × 10 5 Markov chain Monte Carlo (MCMC) interactions, using 10 replications for each K value without prior population information.
The analyses were run from K = 1 to K = 10, allowing allele correlation and an admixture model. The most likely number of clusters was determined using the Evanno method implemented in STRUCTURE HARVESTER 0.6.93 (Earl & vonHoldt, 2012). The STRUCTURE results were averaged using CLUMPP (Jakobsson & Rosenberg, 2007) and visualized using DISTRUCT (Rosenberg, 2004). Additionally, we used principal component analysis (PCA) implemented in ade4 and adegenet to investigate the underlying structure.
Pairwise differentiation between sampling locations was estimated based on F ST estimation calculated in STACKS (Weir, 1996). A heatmap was produced to illustrate the F ST relationships, using the R program for both the complete set of SNPs and outliers. To test the correlation between genetic distance and geographic distance, a Mantel test, including an all-pairwise-comparison matrix, was performed with 10,000 permutations, using the R libraries ecodist (Goslee & Urban, 2007) and ade4 (Dray & Dufour, 2007).

| Outlier detection and associating genomic variation with environmental characteristics
We used Lositan to detect putative loci under natural selection. The method implemented in Lositan simulates the joint distribution between F ST and heterozygosity (H e ) under neutrality (Antao, Lopes, Lopes, Beja-Pereira, & Luikart, 2008), assuming the island model of migration with neutral markers. The neutral mean F ST was calculated using 100,000 simulations, with a confidence interval of 0.99 and a false discovery rate (FDR) of 0.05. To assure more stringent detection parameters, we selected the "force mean F ST " and "neutral mean F ST " options. We also used a Bayesian approach that estimated the posterior probability of a given locus to be under selection, contrasting this assumption with the neutral model. A second analysis was implemented in BayeScan v2.1 (Foll & Gaggiotti, 2008). A total of 20 pilot runs with 5,000 interactions each were generated before computing Markov chains. Burn-in was set to 50,000 steps, and the chains had 5,000 interactions with a thinning interval of 10. We considered prior odds of 10 and a threshold of a posterior P of >0.95 to select an SNP as "under positive selection." In both analyses, we contrasted sampling locations, except the cotton areas (BACOC and MTCVC) that partially overlapped in time and space with soybean areas (BACOS and MTCVS). Contrasts between soybean locations and cotton locations were performed separately.
We used a series of partial Mantel tests with 10,000 randomizations to test the correlation between the variation in F ST outliers (putative loci under directional selection) detected by both Lositan and BayeScan, and landscape and climate variables. The partial Mantel test accounts for the nonindependence of points by creating a random distribution to use as a null distribution and test for significance (Mantel, 1967). Additionally, a partial correlation for each variable is Additionally, the presence of loci that correlate with environmental variables (climate and landscape composition) was also tested using a latent factor mixed-effect model (LFMM) implemented in R package LEA (Frichot & François, 2015). The advantage of the LFMM algorithm is to correlate markers and environmental values while taking into account possible levels of population structure (i.e., latent factors). We defined latent factor 2 (K = 2) based on "snmf." Again, for this analysis, individuals collected from cotton landscapes were removed, as were individuals with more than 10% missing data. The two variables were used to test the associations with climate and landscape. To test the association, we used the coordinates of the first PC of WorldClim variables and the landscape composition (percentage of each component of the landscape) described in the previous section ("Landscape and landscape genetic analyses"). The Gibbs sampler algorithm was run for 10,000 cycles, following a burn-in of 5,000 cycles. To control for false-positive detection, we used the Benjamin-Hochberg algorithm, adjusting the false discovery rate (FDR) q = 10%. Putative loci under selection detected by more than one method were used as queries in a BLASTx search against the NCBI database, using Blast2GO (Götz et al., 2008). Ontologies with e-value <1×10 −3 were used for annotation.

| Mitochondrial COI and COII variation
The mitochondrial network (COI-COII) of Brazilian C. includens revealed a star-like topology of the haplotype relationships, with a central haplotype (haplotype H1) that is widely distributed and in a high frequency, and additional haplotypes at low frequencies (11 total), connected to the central haplotype by a single mutation step ( Figure 1b). The ubiquitous haplotype H1 was recovered from 73.8% of the individuals and was present at all sample sites. Eight haplotypes were singletons (i.e., they occurred only once in our dataset), and three were present in low frequency (H2, H6, and H12) and geographically restricted to certain sample sites.

| Demographic history
The analyses of both mitochondrial sequences and SNP markers indicated a demographic expansion of C. includens in Brazil. Neutrality

tests based on COI-COII indicated a population expansion, with
Tajima's D (D = −2.21; p < .001) and Fu's Fs (Fs = −12.16; p < .001). In addition, the model of a sudden expansion did not reject the hypothesis of a spatial expansion for C. includens (SSD = 0.0002; p = .60; Raggedness = 0.1090, p = .60). Similarly, the EBSP analysis based on 50 nuclear sequences showed a signature of recent demographic expansion in the C. includens populations. Based on the assumed molecular clock calibration (2.9 × 10 −9 mutations/site/generation), the expansion started ~300 years before the present (Figure 1c).

| Landscape/environmental composition
There was a tendency for agricultural landscapes with higher Shannon diversity indexes to also show more evenly distributed proportions of vegetation cover types across the different classes, although this relationship was not statistically significant (r = .47, p = .17) (Figure 2b).
Landscapes with a higher diversity of patch-fragment classes and more evenly distributed patches were found in MGAR and SPCB. The  Figure 4a). However, nonsignificant isolation by distance at the 5% level was detected (r = .30, p = .06). The STRUCTURE analysis confirmed a low degree of differentiation among the sampling locations. However, a higher value of K (K = 4) was detected by the Evanno method (Earl & vonHoldt, 2012), even though populations were fully admixed (Appendix S7).

| Landscape/Environmental correlation and association study
Lositan yielded   were used to control for secondary factors, a slight increase in the correlation between the landscape distance and genetic distance was observed (Table 3).
Using a latent factor of 2 and controlling FDR at 10%, we obtained 39 candidate SNPs associated with the landscape variables, and 7 associated with climate variables, using the LFMM approach

| D ISCUSS I ON
Our results based on the mtDNA and SNP markers provided strong support for the assumption that the C. includens population is undergoing demographic expansion. The results of the analysis (considering only individuals sampled in Brazil) are consistent with the history of diversification and demographic expansion observed in the Americas (i.e., similar haplotype network patterns). The demographic reconstruction traces the beginning of the C. includens population expansion back to the year 1,700, coinciding with the period of colonization of the Americas. However, the demographic reconstruction (i.e., Skyline analysis) must be cautiously interpreted, given that the genome-wide mutation rate for this particular species is not known (El-Shehawi & Elseehy, 2016;Keightley et al., 2014;Oppold & Pfenninger, 2017).
The pattern of low diversity and genetic structure observed with the mtDNA markers is congruent with nuclear markers and supports previous findings with ISSR markers (Palma et al., 2015).  presence and phenological stage of local soybean crops, where the period of high moth density in the area lasted only 60 days (Santos et al., 2017). This would indicate an abrupt population decrease, followed by a complete absence of insects from the area for a pro-

| Natural selection and local adaptation
We have found compelling pieces of evidence that natural selection is shaping populations of C. includens in Brazil. These adaptive processes can be linked to the recent change in the pest status of C. includens. The insects are widely distributed and therefore experience environmental and habitat-associated factors differently at different locations. Hence, both abiotic factors such as temperature and humidity, and biotic factors such as host plants can be putative drivers of the evolutionary process through natural selection. Based on the selection screenings, landscape composition seems to be a more important driver of the adaptive process than climate conditions.
Inevitably, extensive monoculture areas will not only favor the performance of certain herbivores but impose strong selection pressures on the populations of pest insects (Via, 1990;Wetzel, Kharouba, Robinson, Holyoak, & Karban, 2016 (Bernardi et al., 2012;Stacke et al., 2019;Yano et al., 2016). However, this scenario can become more problematic when we consider the advance of agricultural frontiers into northern Brazil, increasing the soybean-producing area even further.
Considering the rapid habitat transformations in the Cerrado region in recent decades, local adaptation is an expected response for a population under such intensive selective forces in highly homogeneous areas (Mopper & Strauss, 1998). The greatest concern is that, once present, resistance to Bt or insecticides would spread quickly through the population. Among the proteins that may be associated with the management of pest-insect resistance and host adaptation, a UDP-glycosyltransferase protein (detected by LFMM and Lositan) has been detected (Kaplanoglu, Chapman, Scott, & Donly, 2017;Krempl et al., 2016). Further functional analysis and continued resistance monitoring may help to assess whether resistance is developing and spreading.

| Host selection and agricultural landscapes
Understanding how genetic diversity is distributed over the range of a species is essential to study patterns of gene flow and to assess adaptive trends. Knowledge of the genetic structure of populations will enable us to more reliably infer which biotic and abiotic factors may be responsible for gene-pool isolation or the lack of it (Avise, 1992;Slatkin, 1987). Habitat patchiness and host plant selection and distribution can counteract the effect of gene flow, producing genetic structure at fine spatial scales (Mopper, 1996). The evolution of host-specialized races within polyphagous species is often reported (Forbes et al., 2017;Mopper, 1996;Thompson, 1988;Wood, Tilmon, Shantz, Harris, & Pesek, 1999). The formation of sympatric races in pest populations is not well understood, but extensive crop fields impose strong selection pressures on herbivory-related adaptations, causing evolutionary divergence at fine scales (Carroll & Boyd, 1992;Feder et al., 1994;Gouin et al., 2017;Via, 1991). It is difficult to predict whether the differentiation by host reported here will increase or will be eroded by gene flow in the long term, particularly in unstable environments such as the agricultural systems. Moreover, the difference between insects collected from cotton and soybean plants was evident in only one of the two sampling locations (BACOC), for which the Blast2GO results revealed that feeding behavior associated with aminotransferase, amidases, carboxypeptidase, and carboxylase might be changing. BACOC had a much less diverse landscape, where cotton crops cover up to 14% of the land, and soybean was absent during the cotton-growing period, whereas MTCVC had a more diverse landscape, where cotton covered only 5% of the total area, sharing the landscape with soybean fields. The smaller proportion of cotton in MTCVC, the greater degree of fragmentation (smaller patch sizes), and the presence of soybean could explain the different outcomes regarding the natural selection process in C. includens when cotton plants were the main driver. Further studies are needed to reinforce the evidence of host-related race formation and the functional importance of the candidate genes listed here.

| CON CLUS IONS
Both the mitochondrial and SNP markers revealed a pattern of low genetic diversity and a weak or absent genetic structure in C. includens in Brazil. This pattern could be attributed to recurrent evolutionary processes such as gene flow, or to a historical process associated with demographic expansion and host-related adaptations. Analyses of the landscape composition suggested that the increase in the soybean crop area is likely important for the process of local adaptation, in particular in relation to host-specific adaptations. Landscape composition and configuration may explain the still-incipient differentiation between the hosts; and no effect of climate variables was detected. Our results reinforced the general recommendation for polyculture and for more diverse agroecosystems as strategies to prevent the rapid evolution of undesirable traits.
Polyculture including one or more nonhost plants often limits the total number of pest insects relative to the numbers in monoculture systems, by reducing food resources and offering more opportunities for antagonistic ecological interactions with natural enemies, for instance (Risch 1981).
The present data increase our understanding of how the host range and future problems of Bt and pesticide resistance could affect C. includens. Understanding how adaptive processes shape the evolutionary trajectory of pest populations will not prevent the ongoing transformation of natural populations, but can guide us in a search for better resistance monitoring and crop-succession management over time.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R S' CO NTR I B UTI O N
ASC, CSS, EMGC, GH, PMD, and RAC conceived and designed the study. CSS collected the data. ASC, CSS, and EMGC analyzed the data. ASC, PMD, and RAC provided reagents, analytical tools, and financial support. ASC, CSS, and EMGC wrote the manuscript. All authors read, corrected, and approved the manuscript.

DATA AVA I L A B I L I T Y S TAT E M E N T
Detailed information regarding the parameter selection in stacks, data curation, haplotype information, and landscape descriptions is included in the supplementary material. Raw fasta files of Illumina sequences were included in the SRA-NCBI repository (PRJNA576576).