Genetic diversity of the rain tree (Albizia saman) in Colombian seasonally dry tropical forest for informing conservation and restoration interventions

Abstract Albizia saman is a multipurpose tree species of seasonally dry tropical forests (SDTFs) of Mesoamerica and northern South America typically cultivated in silvopastoral and other agroforestry systems around the world, a trend that is bound to increase in light of multimillion hectare commitments for forest and landscape restoration. The effective conservation and sustainable use of A. saman require detailed knowledge of its genetic diversity across its native distribution range, of which surprisingly little is known to date. We assessed the genetic diversity and structure of A. saman across twelve representative locations of SDTF in Colombia, and how they may have been shaped by past climatic changes and human influence. We found four different genetic groups which may be the result of differentiation due to isolation of populations in preglacial times. The current distribution and mixture of genetic groups across SDTF fragments we observed might be the result of range expansion of SDTFs during the last glacial period followed by range contraction during the Holocene and human‐influenced movement of germplasm associated with cattle ranching. Despite the fragmented state of the presumed natural A. saman stands we sampled, we did not find any signs of inbreeding, suggesting that gene flow is not jeopardized in humanized landscapes. However, further research is needed to assess potential deleterious effects of fragmentation on progeny. Climate change is not expected to seriously threaten the in situ persistence of A. saman populations and might present opportunities for future range expansion. However, the sourcing of germplasm for tree planting activities needs to be aligned with the genetic affinity of reference populations across the distribution of Colombian SDTFs. We identify priority source populations for in situ conservation based on their high genetic diversity, lack or limited signs of admixture, and/or genetic uniqueness.


| INTRODUCTION
Albizia saman (Jacq.) Merr. (Fabaceae) is a multipurpose tree species occurring naturally in the seasonally dry tropical forests (SDTF) from southern Mexico to Colombia and Venezuela (Cascante, Quesada, Lobo, & Fuchs, 2002; Durr, 2001). It is valued for its edible fruit pulp and medicinal properties (Leonard & Sherratt, 1967) and the production of an exudate with industrial applications (Subansenee, 1994), whereas its wood is exploited for manufacturing furniture and crafts (Escalante, 1997). However, by far the most important use of this wide-canopied tree is in agroforestry systems (Figure 1), owing to its rapid growth, the shade provided by its thick foliage, the nutrient-rich fodder produced by its leaves and fruits, and positive effects on the productivity of soils and grazing land (Allen & Allen, 1981; Durr, 2001; Roshetko, 1995).
These useful traits have been a major motivation for the introduction of A. saman from its native distribution in Central and northern South America to other tropical areas in the Americas and the rest of the world, where it has often become naturalized (CABI, 2018) (Figure 2).
The sustainable use and effective conservation of A. saman require detailed knowledge of the species' genetic diversity across its native distribution range, of which surprisingly little is known to date. Only one study has investigated the effects of fragmentation on the reproductive success and genetic structure of the species, in northwestern Costa Rica (Cascante et al., 2002). While a better understanding of population responses to threats like fragmentation is essential for guiding conservation and management interventions, it needs to be supplemented with information on genetic differentiation of populations across their distribution ranges.
Aside from the idiosyncrasies of their life history traits and reproductive biology (Duminil et al., 2007; Lowe et al., 2018), the contemporary genetic structure of tree species is influenced most notably by their response to past changes in climate and more recent anthropogenic disturbances. One of the most important impacts of climate change in the recent past on neotropical SDTFs is conceptualized through the dry forest refugia hypothesis (DFRH) (Mayle, Beerling, Gosling, & Bush, 2004; Pennington, Prado, & Pendry, 2000; Prado & Gibbs, 1993). The DFRH postulates that the current wide distribution of numerous tree species in disjointed areas of SDTF is the result of the contraction of an extensive and continuous formation of the biome during the last glacial period (18,000-12,000 BP) to the remnants observed today (Pennington et al., 2000; Prado & Gibbs, 1993).
The postglacial isolation of tree species populations in different SDTF fragments is likely to have initiated processes of genetic differentiation but seems to have been too short to be detected in the current population genetic structures. An increasing body of evidence suggests that the formation of different genetic groups in SDTF tree species may predate the late Pleistocene (Bocanegra-González et al., 2018; Caetano et al., 2008; Collevatti et al., 2012; Vitorino, Lima-Ribeiro, Terribile, & Collevatti, 2016). According to the DFRH, one would thus expect a similar disjointed distribution of genetic groups within tree species present in different SDTF fragments that used to be connected during the last glacial period. Here, we test this hypothesis for A. saman in Colombian SDTF.
Tree species populations in SDTF fragments have been subject to anthropogenic disturbance since pre-Columbian times (Banda-R et al., 2016; Murphy & Lugo, 1986). Conversion of mature SDTF in Colombia to land for human settlements and crop production and pastures for cattle ranching intensified during the European colonization period and particularly so during the past century (Etter, 2015; Vina & Cavelier, 1999). Today, SDTF is one of the most threatened ecosystems worldwide (Janzen, 1988; Miles et al., 2006). In Colombia, less than 8% of the original SDTF cover remains, occurring in a highly fragmented state (González-M et al., 2018). Particularly in predominantly outcrossing species such as A. saman, fragmentation of populations is known to negatively affect the reproduction, gene flow, and genetic diversity of tree populations, resulting in increased risk of inbreeding depression in progeny and loss of genetic diversity and fitness due to low numbers of mating partners and low pollen diversity (Aguilar, Ashworth, Galetto, & Aizen, 2006; Aguilar, Quesada, Ashworth, Herrerias-diego, & Lobo, 2008; Lowe, Boshier, Ward, Bacles, & Navarro, 2005).
Here, we elucidate the genetic diversity, structure, and inbreeding state of A. saman populations across the main SDTF fragments in Colombia, located in the Caribbean region and the Cauca, Magdalena, Chicamocha, and Patía river valleys (Pizano, Cabrera, & García, 2014), and relate these to the effects of past climate change and anthropogenic disturbance. Based on our findings, we identify priority areas for in situ conservation of A. saman and make some recommendations on the potential use of populations as seed sources in future tree planting efforts.

KEYWORDS agroforestry, climate change, microsatellites, paleodistribution, seed zones, suitability modeling

FIGURE 1 Typical use of Albizia saman as shade and fodder tree in pasture land in Colombia

| Field sampling
We collected leaf material from 100 reproductive individuals of A. saman between July 2014 and January 2016 across twelve representative locations of SDTF in Colombia. Sampled trees were separated by at least 50 m to avoid the collection of highly genetically related individuals (Gonzalez & Quintero, 2017

| DNA extraction and PCR amplification
Young and healthy leaves of sampled A. saman trees were preserved in paper bags and dried with silica gel prior to processing in the laboratory. Total genomic DNA was isolated from dried plant material using 80 mg of leaf tissue in accordance with the CTAB method (Doyle & Doyle, 1990) with modifications following Alzate-Marin, Guidugli, Soriani, Martinez, and Mestriner (2009), Novaes, Rodrigues, and Lovato (2009), and Verbylaite, Beisys, Rimas, and Kuusiene (2010). Genetic characterization was carried out by means of twelve specific microsatellite markers (Kasthurirengan, Xie, Li, Fong, & Hong, 2013).
Each PCR reaction was carried out in a total volume of 15 μl con-

| Diversity mapping and genetic structure
We visualized geographic patterns in nSSR diversity on raster maps of 30 arc seconds resolution by constructing circular neighborhoods of 10 arc minutes diameter (~18 km at the equator) around the locations of all the genotyped A. saman, following Thomas et al. (2012). In practice, this means that each tree was replicated in all the 30 arc second grid cells contained in a circle with diameter of 10 arc minutes constructed around its location.
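The replication step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a simple latitude/longitude grid, measures distances in degrees (a planar approximation that is reasonable near the equator), and assigns a tree to every cell whose center falls inside the circle. The function names and the example coordinates are hypothetical.

```python
import math

CELL_DEG = 30 / 3600         # grid resolution: 30 arc seconds, in degrees
RADIUS_DEG = (10 / 60) / 2   # half of the 10 arc minute diameter, in degrees

def cells_in_neighborhood(lon, lat, cell=CELL_DEG, radius=RADIUS_DEG):
    """Return (col, row) indices of grid cells whose centers lie in the circle."""
    col0 = math.floor(lon / cell)
    row0 = math.floor(lat / cell)
    reach = math.ceil(radius / cell) + 1  # search window, in cells
    cells = []
    for col in range(col0 - reach, col0 + reach + 1):
        for row in range(row0 - reach, row0 + reach + 1):
            center_lon = (col + 0.5) * cell
            center_lat = (row + 0.5) * cell
            # Planar distance in degrees (assumption, see lead-in)
            if math.hypot(center_lon - lon, center_lat - lat) <= radius:
                cells.append((col, row))
    return cells

def replicate_trees(trees):
    """Map each grid cell to the ids of all trees replicated into it.

    trees: iterable of (tree_id, lon, lat) tuples.
    """
    grid = {}
    for tree_id, lon, lat in trees:
        for cell in cells_in_neighborhood(lon, lat):
            grid.setdefault(cell, []).append(tree_id)
    return grid

# Hypothetical coordinates: two nearby trees and one distant tree
grid = replicate_trees([("t1", -75.0, 5.0), ("t2", -75.0, 5.02), ("t3", -74.0, 5.0)])
```

Cells that fall inside the neighborhoods of several trees end up with several tree ids, which is what produces the unequal sample sizes per grid cell.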
As this replication exercise resulted in different numbers of trees per grid cell, in a next step we performed a sample bias correction by calculating genetic parameters as the average values obtained from 1,000 bootstrapped subsamples of the minimum sample size of 3 trees per grid cell. Grid-based calculations of genetic parameters included allelic richness, the Shannon information index, expected and observed heterozygosity, the inbreeding coefficient, and the number of locally common alleles (LCA) per locus. LCA are alleles that are restricted to a limited area of a species' distribution (here < 25% of the sampled populations) but reach relatively high frequencies (here > 5%) in those areas. High LCA richness can be indicative of the level of genetic isolation of populations (Frankel, Brown, & Burdon, 1995a) and can hence be helpful for identifying putative refugia (Marchelli, Thomas, Azpilicueta, Zonneveld, & Gallo, 2017; Thomas et al., 2012).

FIGURE 2 Global distribution of Albizia saman (red dots). Its native area is believed to be restricted to the region from southern Mexico to Colombia and Venezuela, but it has been introduced to tropical areas all around the world. The countries that are believed to be part of the native range of the species are shown in green, and the locations of the trees sampled in the current study are shown as yellow dots
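The sample bias correction and the LCA criterion can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: subsamples are drawn without replacement, allelic richness is reduced to a plain allele count at a single locus, and all function names and allele values are hypothetical. Only the thresholds (3 trees, 1,000 subsamples, < 25% of populations, > 5% frequency) come from the text.

```python
import random

def allele_count(genotypes):
    """Number of distinct alleles among diploid genotypes at one locus."""
    return len({allele for genotype in genotypes for allele in genotype})

def bootstrapped_richness(genotypes, n_sub=3, n_boot=1000, seed=0):
    """Average allele count over n_boot random subsamples of n_sub trees.

    Returns None for cells holding fewer than n_sub trees.
    """
    if len(genotypes) < n_sub:
        return None
    rng = random.Random(seed)
    total = 0
    for _ in range(n_boot):
        # Assumption: subsampling without replacement
        total += allele_count(rng.sample(genotypes, n_sub))
    return total / n_boot

def locally_common_alleles(freqs_by_pop, max_pop_share=0.25, min_freq=0.05):
    """Return {population: set of LCA} for one locus.

    freqs_by_pop: {population: {allele: frequency}}.
    An allele counts as locally common in a population if it occurs in
    fewer than max_pop_share of all populations and its frequency in
    that population exceeds min_freq.
    """
    n_pops = len(freqs_by_pop)
    presence = {}
    for freqs in freqs_by_pop.values():
        for allele, freq in freqs.items():
            if freq > 0:
                presence[allele] = presence.get(allele, 0) + 1
    return {
        pop: {a for a, f in freqs.items()
              if f > min_freq and presence[a] / n_pops < max_pop_share}
        for pop, freqs in freqs_by_pop.items()
    }
```

In this sketch every grid cell is evaluated at the same effective sample size of three trees, so cells that received many replicated trees are not favored over sparsely sampled ones.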
We submitted our data to Bayesian cluster analysis in STRUCTURE (Pritchard, Stephens, & Donnelly, 2000) using an admixture ancestry model without consideration of sampling localities. The number of groups (K) tested varied between 1 and 8, using burn-in periods of one million steps and 10 million additional replications. For each value of K, we carried out 10 independent repetitions. We used the method of Evanno, Regnaut, and Goudet (2005) for detection of the most probable number of genetically homogeneous clusters (K), through calculation of ΔK as implemented in the STRUCTURE HARVESTER software (Dent & VonHoldt, 2011). Complementary genetic analyses such as FST (Nei, 1973) and AMOVA (Excoffier, Smouse, & Quattro, 1992) were carried out in R packages adegenet (Jombart, 2008) and poppr (Kamvar, Tabima, & Grünwald, 2014).
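For readers unfamiliar with the ΔK statistic, the following Python sketch shows one common way of computing it from the log-likelihoods of replicate STRUCTURE runs: the absolute second-order difference of the mean log-likelihood, divided by the standard deviation of the log-likelihood across runs at that K. It is a simplified illustration, not a reimplementation of STRUCTURE HARVESTER, and the log-likelihood values below are invented for the example.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Compute delta K for every K that has both neighbors K-1 and K+1.

    lnp_by_k: {K: [ln P(D) of each replicate run at that K]}.
    """
    mean_l = {k: mean(runs) for k, runs in lnp_by_k.items()}
    sd_l = {k: stdev(runs) for k, runs in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in mean_l and k + 1 in mean_l and sd_l[k] > 0:
            # |L(K+1) - 2 L(K) + L(K-1)|, computed on the run means
            second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
            result[k] = second_diff / sd_l[k]
    return result

# Invented log-likelihoods for two replicate runs per K (illustration only)
example = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -892.0],
    4: [-885.0, -887.0],
}
dk = delta_k(example)
best_k = max(dk, key=dk.get)
```

Because the statistic needs both neighbors, it is undefined for the smallest and largest K tested, so testing K = 1 to 8 yields ΔK values for K = 2 to 7 only.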

| Suitability modeling
We characterized the spatial distribution of favorable habitat for A. saman in Colombian SDTFs under different climatic conditions by means of suitability mapping based on ensembles of modeling algorithms, implemented in R package BiodiversityR (Kindt, 2018). Presence data collected during our field sampling were complemented with Colombian records extracted from numerous sources (www.gbif.org; the national herbaria MEDEM, HUA, MEDEL, COL, CUVC, VALLE and TULV; www.dryflor.info; www.orinoquiabiodiversa.org; www.sibcolombia.net). We only included records located in SDTF as defined by the combination of Etter, McAlpine, and Possingham (2008) and García et al. (2014). As a result, 151 unique presence points were used for suitability modeling. Background points (an overall maximum of 10,000 and maximum one per grid cell) were randomly selected from the area enclosed by a convex hull polygon constructed around all presence points and extended with a buffer corresponding to 10% of the polygon's largest axis. We applied two different strategies for suitability modeling under past and future climate conditions. Model calibrations for projections to LGM and mid-Holocene climate conditions were carried out at 2.5 arc minutes and 30 arc seconds resolution, respectively, using only WorldClim climate layers (Hijmans, Cameron, Parra, Jones, & Jarvis, 2005) as explanatory variables. Model calibrations intended for projections to future climate scenarios (period 2040-2069; referred to as 2050s) were carried out at 30 arc seconds resolution, using aside from climate layers also altitude, slope, aspect, terrain roughness, direction of water flow, and seven major edaphic variables, obtained from ISRIC-World Soil Information (Hengl et al., 2014) with a geographical null model (Hijmans, 2012). We compared the cAUCs of each of the individual distribution models with the cAUCs of the geographical null model resulting from twenty iterations, by means of Mann-Whitney tests.
Only models that gave cAUC values that were significantly higher than the null model were retained for the construction of different model ensembles (Tables S1 and S2). In a next step, we calculated the cAUC values for all possible ensemble combinations of the retained models that included MAXENT, which is generally considered a superior suitability model (Tables S3 and S4). This resulted in 1,024 possible ensemble combinations for projections to current and future climate conditions and 8,192 possible ensemble combinations for projections to past climate conditions.
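The enumeration of ensembles can be sketched in Python as follows. The reported counts equal 2^10 and 2^13, which would be consistent with 10 and 13 retained models besides MAXENT if MAXENT on its own is also counted as a combination; that reading is an assumption, as the text does not state the number of retained models. The model names other than MAXENT are placeholders.

```python
from itertools import combinations

def ensembles_with_required(models, required="MAXENT"):
    """Yield every subset of models that contains the required model."""
    others = [m for m in models if m != required]
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            yield (required,) + subset

# Placeholder names for the other retained models (hypothetical)
future_models = ["MAXENT"] + [f"MODEL_{i}" for i in range(1, 11)]
past_models = ["MAXENT"] + [f"MODEL_{i}" for i in range(1, 14)]

n_future = sum(1 for _ in ensembles_with_required(future_models))
n_past = sum(1 for _ in ensembles_with_required(past_models))
```

Each optional model is either in or out of an ensemble, so n optional models give 2^n combinations.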
Each ensemble combination was constructed as the weighted average of its individual composing models, using their respective average cAUC values as weights. The ensemble that yielded the highest cAUC value was considered to generate the most appropriate scenario for projecting to past and future climate conditions, respectively (Table S5).
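A cAUC-weighted ensemble of this kind can be written in a few lines of Python. The sketch below assumes that each model's prediction is available as a list of suitability values on a common grid; the function name, model names, and numbers are hypothetical and serve only to illustrate the weighting.

```python
def ensemble_suitability(predictions, cauc):
    """Weighted average of model predictions, using mean cAUC as weights.

    predictions: {model: [suitability value per grid cell]}
    cauc: {model: average cAUC of that model}
    """
    models = list(predictions)
    total_weight = sum(cauc[m] for m in models)
    n_cells = len(predictions[models[0]])
    return [
        sum(cauc[m] * predictions[m][i] for m in models) / total_weight
        for i in range(n_cells)
    ]

# Hypothetical two-model ensemble over two grid cells
combined = ensemble_suitability(
    {"MAXENT": [1.0, 0.0], "OTHER": [0.0, 1.0]},
    {"MAXENT": 0.75, "OTHER": 0.25},
)
```

Dividing by the sum of the weights keeps the ensemble output on the same scale as the individual model predictions.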
To assess habitat suitability under mid-Holocene and LGM climate conditions, we carried out projections to two and three climate

| RESULTS
All twelve microsatellite markers yielded highly variable allele numbers per locus, ranging from 13 to 28 alleles. Overall, different genetic diversity measures (allelic richness, Shannon diversity, and expected heterozygosity) suggest that most of the sampled populations hold comparable levels of diversity (Table 1). Figure 4 shows the spatial distribution of the genetic diversity parameters against a background of the species' current habitat suitability and the historical distribution of SDTF in Colombia.
Analysis of molecular variance (AMOVA) indicated that 10.6% of the total genetic variation resided between populations, compared with 89.4% between individuals within populations. Pairwise FST values between sampling areas were generally low, with the exception of Patía, which yielded a mean FST value of 0.14, twice as high as that of the population with the second-highest mean value (ZAT, FST = 0.07; Table S6).
Analyses carried out in STRUCTURE showed support for two highly differentiated genetic clusters (K = 2): one grouping trees sampled in the Patía river valley (PAT) and another grouping all individuals from the rest of the country (Figure S1a). However, ΔK computation also showed support for K = 4, identifying three different subclusters in the second group (Figure S1a). Repeating the analysis with the exclusion of Patía samples similarly resulted in support for K = 3 (Figure S1b). Most of the sampling sites outside of Patía were composed of individuals assigned to two or three different clusters (Figure S1c; Figure 5). It is important to note here that, due to the modest sample sizes at some sampling sites, signals of potential genetic differentiation have to be interpreted with caution.
The modeled distributions of suitable habitat during past climates ( Figure 6) suggest that A. saman populations from SDTF in

| DISCUSSION
We assessed the genetic diversity distribution of A. saman populations across Colombian SDTF fragments and how it may have been shaped by past climatic changes and more recent human influences.
Our habitat suitability models during the LGM and mid-Holocene are consistent with the DFRH (Mayle et al., 2004; Pennington et al., 2000; Prado & Gibbs, 1993). Joint interpretation of modeling results and the genetic characterization data gives clues about the origin of the four genetic groups we identified in A. saman and suggests that the genetic differentiation of these groups is likely to predate the LGM, in line with similar findings for other SDTF species (Bocanegra-González et al., 2019; Caetano et al., 2008; Collevatti et al., 2012; Thomas et al., 2017b; Vitorino et al., 2016). Natural seed dispersal of A. saman is carried out by rodents, tapirs, and peccaries (Allen & Allen, 1981; Durr, 2001). However, in prehistoric times, now-extinct Pleistocene horses may also have been important dispersers, a role which today is likely to have been taken over by domesticated horses and cows (Janzen & Martin, 1982).

TABLE 1 Genetic parameters of Albizia saman estimated in 12 sampling sites located across Colombian seasonally dry tropical forests

Due to A. saman's popularity as a shade tree in pastures and farmland since the initiation of the European colonization and possibly before, reproductive material has been distributed extensively through human intervention, either as seeds or seedlings, or as seeds in the gut of domestic animals. As cattle breeding in Colombia became significant only in the seventeenth century (Etter, 2015), it is unlikely that the cultivation or human movement of A. saman for providing shade (and fodder) to cattle would have started much earlier. The largest A. saman trees we measured were found at La Paila (PAI) (Figure S2). All trees with DBH 3.5-4.5 m (which might be more than 400 years old; CABI, 2018) were assigned to cluster 3, suggesting that this cluster might have orig- (Table S6).
This group also showed the lowest levels of diversity of all sampling areas, in line with similar findings for Ceiba pentandra (Bocanegra-González et al., 2018). This is likely the consequence of long-lasting processes of genetic isolation exacerbated by more recent impact of anthropogenic degradation.
The absence of signs of inbreeding in nearly all localities we sampled, despite centuries of vegetation degradation, might be due to the combination of the admixed nature of populations (Figure 5) and the effective gene flow occurring between trees even in a fragmented landscape mosaic. Albizia saman is self-incompatible and pollinated by moths in the Sphingidae family, which are capable of crossing distances >500 m and visiting several trees during one night (Cascante et al., 2002; Haber & Frankie, 1989). Only the trees sam- Despite the potential negative consequences of fragmentation, Cascante et al. (2002) . The selection of germplasm that is adapted to different planting conditions is not always trivial and requires knowledge of the nature of genotype-by-environment (GxE) interactions, which is typically derived from provenance and progeny trials. Such trials for A. saman currently do not exist or are not mature enough in Colombia to guide decision making. In the absence of GxE data, the genetic groups we identified here can serve as a first entry point to guide the selection of adapted planting materials and avoid genetic pollution (Azpilicueta et al., 2013; Thomas et al., 2017a). Particularly in areas where all or most trees we sampled pertained to one single genetic cluster, such as the Patía (PAT) and Chicamocha (CHI) river valleys as well as Tayrona

ACKNOWLEDGMENTS
The authors wish to thank the Colombian companies Ecopetrol and Empresas Publicas de Medellin, the Government of the Colombian department of Antioquia, the CGIAR Fund Donors (https://www.cgiar.org/funders/), and the CGIAR research program on Forest Trees and Agroforestry for financial support. We are grateful for the comments of the handling Editor and two anonymous reviewers which considerably helped to improve the manuscript. The authors declare no competing interests.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTIONS
ET, LGMH, and CAC designed the study; CAAM, CAC, and LGMH carried out field work; CAAM and JG carried out laboratory work; ET and CAAM performed data curation and statistical analyses and prepared the first draft of the manuscript. All authors contributed to revisions of the manuscript.