Incorporating phylogenetic information for the definition of floristic districts in hyperdiverse Amazon forests: Implications for conservation

Abstract Using complementary metrics to evaluate phylogenetic diversity can facilitate the delimitation of floristic units and conservation priority areas. In this study, we describe the spatial patterns of phylogenetic alpha and beta diversity, phylogenetic endemism, and evolutionary distinctiveness of the hyperdiverse Ecuadorian Amazon forests and define priority areas for conservation. We established a network of 62 one‐hectare plots in terra firme forests of the Ecuadorian Amazon. In these plots, we tagged, collected, and identified every adult tree with dbh ≥10 cm. These data were combined with a regional community phylogenetic tree to calculate different phylogenetic diversity (PD) metrics in order to create spatial models. We used Loess regression to estimate the spatial variation of taxonomic and phylogenetic beta diversity as well as phylogenetic endemism and evolutionary distinctiveness. We found evidence for the definition of three floristic districts in the Ecuadorian Amazon, supported by both taxonomic and phylogenetic diversity data. Areas with high levels of phylogenetic endemism and evolutionary distinctiveness in Ecuadorian Amazon forests are unprotected. Furthermore, these areas are severely threatened by proposed plans for large-scale oil and mining extraction and should be prioritized in conservation planning for this region.

Located within South America's Piedmonte del Napo region, the Ecuadorian Amazon has been recognized as one of the most biodiverse areas in the world (Bass et al., 2010; Funk, Caminer, & Ron, 2012; Myers, Mittermeier, Mittermeier, Da Fonseca, & Kent, 2000) and is especially famous for possessing the highest levels of tree and shrub diversity across the Amazon basin (Pitman et al., 2001; ter Steege et al., 2013, 2016; Valencia et al., 2004). Floristic inventories in the Ecuadorian Amazon have also been influential in our understanding of the concept of hyperdominance and of patterns of relative abundance of species in the Amazon, as well as of floristic disruptions triggered by geology (Higgins et al., 2011; Pitman et al. 2008), suggesting that the assembly of the lowland Amazonian tree flora is the result of the interplay between edaphic specialization mediated by geological history and oligarchic tree communities. However, despite these efforts to determine both floristic and abundance patterns in the Ecuadorian Amazon tree flora (Macía & Svenning, 2005; Pitman, Jorgensen, Williams, Leon-Yanez, & Valencia, 2002; Pitman et al., 2001; Valencia et al., 2004), our understanding of the Ecuadorian Amazonian flora remains limited due to significant geographic gaps in floristic assessments across the region. To date, the most complete floristic assessment of the Ecuadorian Amazon used both herbarium data and a one-hectare plot network to delineate four floristic subregions (Guevara et al., 2016a). However, there has been no systematic attempt to define floristic regions using approaches that include both compositional and phylogenetic diversity, which is likely to provide additional insights to improve research-based conservation policies.
In his pioneering work, Faith (1992) defined phylogenetic diversity (PD) as the sum of the branch lengths of a phylogenetic tree along the minimum spanning path connecting the tips present in a location to the root. This measure has been the cornerstone of subsequent methods for identifying regions of high phylogenetic endemism and/or evolutionary distinctiveness (Forest et al., 2007; Mishler et al., 2014; Redding & Mooers, 2006; Rosauer, Laffan, Crisp, Donnellan, & Cook, 2009). Applied in a biogeographical conservation context, PD provides a way to detect regions that contain assemblages of species that share the same evolutionary history and helps us to elucidate the historical events that may have shaped these assemblages (Kraft, Baldwin, & Ackerly, 2010; Whittaker et al., 2005). Recent works have developed indexes such as Phylogenetic Endemism (WPE), defined as the sum of the branch lengths of the regional phylogenetic tree that are present in a particular region, each divided by the geographic range of the clade it subtends (Rosauer et al., 2009). Because phylogenetic endemism is analogous to weighted endemism, a relative measure of endemism, we can use this index to better understand floristic changes across regions and simultaneously define conservation priority areas more effectively than using taxonomy alone (Laffan, Lubarsky, & Rosauer, 2010; Li et al., 2015).
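As a minimal sketch of the two quantities just defined, the Python snippet below computes Faith's PD and a range-weighted phylogenetic endemism on a toy tree. The tree, the plot names, and the function names are hypothetical and chosen only for illustration; the geographic range of a clade is approximated here by the number of plots in which it occurs, which is an assumption of this sketch rather than the procedure used in the study.

```python
# Toy rooted tree (hypothetical): each node maps to (parent, length of the branch above it).
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("n2", 2.0),
    "n1": ("n2", 1.0),
    "D": ("root", 3.0),
    "n2": ("root", 1.0),
}

# Hypothetical occurrences: plot name -> set of species recorded there.
OCC = {"plot1": {"A", "B"}, "plot2": {"B", "C"}, "plot3": {"D"}}


def branches_to_root(tip):
    """Branches (named by their child node) on the path from a tip to the root."""
    path = set()
    node = tip
    while node in TREE:
        path.add(node)
        node = TREE[node][0]
    return path


def branch_set(species):
    """All branches needed to connect a set of species to the root."""
    used = set()
    for sp in species:
        used |= branches_to_root(sp)
    return used


def faith_pd(species):
    """Faith's PD: summed length of the branches connecting the species to the root."""
    return sum(TREE[b][1] for b in branch_set(species))


def phylo_endemism(site, occurrences):
    """Each branch present at the site contributes its length divided by the
    number of plots in which that branch (clade) occurs."""
    total = 0.0
    for branch in branch_set(occurrences[site]):
        clade_range = sum(1 for spp in occurrences.values() if branch in branch_set(spp))
        total += TREE[branch][1] / clade_range
    return total


for plot in sorted(OCC):
    print(plot, faith_pd(OCC[plot]), phylo_endemism(plot, OCC))
```

In this toy case the endemism values of the three plots add up to the PD of all species present, which reflects how the index partitions each branch among the plots that share it.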
Here, we present the results of an extensive one-hectare plot network that represents the most comprehensive spatial sampling of the trees of the Ecuadorian Amazon to date, in order to evaluate the patterns of floristic affinities in this hyperdiverse region and provide insights into conservation priorities from a phylogenetic context. In addition, we address the following questions: (i) What floristic classification of the Ecuadorian Amazon do our results support? (ii) To what extent are differences in species composition (taxonomic dissimilarity) across the region congruent with differences in phylogenetic composition (phylogenetic dissimilarity)? (iii) Are regions with high phylogenetic diversity (PD) also areas with extraordinary evolutionary distinctiveness or endemism? (iv) Are areas characterized by high PD currently under formal conservation protection?

| Study area
Our study was carried out in the lowland Ecuadorian Amazon. Toward the northern portion of Yasuní National Park, the interfluvial landscape is mostly dominated by rolling hills interrupted by terrain depressions or baixios that vary in extent and levels of drainage (Pitman 2000). This landscape is interrupted by the Napo River, which divides the northernmost portion of the Ecuadorian Amazon from the rest. High and low terraces of Pleistocene origin dominate the northern and southern riverbanks of the Aguarico River, whereas the northern riverbank of the Napo River mainly consists of palm-dominated swamps (Ministerio del Ambiente del Ecuador, 2013).
The Pastaza River represents a geomorphological break in the landscape of the Ecuadorian Amazon. South of this river, the landscape is characterized by extensive plains of terra firme forests interspersed with swamps that are sometimes but not always dominated by palms. This area is known as the Pastaza fan, a massive volcaniclastic alluvial fan deposited during the Holocene (Rasanen et al. 1987; Bernal et al. 2011). Finally, we sampled the lowland forests adjacent to the Cordillera del Condor, which is one of the areas of the Ecuadorian Amazon that remains most poorly explored in terms of floristic inventories. We sampled one plateau at 300-400 m on quartzitic sandstones (white sands) that represents the lowest altitude of the Cordillera del Condor in the Ecuadorian Amazon and also the first record of white-sand habitats for the lowland Amazon of Ecuador (Ministerio del Ambiente del Ecuador, 2013).

| Tree community data
We established a network of 62 one-hectare plots from 2000 to 2016 in the Ecuadorian Amazon including terra firme and white-sand forests (Figure 1). To perform phylogenetic and statistical analyses, we excluded unnamed morphospecies, which have been demonstrated to have weak effects on the detection of ecological patterns (Lennon, Koleff, Greenwood, & Gaston, 2001; Lennon, Koleff, Greenwood, & Gaston, 2004; Pos et al., 2014).

| Phylogenetic tree
We created a phylogenetic tree for 1,687 operational taxonomic units (OTUs) using as a backbone the tree R20120829 (Li et al., 2015) from Phylomatic (Webb & Donoghue, 2005), which is based on the Angiosperm Phylogeny Group's system (APGIII, 2009). In order to assign branch lengths, we used the BLADJ algorithm in Phylocom (Webb, Ackerly, & Kembel, 2008) based on inferred node ages (Wikström, Savolainen, & Chase, 2001). Although our regional phylogenetic tree is not fully resolved, recent studies have demonstrated that there is no significant difference between supertrees based on inferred node ages and trees built from DNA sequences when detecting patterns at community or regional scales (Swenson, 2009).

| Taxonomic and phylogenetic alpha diversity metrics
To estimate species diversity at each location/plot, we used Fisher's alpha index, which relates the number of species in a sample to the number of individuals therein based on the following formula: $S = \mathrm{FA} \, \ln(1 + N/\mathrm{FA})$, where S is the number of species, FA is the Fisher's value per assemblage, and N is the number of individuals per plot. We used the Fisher's alpha index (α) based on two basic assumptions: first, that tree species abundances usually follow a log series distribution and, second, that the regional species pool is spatially homogeneous. Based on previous evidence, we can argue that the first assumption is fulfilled (ter Steege et al., 2013), while the second assumption is still a matter of debate but could be a good approximation for the Ecuadorian Amazon forests (Pitman et al., 2002). In addition, Fisher's alpha is a scale-independent estimator that has good discriminatory power to detect richness under the assumption that the number of species tends to infinity (Schulte et al. 2005).
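Fisher's alpha has no closed-form solution, so it is usually obtained numerically from the log series relation S = α ln(1 + N/α). The following Python sketch solves that relation by bisection; the function name and the example counts are hypothetical and are not taken from the plot data.

```python
import math


def fishers_alpha(n_species, n_individuals, tol=1e-10):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection.

    Requires 0 < S < N, otherwise the equation has no finite positive solution.
    """
    if not 0 < n_species < n_individuals:
        raise ValueError("need 0 < S < N")

    def residual(alpha):
        # Increasing in alpha: negative below the solution, positive above it.
        return alpha * math.log(1.0 + n_individuals / alpha) - n_species

    lo, hi = 1e-9, 1.0
    while residual(hi) < 0:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical plot: 200 species among 600 stems.
ALPHA = fishers_alpha(200, 600)
print(round(ALPHA, 2))
```

Bisection is used here only because it is simple and robust; any root finder applied to the same relation should give the same value.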
In order to evaluate the standardized effect size of PD in each local community, we calculated the ses.mpd value for each plot using the independent swap algorithm as the null model (Gotelli, 2000), as implemented in the "picante" package in R. This metric measures the standardized effect size of the mean pairwise phylogenetic distance among the species co-occurring in a community. Positive values above 1.96 identify communities structured mainly by species that are more closely related than expected by chance (phylogenetic clustering), and negative values below −1.96 identify communities assembled from species that are more distantly related than expected by chance (overdispersion) (Webb 2000).
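The logic of a standardized effect size of mean pairwise distance under an independent swap null model can be sketched in Python as follows. This is an illustration under stated assumptions, not the picante implementation: it works on presence-absence data only, the distance matrix and plots are toy values, and the function names and the `attempts` and `runs` parameters are hypothetical. Note the sign: the raw effect size computed here, (observed − null mean) / null SD, is negative when co-occurring species are more closely related than the null expectation, whereas the net relatedness index of Webb (2000) multiplies it by −1, so the convention in use should be checked before applying the ±1.96 thresholds.

```python
import random
import statistics
from itertools import combinations


def mpd(row, dist):
    """Mean pairwise phylogenetic distance among species present in one plot
    (the plot must contain at least two species)."""
    present = [j for j, x in enumerate(row) if x]
    pairs = list(combinations(present, 2))
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def independent_swap(matrix, attempts, rng):
    """Randomize a 0/1 plot-by-species matrix while keeping row and column sums.

    Each attempt picks two plots and two species; if the 2 x 2 submatrix is a
    checkerboard (10/01 or 01/10), its entries are swapped.
    """
    m = [r[:] for r in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(attempts):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        if m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1] and m[r1][c1] != m[r1][c2]:
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m


def ses_mpd(matrix, dist, runs=199, attempts=500, seed=1):
    """Standardized effect size of MPD per plot: (observed - null mean) / null SD."""
    rng = random.Random(seed)
    observed = [mpd(r, dist) for r in matrix]
    null = [[] for _ in matrix]
    for _ in range(runs):
        shuffled = independent_swap(matrix, attempts, rng)
        for k, row in enumerate(shuffled):
            null[k].append(mpd(row, dist))
    out = []
    for obs, vals in zip(observed, null):
        sd = statistics.stdev(vals)
        out.append((obs - statistics.mean(vals)) / sd if sd > 0 else float("nan"))
    return out


# Toy data: species 0-2 form one clade, species 3-5 another (distances are made up).
DIST = [[0 if i == j else (2 if i // 3 == j // 3 else 10) for j in range(6)] for i in range(6)]
PLOTS = [
    [1, 1, 1, 0, 0, 0],  # all species from one clade
    [0, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 1, 0],  # species drawn from both clades
    [0, 1, 0, 1, 0, 1],
]
SES = ses_mpd(PLOTS, DIST)
print([round(z, 2) for z in SES])
```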

| Taxonomic and phylogenetic beta diversity
Investigating how phylogenetic relatedness among communities changes across environmental and spatial gradients allows us to make inferences about the different biogeographical histories of regional species pools with the strong analytical power of phylogenies (Graham & Fine, 2008). For instance, high levels of taxonomic beta diversity (TBD) can be congruent with high levels of phylogenetic beta diversity (PBD) if allopatric speciation by vicariance has promoted geographical separation of two areas for long periods of time, which in turn has led to long disparate evolutionary histories of communities in both areas. Conversely, high levels of TBD can be related to low PBD, indicating that recent events of speciation via parapatry or sympatry may be the drivers of community assembly. We must also consider that species abundances might be correlated with phylogeny if traits associated with habitat specialization allow species of one or a few clades to become abundant in a particular habitat or region. Abundance-weighted phylogenetic metrics are essential to understand whether PD is concentrated in a few dominant clades that represent a large proportion of regional floras and are therefore predictors of floristic breaks among regions.
TBD was calculated as the taxonomic dissimilarity between pairs of local communities (1-Sørensen index), whereas PBD was calculated with the PhyloSørensen index as a measure of the degree of phylogenetic relatedness between pairs of local communities. In order to be consistent with the metrics used to evaluate taxonomic beta diversity, we used the complement of the PhyloSørensen index to establish a phylogenetic dissimilarity metric (1-PhyloSørensen) (Bryant et al., 2008; Graham, Parra, Rahbek, & McGuire, 2009).
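The two dissimilarities can be illustrated with a small Python sketch. It assumes the usual forms of the indices: Sørensen similarity as twice the number of shared species over the summed richness of the two plots, and its phylogenetic analogue as twice the shared branch length over the summed branch length of the two plots. The toy tree, plots, and function names are hypothetical.

```python
# Toy rooted tree (hypothetical): node -> (parent, length of the branch above it).
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("n2", 2.0),
    "n1": ("n2", 1.0),
    "D": ("root", 3.0),
    "n2": ("root", 1.0),
}


def branch_set(species):
    """Branches (named by child node) connecting a set of species to the root."""
    used = set()
    for tip in species:
        node = tip
        while node in TREE:
            used.add(node)
            node = TREE[node][0]
    return used


def branch_length(branches):
    return sum(TREE[b][1] for b in branches)


def sorensen_dissimilarity(a, b):
    """1 - Sorensen similarity, based on shared species."""
    return 1.0 - 2.0 * len(a & b) / (len(a) + len(b))


def phylosor_dissimilarity(a, b):
    """1 - phylogenetic Sorensen similarity, based on shared branch length."""
    ba, bb = branch_set(a), branch_set(b)
    shared = branch_length(ba & bb)
    return 1.0 - 2.0 * shared / (branch_length(ba) + branch_length(bb))


PLOT1, PLOT2 = {"A", "B"}, {"B", "C"}
TBD = sorensen_dissimilarity(PLOT1, PLOT2)
PBD = phylosor_dissimilarity(PLOT1, PLOT2)
print(TBD, PBD)
```

In this toy pair the phylogenetic dissimilarity is lower than the taxonomic one because the species that differ between the plots share most of their branches, which is the kind of contrast the comparison of TBD and PBD is meant to detect.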
In order to test whether TBD is a good predictor of PBD, we compared the observed and expected values of PBD. In order to do this, we calculated the expected values of PBD based on a null model that makes random draws from the regional species pool (here defined as the total number of species in our plot network).
This null model randomizes the community data matrix with the independent swap algorithm developed by Gotelli (2000). Where $n_i$ is the number of lineages originating at node k of v nodes in the set s(T, k, r). This is the number of nodes between node k and the root r in the tree T, whereas $\hat{n}_i$ is the expected abundance of species i.

| Spatial models with taxonomic and phylogenetic diversity metrics
Several software packages for the spatial analysis of biodiversity have been developed in the past 10 years (e.g., Biodiverse, GDM) (Ferrier, Manion, Elith, & Richardson, 2007; Laffan et al., 2010), radically changing and improving our understanding of the spatial distribution of both taxonomic and phylogenetic diversity. The great majority of these analyses use a moving window approach that predefines a window around a group (e.g., site collection, plots) in a dataset and then calculates appropriate statistics for each group based on the neighbors that fall within that window (Laffan et al., 2010). However, as a caveat, one must consider that when spatial coverage within a region is incomplete, this approach cannot predict values of taxonomic and phylogenetic turnover across space. Therefore, we used a different approach to predict the spatial variation of both taxonomic and phylogenetic beta diversity and of abundance-based metrics for taxonomic and phylogenetic diversity.
In order to perform this analysis, we divided the Ecuadorian Amazon into 0.5 degree grid cells (55 × 55 km), a spatial scale that offers a balance between accuracy and detail when performing the spatial analysis (Kreft & Jetz, 2010; Keil et al. 2012). It has been demonstrated that grain size affects beta diversity estimates and that increasing grain size should produce lower beta diversity in areas of high species richness (Lennon et al., 2001; Keil et al. 2012). This is mainly because there is an intrinsic relationship between the species-area relationship (SAR) and species turnover. In other words, by increasing the grain size, there is less room for variation in species composition because more of the regional species pool is being accounted for (Lennon et al., 2001). On the other hand, by reducing the grain size, we would increase the number of grid cells containing plots in contrasting habitats (terra firme vs. white sands), therefore overestimating the predicted values of both taxonomic and phylogenetic beta diversity (Kreft & Jetz, 2010; Keil et al. 2012). In order to avoid these biases, and because our data are not presence-absence records for each grid cell, we calculated the mean values of both PBD and TBD for each plot with respect to every other plot in the network. We then used these average values to perform interpolation across the region. A Loess spatial regression model was used to predict both taxonomic and phylogenetic turnover. To obtain the most accurate fit, we used default parameters for our Loess regression: a span of 0.75 was used to find the best smoothing average, and a degree 2 polynomial was set to reduce variance. The Loess method sets the size of the neighborhood around location x with the parameter α (the span). All the analyses were performed with the packages picante and vegan (Oksanen et al., 2015) and with custom functions on the R platform.
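To make the role of the span and the polynomial degree concrete, here is a minimal one-dimensional Loess sketch in Python. It is not the R implementation and differs from the study in several assumed ways: the predictor is a single coordinate rather than a spatial location, the neighborhood size is taken as roughly span × n points, tricube weights are used, and there are no robustness iterations. All names and the example values are hypothetical.

```python
import math


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def loess_predict(x0, xs, ys, span=0.75, degree=2):
    """Predict y at x0 from a weighted local polynomial fitted to the nearest points."""
    n = len(xs)
    # Neighborhood size: about span * n points, but enough to fit the polynomial.
    q = min(n, max(degree + 2, math.ceil(span * n)))
    h = sorted(abs(x - x0) for x in xs)[q - 1]  # distance to the q-th nearest point
    if h == 0:
        raise ValueError("all neighbours coincide with x0")
    # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
    weights = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    p = degree + 1
    # Weighted normal equations for a polynomial in (x - x0).
    a = [[sum(w * (x - x0) ** (i + j) for w, x in zip(weights, xs)) for j in range(p)]
         for i in range(p)]
    b = [sum(w * y * (x - x0) ** i for w, x, y in zip(weights, xs, ys)) for i in range(p)]
    return solve(a, b)[0]  # the intercept is the fitted value at x0


# Hypothetical example: mean dissimilarity per plot along one coordinate.
XS = [float(i) for i in range(10)]
YS = [0.55, 0.58, 0.60, 0.66, 0.70, 0.71, 0.75, 0.74, 0.80, 0.83]
print(round(loess_predict(4.5, XS, YS), 3))
```

A larger span averages over more plots and gives a smoother surface, while the quadratic local fit lets the prediction follow curvature that a local mean would flatten.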

| Alpha diversity patterns
The highest Fisher's alpha values were found in a cluster of plots at the intersection of a latitudinal band between .5 and .8 degrees and a longitudinal band between 76 and 76.5 degrees (Fig. S1). This peak of taxonomic diversity is congruent with peaks of phylogenetic alpha diversity across the region (Fig. S1, Table 1).

| Floristic affinities in the Ecuadorian Amazon
These regions correspond to the forests located in the interfluvial areas of the Aguarico-Putumayo basin, the interfluvial areas between the Napo and Pastaza rivers, and the Cordillera del Condor lowlands (Figures 1 and 2A).

MRPP analysis based on PhyloSørensen values supports the delimitation of three floristically distinct units, as shown by the delta values (Table 1). Thus, there is a highly significant difference between groups of sites according to the biogeographical subdivision, supporting the delimitation of three floristic subregions in the Ecuadorian Amazon (Table 1).
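The delta statistic of a multiresponse permutation procedure can be sketched in Python as below. The sketch assumes that delta is the mean within-group dissimilarity weighted by group size (n_g / N) and that significance is the share of label permutations giving a delta at least as small as the observed one; the dissimilarities, group labels, and function names are hypothetical.

```python
import random
from itertools import combinations


def mrpp_delta(dist, groups):
    """Weighted mean of within-group mean dissimilarities (weights n_g / N).

    Every group must contain at least two sites.
    """
    n = len(groups)
    delta = 0.0
    for g in sorted(set(groups)):
        members = [i for i, label in enumerate(groups) if label == g]
        pairs = list(combinations(members, 2))
        mean_d = sum(dist[i][j] for i, j in pairs) / len(pairs)
        delta += len(members) / n * mean_d
    return delta


def mrpp(dist, groups, permutations=999, seed=1):
    """Observed delta and a permutation p-value from shuffled group labels."""
    rng = random.Random(seed)
    observed = mrpp_delta(dist, groups)
    labels = list(groups)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(labels)
        if mrpp_delta(dist, labels) <= observed + 1e-12:
            hits += 1
    return observed, hits / (permutations + 1)


# Toy data: two hypothetical districts with low dissimilarity within each
# district and high dissimilarity between them.
SITES = ["north"] * 4 + ["south"] * 4
DIST = [[0.0 if i == j else (0.2 if SITES[i] == SITES[j] else 0.8) for j in range(8)]
        for i in range(8)]
DELTA, P_VALUE = mrpp(DIST, SITES)
print(round(DELTA, 3), P_VALUE)
```

A small observed delta relative to the permuted values indicates that sites within a group are more similar to each other than expected if the grouping were arbitrary.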

| Beta diversity patterns
The spatial distribution of taxonomic and phylogenetic beta diversity was very similar. We found a tight correlation between TBD and PBD (r = .9043, p ≤ .001), which indicates that phylogenetic dissimilarity can be predicted by taxonomy (Table 1, Fig. S2). Nevertheless, we found a weaker correlation between taxonomy and phylogeny when the standardized ses.mpd index was included in the analysis (r = .3016, p = .002). When comparing the observed values of phylogenetic turnover against the expected values based on our null model, we found lower observed phylogenetic turnover than expected (Fig. S2).

| Evolutionary distinctiveness and phylogenetic endemism
High WPE values were concentrated in areas such as the Cordillera del Condor. When AED is considered, low AED values were concentrated in areas that correspond to the Cordillera del Condor region (Figure 3F).
We also found significant differences in the spatial distribution of the Imbalance of Abundances at Clade level (IAC) (Figure 3E). This is confirmed by the spatial distribution of abundances across the Ecuadorian Amazon, as there is a disproportionate dominance of clades such as Arecaceae, Moraceae, Fabaceae, or Myristicaceae in areas of the Napo-Pastaza basin. We also found higher than predicted IAC values in regions that correspond to the lowlands of the Cordillera del Condor and some areas of the Pastaza fan (Figure 3E).

| Floristic patterns
Our results improve a previous classification of the floristic relationships in the Ecuadorian Amazon (Guevara et al. 2016), which delimited four floristic regions. We argue that the previous regionalization relied on arbitrary boundaries to delimit distinct floristic units without any statistical support. The main difference is the strong floristic affinity between the previously separated Pastaza basin and Napo-Curaray basin. While our ordination does show some degree of overlap between the Napo-Pastaza and the Aguarico-Putumayo basins, we argue that the differences between the mean dissimilarity for each group centroid are enough to consider them as different floristic units. This is confirmed by the results of the multiresponse permutation procedure (Table 1). Because this method allows us to deal with increasing community heterogeneity and can also help to correct the resulting loss of sensitivity, we argue that our results properly address the inherent high variation in species composition between sample units (plots).
Some groups such as Inga, Ocotea, Pouteria, Virola, Eugenia, and Calyptranthes are species-rich genera that exhibit peaks of diversity in Yasuní National Park. The spatial distribution of phylogenetic beta

| Can PBD be predicted by TBD?
Our results highlight the benefits of using complementary phylogenetic methods to detect strong turnover in floristic composition, as well as their importance for conservation purposes. We found that the observed levels of lineage turnover (PBD) are significantly lower than expected. A similar pattern has been found in two regional analyses of North American angiosperms and white-sand forests across the Amazon basin (Guevara et al., 2016a,b; Qian, Swenson, & Zhang, 2013). Lower PBD than TBD may be the result of the spatial turnover of species that are nested in similar clades, which in turn leads to floras mainly composed of the same phylogenetic components. Our results support the hypothesis that PBD can be predicted by TBD, and lower PBD than expected based on null TBD may be suggestive of recent divergence across strong environmental gradients or biogeographic boundaries promoting speciation for subsets of the regional species pool (Graham et al., 2009). Moreover, the predicted spatial distribution of PBD not only represents spatial variability in lineage composition but should also represent variability in the set of traits for subsets of the regional species pool. This suggests a potential scenario in which parapatric speciation might be a general process shaping Amazon forest composition. Nonetheless, current evidence suggests that allopatric speciation after dispersal might be a major evolutionary driver of speciation in Amazon tree lineages (Dexter et al., 2017). Therefore, it will be important to carry out subsequent research at the clade level to elucidate whether these can be considered general mechanisms for the formation of species pools in Amazonian forests.

| Are regions with high levels of PD areas with high levels of evolutionary distinctiveness and endemism?
The spatial distribution of WPE and PBD indicates that communities located in the Cordillera del Condor lowlands may be characterized by high levels of WPE and PBD, meaning that there is a high replacement of lineages with small geographic ranges compared with communities in the other floristic districts of the Ecuadorian Amazon (Figure 3B,D). High levels of WPE can be explained by the presence of white-sand specialist taxa recently diverged from adjacent terra firme sister clades (Fine et al. 2013; Misiewicz & Fine, 2014).
Low levels of AED are also congruent with this scenario because individuals corresponding to species and clades sharing low evolutionary distinctiveness may be dominant in this region (Figure 3F). (Fine, Garcia Villacorta, Pitman, Mesones, & Kembel, 2010). Some potential mechanisms appear to be responsible for the pattern we found: gradients of soils might trigger parapatric speciation if divergent selection promotes adaptations to different extremes of a soil gradient (Fine et al., 2013). This process could occur more rapidly than in allopatric populations if the differences in soils are extreme enough to inhibit gene flow across soil boundaries (Coyne and Orr 2004).

| Implications for conservation
The inclusion of an evolutionary approach in any analysis of beta diversity can contribute significantly to scientific research-based conservation policies. Because species-centric conservation research considers only a snapshot of the fractal nature of the tree of life, without phylogenetic data we miss all the information that genealogical relationships between organisms can give us. Currently, many conservation priority-setting exercises tend to be solely focused on species-level data and have proved to be poor predictors of both species richness and threatened species identification (Orme et al., 2005). We found that despite a high correlation between species richness and PD, the predicted spatial distribution that incorporates phylogenetic information shows critical new de- (Sandel et al., 2011). Because changes in climate have been correlated with high extinction risk in several taxonomic groups, we argue that this phenomenon could lead to high extinction levels in the southernmost part of the Ecuadorian Amazon.
Most of the evolutionary lineages contained in the regional spe- Recently, Lessmann, Munoz, and Bonaccorso (2014) assigned a low-to-medium conservation priority to areas that correspond to the southern floristic districts we described here as regions containing both unique and geographically restricted evolutionary information. The approach used by these authors to define conservation priority areas included richness maps based on species distribution models and maps of environmental vulnerability. However, we think that our results represent significant improvements upon these models. Here, we have demonstrated that areas with low AED values could be assigned mid-to-high levels of priority in a conservation context if the same areas exhibit high values of WPE and PBD. Moreover, we have shown that areas characterized by the dominance of recently diverged lineages with restricted ranges correspond to floristically unique units located toward the south of the Ecuadorian Amazon. Our finding that the turnover in species composition in areas with high endemism is due to species with low phylogenetic distinctiveness suggests that recent speciation has led to high beta diversity. This is consistent with a model in which speciation processes are highly dynamic and correspond to the evolution of habitat diversity and/or climate changes during the Pleistocene, over the last 2 million years. We argue that conservation of these areas is particularly critical in order to maximize the preservation of the evolutionary processes that underlie the origin of Ecuador's extraordinarily high tree diversity. This highlights the necessity of developing new conservation plans for this region that take into account the current and potential pervasive negative effects of mining, dam construction, and oil extraction.