Mapping species richness and evolutionary regions of the genus Myrcia

Myrcia is a plant genus exclusive to the Neotropical region and one of the most important components of tree diversity in the Cerrado and Atlantic Forest domains. However, no mapping of its taxonomic diversity for this region exists. Our aim was to describe the spatial patterns of diversity and the biogeographical history of Myrcia in the Neotropical region based on a phylogenetic regionalization.


| INTRODUCTION
The species of the family Myrtaceae occurring in the Neotropics belong, with few exceptions, to the monophyletic tribe Myrteae (Vasconcelos et al., 2017), which originated in eastern Gondwana around 40 Ma, in present-day Australia, New Zealand, and New Caledonia (Vasconcelos et al., 2017). The most likely dispersal route for Myrteae to South America was through Antarctica, which had a cool climate until 30 Ma, allowing the existence of temperate forests (Ortiz-Jaureguizar & Cladera, 2006).
Myrcia is the second largest genus in Myrteae, with ca. 770 species (The World Checklist of Vascular Plants, WCVP, 2022), and occurs exclusively in the Neotropical region (Lucas et al., 2018). Myrcia is a monophyletic genus that currently includes the former genera Calyptranthes, Gomidesia, Marlierea and their synonyms (Lucas et al., 2011, 2018; Santos et al., 2017). Myrcia originated around 28 Ma in South America (Vasconcelos et al., 2017), probably in the Atlantic Forest, where the main diversification is likely to have occurred (Amorim et al., 2019; Santos et al., 2017). However, some sections (clades) within Myrcia diversified substantially in other regions, such as Aguava in the Cerrado (Lima et al., 2021), Myrcia, Calyptranthes and Aulomyrcia in the Amazon (dos Santos et al., 2021; Santos et al., 2017), Calyptranthes in the Caribbean region and Aulomyrcia in the Guiana region (Santos et al., 2017).
Myrcia is one of the most important genera of Myrtaceae in the Atlantic Forest and Cerrado domains (Duarte et al., 2014; Oliveira-Filho & Fontes, 2000), having the most tree species in the Cerrado and being the second most species-rich genus in the low-altitude rainforests and high-elevation semi-deciduous forests of the Atlantic Forest (Oliveira-Filho & Fontes, 2000). These two domains contain the highest species diversity of Myrcia, although the genus also shows high species richness in the Amazon and Caribbean regions (Lucas et al., 2011, 2018). Moreover, Myrcia has high ecological importance, as its flowers provide pollen for insects, mostly bees, and its fleshy fruits provide food resources for birds, primates and other mammals (Gressler et al., 2006; Staggemeier et al., 2017).
Despite the ecological importance of the genus, its high richness and its evolutionary history exclusive to the Neotropical region, no mapping of Myrcia diversity in the Neotropical region is currently available. In the Atlantic Forest, Murray-Smith et al. (2009) used Myrcia to map areas of endemism and as a surrogate for angiosperm diversity. However, as that study focussed on the Atlantic Forest only, it did not provide a full overview of Myrcia's diversity across its entire range, notably in the Cerrado, where Myrcia is the most diverse tree genus (Oliveira-Filho & Fontes, 2000). This scarcity of studies is due, in part, to the low availability of species range maps for plants (but see BIEN; Maitner et al., 2018). Currently, the most reliable information about the distribution of plant species is given only on a coarse scale, for example, across large regions, as is the case in Plants of The World Online (POWO, from Kew Gardens), or at the level of Brazilian states, as is the case in Flora do Brasil (http://floradobrasil.jbrj.gov.br/). However, this coarse-level information does not allow an accurate description of spatial patterns of diversity or an understanding of their drivers. On the other hand, a large number of georeferenced species occurrence records are available in open databases (such as the Global Biodiversity Information Facility; GBIF). While these occurrence databases provide essential information to describe diversity patterns, they need to be used with caution, including efforts to ensure the reliability and accuracy of the location and identification of the specimens (Chapman, 2005; Rodrigues et al., 2022).
In addition to mapping species richness patterns, it is important to understand where the main areas of diversification within a genus are. Usually, historical biogeographic studies use predefined biogeographic regions (e.g. ecoregions; Olson et al., 2001) to reconstruct the ancestral states of a particular clade and to understand patterns of speciation and lineage dispersal. Two key studies used that approach to understand the biogeographic history of Myrcia (Amorim et al., 2019; Santos et al., 2017). However, biogeographic regions are usually defined based on the distributions of different taxa (mammals, reptiles, birds, plants) (Morrone, 2018), which may not accurately represent the regional boundaries for a particular group. Therefore, using a biogeographic regionalization specific to a taxon might provide a better understanding of its geographical and historical distribution. One such approach is phylogenetic regionalization, which aims at finding the turnover boundaries in phylogenetic composition (i.e. beta diversity) to define a region (Daru et al., 2020; Maestri & Duarte, 2020). Thus, by using such an approach, we are able to find the evolutionary regions for a particular group (e.g. Carta et al., 2022) and also to reconstruct the ancestral states of that group based on the specific regionalization (Maestri & Duarte, 2020).
Here, we estimated the distribution of 307 Myrcia species based on occurrence data, using species distribution modelling or buffering depending on the number of reliable occurrence records for each species. We used the stacked species distributions to provide the first map of species richness and to reveal the spatial patterns of the taxonomic diversity of the genus (and its clades) across the Neotropical region. Moreover, we propose a biogeographic regionalization for the genus and conduct an ancestral area reconstruction using that regionalization to understand the biogeographical history of Myrcia in the Neotropics.

| Occurrence data set
We used occurrence data from Rodrigues et al. (2022), which provides curated occurrence records for the species of tribe Myrteae (Myrtaceae) in Central and South America. We used this Myrteae data set (52,230 occurrences from 1125 species) to calculate sampling bias within the study region (described in detail below).
Further, from this data set, we selected only the Myrcia species to conduct the species distribution modelling, which comprised 20,786 occurrences from 362 species. We applied an environmental filter to the Myrcia occurrence data set, which removes occurrences that are close to one another in environmental space. We performed environmental filtering because it enhances model performance in Species Distribution Models (SDMs) when compared to spatial thinning (Castellanos et al., 2019; Varela et al., 2014). We combined this environmental filtering with the classification of the reliability of species identification to select the occurrence records with the most reliable species identification in each grid cell of the environmental space. This step was implemented with the 'env_grid_filter' function of the naturaList package (Rodrigues et al., 2022), based on the function provided previously by Varela et al. (2014). Thus, we cleaned our data set by retaining only the most reliably identified records in each grid cell of the environmental space. To define the size of the grid in the environmental space, we used 10 equal-size bins for each environmental variable (described below) across the entire environmental space occupied by the genus Myrcia. After the environmental filtering, the data set contained 7429 occurrences from 362 species (i.e. 35.7% of the occurrences available before the environmental filtering).
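The filtering step described above can be illustrated with a minimal Python sketch. It is not the naturaList implementation: the record structure, the numeric reliability rank (lower means more reliable) and the choice to keep one record per species and environmental cell are assumptions made only for illustration.

```python
from collections import namedtuple

# Hypothetical record: species name, tuple of environmental values, and a
# reliability rank of the identification (assumption: lower = more reliable).
Record = namedtuple("Record", "species env reliability")


def bin_index(value, lo, hi, n_bins=10):
    """Index of the equal-width bin holding `value` within [lo, hi]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls in the last bin


def env_grid_filter(records, n_bins=10):
    """Keep, per species and environmental grid cell, the most reliable record."""
    n_vars = len(records[0].env)
    # Bin limits span the environmental space occupied by all records.
    lows = [min(r.env[i] for r in records) for i in range(n_vars)]
    highs = [max(r.env[i] for r in records) for i in range(n_vars)]
    best = {}
    for r in records:
        cell = tuple(
            bin_index(r.env[i], lows[i], highs[i], n_bins) for i in range(n_vars)
        )
        key = (r.species, cell)
        if key not in best or r.reliability < best[key].reliability:
            best[key] = r
    return list(best.values())
```

With six variables and 10 bins each, the environmental space has up to 10^6 cells, so the filter mostly removes records that are near-duplicates in environmental terms.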

| Environmental data
To define the environmental space, we used six environmental variables, comprising four climatic and two soil variables. These environmental variables were used both for filtering the occurrence data (as described above) and for fitting the species distribution models. As climatic variables, we used CHELSA (Karger et al., 2017) bioclimatic layers at 10 arc-minute resolution downloaded from Paleoclim (Brown et al., 2018). We selected climatic variables that represent both annual and seasonal variability. Annual and seasonal climatic variables are important predictors of vegetation types and species composition turnover, since limiting climatic conditions, such as freezing and drought, can act as barriers to plant colonization (Bergamin et al., 2021; Duarte et al., 2014; Neves et al., 2015; Oliveira-Filho et al., 2015; Rezende et al., 2018, 2021). In addition, we included soil variables because they enhance the predictive performance of species distribution models for plant species (Zuquim et al., 2020), improve the prediction of biomes in Brazil (Arruda et al., 2017), and are an important factor influencing plant assemblages (Pinho et al., 2018).
We selected Annual Mean Temperature (bio1), Temperature Seasonality (bio4), Precipitation of the Wettest Quarter (bio16), and Precipitation of the Driest Quarter (bio17). Additionally, we selected two soil variables: the Cation Exchange Capacity (CEC) of clay in the topsoil, which represents the nutrient availability in the soil, and the percentage of clay in the topsoil, which represents the physical conditions for plant establishment. Soil data were downloaded from the Harmonized World Soil Database version 1.2 (Wieder et al., 2014).

| Species distribution modelling
We determined the distribution of Myrcia species using different methods, depending on the number of occurrences (n) in our filtered data set. For species with n ≤ 5, we built a buffered polygon of 50 km radius around each occurrence point. For species with more than 5 occurrences, we fitted species distribution models with the Maxent algorithm version 3.4.3 (Phillips et al., 2006), using model tuning (Anderson & Gonzalez, 2011; Kass et al., 2021; Merow et al., 2013), cloglog output (Elith et al., 2011), two cross-validation approaches, depending on the number of occurrences of the focal species (Kass et al., 2021; Pearson et al., 2007), and background point sampling that accounts for sampling bias (Elith et al., 2011; Merow et al., 2013; Phillips et al., 2009).
The Maxent algorithm uses background points to inform the model about the environmental space available to the species. Therefore, changes in the distribution of the background points in Maxent are equivalent to changes in the prior of the species prediction (Elith et al., 2011; Merow et al., 2013). In other words, sampling background points using the same bias as the occurrence records is a way of controlling the effects of sampling bias in the modelling (Elith et al., 2011; Phillips et al., 2009). To sample background points, we used the larger Myrteae data set, which represents the overall sampling effort since it contains the occurrence data for the species of the Myrtaceae family in South and Central America (Rodrigues et al., 2022). We used the sampbias package version 1.0.5 (Zizka et al., 2021) to generate a raster of bias in the sampling effort for the Myrteae data set. Then, we used the 'randomPoints' function from the dismo package version 1.3-14 (Hijmans et al., 2022) to sample 20,000 background points, using the bias raster as the sampling probability. When modelling each species' distribution, we used the background points within the modelling area of that species, which is defined by a buffer around its occurrence points.
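As a minimal sketch of bias-weighted background sampling, the Python function below draws grid cells without replacement with probability proportional to a bias weight. It uses the Efraimidis-Spirakis weighted sampling scheme, which is a stand-in chosen for illustration and is not claimed to match the internals of dismo's 'randomPoints'; the cell identifiers and weights are hypothetical.

```python
import random


def sample_background(bias, n, seed=None):
    """Draw n distinct cells with probability proportional to their bias weight.

    bias: dict mapping a cell identifier to a non-negative sampling weight
          (e.g. the estimated sampling effort in that cell).
    """
    rng = random.Random(seed)
    candidates = [(cell, w) for cell, w in bias.items() if w > 0]
    if n > len(candidates):
        raise ValueError("n exceeds the number of cells with positive weight")
    # Efraimidis-Spirakis: give each cell the key u ** (1 / w), with u uniform
    # on [0, 1); the n largest keys form a weighted sample without replacement.
    keyed = [(rng.random() ** (1.0 / w), cell) for cell, w in candidates]
    keyed.sort(key=lambda kc: kc[0], reverse=True)
    return [cell for _, cell in keyed[:n]]
```

Cells with zero estimated sampling effort are never drawn, so the background mirrors where collectors have actually been, which is the intent of the bias correction.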
We used two cross-validation approaches, depending on the number of occurrences of the focal species. For species with 6 to 24 occurrences, we used leave-one-out cross-validation (LOO; Pearson et al., 2007), and for species with 25 or more occurrences, we used spatial block cross-validation with four bins (k = 4). All model fitting and evaluation were conducted using the ENMeval package version 2.0.4 (Kass et al., 2021). Since the extent of the region used to sample background points influences the modelling results (VanDerWal et al., 2009), we applied a buffer around the presence points to delimit the region used for modelling. To define the size of the buffer, we conducted a modelling exercise with a subset of the species (see Supplementary Material for details). The buffer size was set to 700 km for the LOO approach and to 500 km for the spatial block approach. Then, for both approaches, we used this buffered region to select the background points and fit tuned Maxent models.
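The decision rules given in this section can be summarized in a short Python sketch. The function name and dictionary keys are hypothetical; the thresholds and buffer sizes are those stated in the text.

```python
def modelling_strategy(n_occurrences):
    """Return the range-estimation approach for a species with n filtered records."""
    if n_occurrences < 1:
        raise ValueError("a species needs at least one occurrence record")
    if n_occurrences <= 5:
        # Too few records to fit a model: 50 km buffer around each point.
        return {"method": "buffer", "range_buffer_km": 50}
    if n_occurrences <= 24:
        # Maxent with leave-one-out cross-validation.
        return {
            "method": "maxent",
            "cv": "leave-one-out",
            "background_buffer_km": 700,
        }
    # Maxent with spatial block cross-validation (k = 4).
    return {
        "method": "maxent",
        "cv": "spatial-block",
        "k": 4,
        "background_buffer_km": 500,
    }
```

Note that the two kinds of buffer play different roles: the 50 km buffer is itself the estimated range, whereas the 700 km and 500 km buffers only delimit the region from which background points are taken.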
A tuning procedure was used to find the best model prediction and minimize overfitting. The Maxent algorithm has two hyperparameters that can be tuned: regularization multipliers and feature classes. Regularization is a hyperparameter intended to avoid (or minimize) overfitting (Elith et al., 2011; Merow et al., 2013), while feature classes indicate how a variable is used in the model.
To select the best model for each species, we used a set of evaluation metrics. For the models using the spatial block approach, we first selected the models with delta AICc < 2; then we kept the models with the highest average continuous Boyce Index (CBI), the lowest average omission rate at the 10th percentile of the predicted values for presences (OR10), and the fewest model parameters. The Boyce index is a measure of predictive performance designed for presence-only models (Boyce et al., 2002; Sillero et al., 2021), as is the case when using Maxent, while the omission rate is a measure of model overfitting (Radosavljevic & Anderson, 2014). Therefore, these evaluation metrics were used to select the models with the best predictive performance and the lowest level of overfitting. After selecting the best model (from 60 models) for each species, we retained only the species whose best model had a CBI ≥ 0.5.
For the species with few occurrence records, since it is not possible to measure the CBI using LOO cross-validation, we used AUC as the performance metric. Hence, we first selected the models with delta AICc < 2 as above, and then the models with the lowest average OR10, the highest average AUC, and the fewest model parameters. Further, we defined a threshold for good models, retaining only the models with AUC > 0.7 and OR10 < 0.25.
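The two selection procedures can be sketched in Python as follows. The sketch assumes that the criteria are applied sequentially as tie-breakers in the order listed in the text, which is one possible reading; the dictionary keys are hypothetical and the metric values would come from the model evaluation output.

```python
def select_model(models, approach):
    """Pick the best candidate model, or None if it fails the quality threshold.

    models: list of dicts with keys 'aicc', 'or10', 'n_params' and either
            'cbi' (spatial block approach) or 'auc' (leave-one-out approach).
    approach: 'block' or 'loo'.
    """
    # Step 1: keep models within 2 AICc units of the best-supported model.
    best_aicc = min(m["aicc"] for m in models)
    candidates = [m for m in models if m["aicc"] - best_aicc < 2]

    # Step 2: rank the remaining models by the criteria for each approach.
    if approach == "block":
        def rank(m):
            return (-m["cbi"], m["or10"], m["n_params"])
    elif approach == "loo":
        def rank(m):
            return (m["or10"], -m["auc"], m["n_params"])
    else:
        raise ValueError("approach must be 'block' or 'loo'")
    best = min(candidates, key=rank)

    # Step 3: discard species whose best model is still not good enough.
    if approach == "block":
        acceptable = best["cbi"] >= 0.5
    else:
        acceptable = best["auc"] > 0.7 and best["or10"] < 0.25
    return best if acceptable else None
```

Returning None for a species corresponds to excluding it from the stacked distributions because none of its candidate models reached the stated thresholds.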
Finally, we used a site-specific threshold, the pS-SDM + PRR method (Scherrer et al., 2018), to produce a binary (presence/absence) prediction of species distribution. This method uses the species richness prediction in combination with a probability ranking rule to define the presence of a species in each grid cell. For this, we summed the presence probabilities (SPP) derived from each species distribution model to generate the species richness prediction for each grid cell. Then, the species with the highest presence probabilities in a grid cell were considered present, in decreasing order of probability, until the predicted species richness (SPP) for the grid cell was reached (Scherrer et al., 2018).
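For a single grid cell, the binarization described above can be sketched in Python as follows. Rounding the summed probabilities to the nearest integer is an assumption of this sketch, and the function and argument names are hypothetical.

```python
def prr_binarize(cell_probs):
    """Binarize the predictions of one grid cell with pS-SDM + PRR.

    cell_probs: dict mapping species name to predicted presence probability.
    Returns a dict mapping species name to 1 (present) or 0 (absent).
    """
    # pS-SDM: predicted richness is the sum of the presence probabilities
    # (assumption: rounded to the nearest integer).
    richness = round(sum(cell_probs.values()))
    # PRR: rank species by decreasing probability and accept the top ones
    # until the predicted richness is reached.
    ranked = sorted(cell_probs, key=lambda sp: cell_probs[sp], reverse=True)
    present = set(ranked[:richness])
    return {sp: int(sp in present) for sp in cell_probs}
```

Because the number of accepted species depends on the summed probabilities of the cell, the effective probability threshold differs from cell to cell, which is why the threshold is called site-specific.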
These steps were carried out using the rasters of species distribution predictions at 10 arc-minute resolution. After the binarization, the raster for each species was rescaled to 0.5-degree resolution, which was used in further analyses.

| Mapping species diversity
To produce maps of species richness for the genus Myrcia and its sections in the Neotropics, we counted the number of species occurring in each spatial grid cell of 0.5 degrees (approximately 55 km on a side at the equator) based on the presence/absence distribution maps for 307 Myrcia species. We also produced species richness maps for the Myrcia sections as described by Lucas et al. (2018), following the phylogeny provided by Vasconcelos et al. (2020).
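Stacking binary distributions into a richness map amounts to counting, for each grid cell, the species predicted present there. A minimal Python sketch, with hypothetical species and cell identifiers, is:

```python
from collections import Counter


def richness_map(presence, subset=None):
    """Count the species present in each grid cell.

    presence: dict mapping species name to an iterable of the grid cells
              in which the species is predicted present.
    subset:   optional set of species names (e.g. one section of the genus);
              when given, only those species are counted.
    """
    counts = Counter()
    for species, cells in presence.items():
        if subset is not None and species not in subset:
            continue
        counts.update(set(cells))  # each species counts once per cell
    return counts
```

Passing the species of one section as `subset` gives the per-section maps, while omitting it gives the map for the whole genus.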

| Phylogenetic tree
We used the consensus phylogenetic tree constructed by Amorim  Vasconcelos et al. (2020).We updated the species names to match the taxonomy in the occurrence data and extracted the tips with matches between the phylogeny and our species data.

| Phylogenetic regionalization
We built a biogeographical regionalization for Myrcia to identify the evolutionary regions of the group and investigate the biogeographical history of the genus.For this, we used the phylogenetic regionalization method evoregion (Maestri & Duarte, 2020).The evoregion method aims at identifying the boundaries of phylogenetic turnover across biogeographic regions.We performed the regionalization using the 'calc_evoregion' function of the Herodotools package version 1.0.0 (Nakamura et al., 2023).We also used the 'calc_af-filiation_evoreg' function of the same package to identify transition zones, which represent areas where grid cells have low belonging to the evoregion they were assigned by the regionalization procedure (Maestri & Duarte, 2020).After defining the evoregions, we set the centroid of each evoregion in the phylogenetic space and calculated the Euclidean distance between them.Then we performed a hierarchical clustering (Ward method) and built a dendrogram to visualize the relationships (i.e.similarities) between evoregions regarding their phylogenetic composition.For the phylogenetic analyses, we used only the species for which we had an estimated distribution and that were available in the phylogeny: from the 307 species with distribution maps, only 96 were available in the phylogeny.

| Historical biogeography
Finally, we used the regionalization to reconstruct Myrcia's biogeographical history, using the BioGeoBEARS package version 1.1.3(Matzke, 2013) to estimate the ancestral area of the internal nodes of the phylogenetic tree.For this, we had to define the species range, that is, which evoregions each species currently occupies.
To identify species' core range, we considered that a species occupies an evoregion if at least 45% of the species distribution is within that evoregion.Therefore, the range of a species could span a maximum of 2 evoregions.For those species that did not occupy any region with at least 45% of its distribution, we applied a lower threshold of 25%, thus allowing the range of these species to span a maximum 4 regions.We modelled the ancestral areas using the six biogeographical models available in the BioGeoBEARS (DEC, DIVA-like and BAYAREA-like, with and without a jump dispersal parameter) and selected the best model based on the lowest AICc value.
To improve our understanding of the ecological factors involved in the biogeography history of Mycia, we tested for differences in environmental conditions (Mean Annual Temperature, Precipitation of the Driest Quarter, and Altitude) among evoregions using ANOVA.Precipitation of the Driest Quarter were transformed to log+1 and Altitude were log-transformed to guarantee normal distribution in the residuals of the ANOVA.We tested for spatial autocorrelation in the residuals of the ANOVA using Moran's I statistic test with 999 permutations and found that all environmental variables presented spatial autocorrelation (with Moran's I values >0.75).Thus, to account for spatial dependence, we used Generalized Least Square (GLS) models, followed by ANOVA test.For each environmental variable, we fitted five GLS models (nmle package version 3.1.162)(Pinheiro et al., 2023), using different correlation structures (exponential, gaussian, linear, rational quadratic, and spherical), and selected the model with lowest AIC.The GLS models were fitted using a random sample (10%) of the grid cells values to allow for feasible computational time.We repeated the model fitting 1000 times.We then calculated the median and standard deviation of the F-statistic and computed the p-value for the median F-statistic.Finally, we performed a post hoc Tukey's test to identify the differences between pairs of evoregions.
All the analyses were performed in R version 4.2.2 (R Core Team, 2022).The data and code necessary to reproduce the study are deposited in two repositories.The species distribution models, as well as the raster images for each predicted species distribution, are available at https:// doi.org/ 10. 5281/ zenodo.7861246, while data and codes for biogeographical regionalization, ancestral area reconstruction for Myrcia species and ANOVA tests presented in this study are available at https:// doi.org/ 10. 5281/ zenodo.786125.

| Species richness patterns
By using a combination of methods, we derived the distribution of 307 species of the genus Myrcia: 185 species using buffered radius around the points of occurrence, 62 using Maxent model with LOO cross-validation, and 60 using Maxent model with cross-validation using spatial blocks.The spatial distribution of the species richness showed a concentration of species near the east coast of South America, mostly in the Atlantic Forest domain but also in some highland areas in the Cerrado domain (Figure 1a).Regarding the different sections in the genus, all of them had species occurring in the Atlantic Forest, and most of the sections had a peak of species richness towards the east coast of South America (Figure 1bk).Moreover, all sections with narrower distribution-Gomidesia, Eugeniopsis, Clade 10, Sympodiomyrcia, and Reticulosae-were mostly restricted to the east coast of South America, in the Atlantic Forest dominion.The sections with broader distribution had a consistent pattern of showing a peak in species richness in the Atlantic Forest, but not restricted to that region.For example, Aguava had a peak in species richness ranging from the east coast towards the interior of the continent, including areas in the Atlantic Forest and Cerrado, while sections Myrcia, Calyptranthes, and Aulomyrcia presented high species richness distributed in the northern Amazon (Figure 1g,j,k), besides the Atlantic Forest.

| Biogeographic regionalization
The biogeographic regionalization using the evoregion method resulted in five evoregions for Myrcia (Figure 2), which we named A to E. Evoregion A (dark blue area in Figure 2a) covered the subtropical part of the Atlantic Forest and the Brazilian highlands.Evoregion B (light blue area in Figure 2a) covered the Atlantic coast, mostly the lowland area in the Atlantic Forest domain, and the interior of South America, mainly containing the Cerrado domain.Evoregion C (yellow area in Figure 2a) was the most heterogeneous region, comprising Central America, the Andes in the Equator and Colombia, the Guiana Shield, southwestern Amazon, and a few sites at the Caatinga region in the northeastern of South America.Evoregion D (dark brown area in Figure 2a) covered part of the western Amazon, while evoregion E (red area in Figure 2a) was placed in the central Amazon, including mostly the lowland portion of the Amazon.
Evoregions A and D, and B and C formed pairs of evoregions with similar phylogenetic composition, while evoregion E had the most different phylogenetic composition related to the other regions (Figure 2b).We also searched for transition zones based on our regionalization but we did not find any clear transition zone for the Myrcia genus (Figure S3).

| Biogeographical history
The ancestral reconstruction conducted using BioGeoBEARS had the DEC + jump dispersal as the best model (Table 1).The model estimated three parameters, d = 0.002, e = 1 × 10 −12 , and j = 0.074.We found differences among evoregions for the three environmental variables tested (Figure 4).Evoregion A had lower temperature compared to the others evoregions, which could be due to its high elevation and because it included the southernmost locations, having most of the sites in the subtropics; it also showed high precipitation in the driest quarter.Evoregion B had the lowest precipitation in the driest quarter, and high temperature, which was similar to Evoregions C and D, which in turn had high precipitation and low elevation.Evoregion E included the lower elevation and warmer places, with high precipitation in the driest period, although the difference in altitude is larger in relation to the other regions than for temperature.

| DISCUSS ION
We described the spatial diversity patterns of the genus Myrcia across the Neotropics, by providing species richness maps for the genus and its sections, and by presenting a phylogenetic regionalization which we interpret as distinct evolutionary regions (evoregions).Two of these evoregions-A and B-were the core areas of the Myrcia diversification.Moreover, it appears that the new climatic conditions after the mid-Miocene were responsible for the specialization of some clades in the evoregion A. We discuss each of these findings in the following paragraphs.Our results showed that the Atlantic Forest contains the main species richness of Myrcia in the Neotropics.In addition, the portion of Cerrado which presents high Myrcia richness is contiguous to the Atlantic Forest.That Cerrado area is part of the Brazilian Highlands, which has high altitude and heterogeneous topography.The pattern of species richness for Myrcia and its sections was consistent with the known importance of the genus for tree diversity in both domains (Lucas et al., 2018(Lucas et al., , 2019;;Murray-Smith et al., 2009;Oliveira-Filho & Fontes, 2000).In the Atlantic Forest, the highest species richness was found in the Bahia state, northeastern Brazil, and along the Serra do Mar mountain range in southeastern Brazil.These results are similar to the pattern of endemism of Myrcia in the Atlantic Forest (Murray-Smith et al., 2009).Furthermore, we showed that all sections of Myrcia have species occurring in the Atlantic Forest, which corroborate the proposition of using the Myrtaceae family as a model group for the diversity of this domain, as well as indicating that Myrcia could be used as a model group for the Atlantic Forest due to its high diversity and presence of all lineages of the genus, evidencing an evolutionary history linked to the domain (Lucas & Bünger, 2015;Murray-Smith et al., 2009).Following our maps of species richness, the main clades occupying the Cerrado are Aguava, Calyptranthes, 
and Myrcia.These are widespread clades, with species found in both dry and wet habitats (Lucas et al., 2018).
We provided a biogeographical regionalization for the genus Myrcia based on the turnover in the phylogenetic composition across the Neotropics.The regionalization highlighted a separation in the Neotropical region into two blocks, a southern block, containing evoregions A and B, where most of the diversification in the genus occurred, and a northern block containing evoregions C, D, and E. This southern-northern separation is consistent with the evolutionary history of the Myrteae clade, which has an austral origin and dispersed to South America from the western Gondwana via Antarctica (Thornhill et al., 2015;Vasconcelos et al., 2017).Furthermore, this pattern indicates a dispersal of the lineages in the genus from south to north of South America, as proposed in other studies (Amorim et al., 2019;Santos et al., 2017), but with continuous and widespread diversification in the southern block.On the other hand, the similarity in the phylogenetic composition showed another pattern in the relationship between evoregions.Evoregion A has similar composition to evoregion D, while evoregion B has similar composition to evoregion C.These relationships could be explained by the historical connections between the Atlantic and Amazon forests (Ledo & Colli, 2017), in which the shared lineages between evoregions A and D are related to the southeastern-northwestern connection route, and the shared lineages between evoregions B and C are related to the northeastern route.
The reconstruction of the biogeographic history in this study provides a different perspective on ancestral areas than previous studies (Amorim et al., 2019;Santos et al., 2017).The main distinction is due to the use of a phylogenetic regionalization in the biogeographic reconstruction.This enabled us to capture transition and specialization areas of some lineages into colder habitats because the boundaries of the evoregions were distinct of biome delimitations.Previous studies have indicated the origin of Myrcia in the Atlantic Forest (Amorim et al., 2019) or more specifically in the Montane Atlantic Forest (Santos et al., 2017).Our reconstruction showed evoregion B as the region of origin of the genus Myrcia, which covers large part of the costal Atlantic Forest.Therefore, our results together with those from previous studies seem to indicate that Myrcia has originated in the costal Atlantic Forest, with a further occupation of the evoregion A-an equivalent area of the Montane Atlantic Forest.
In the Oligocene, the epoch of origin of Myrcia, the Atlantic coast was covered by tropical rainforest (Morley, 2003), pointing to a likely origin of Myrcia on a tropical climate.After the mid-Miocene climatic optimum, the Earth experienced a cooling period (Graham, 2011) that together with the rising of Andes (Hoorn et al., 2010) et al., 2015).The main evidence we have in this regard is that, since an ancestral dispersed into evoregion A, its descendants have continued to diversify there.Therefore, lineages evolving in evoregion A seem to be a result of ecological selection towards cold-adapted lineages.We can also expect that, after mid-Miocene, lineages that have persisted in evoregion B should have been selected towards drought-adapted lineages.Future studies could test these hypotheses by incorporating functional traits related to cold and drought adaptation into phylogenetic comparative analysis and/or state-dependent diversification models.In addition, the regionalization and ancestral area reconstruction we performed are based on 96 species out of the c. 800 species in the genus, with probably low representation in the phylogeny of Amazonian species and sections Myrcia and Calyptranthes.Therefore, future analyses including more Myrcia species from these regions (e.g.Western Amazon) and clades are needed to enhance our understanding of the diversification patterns in the Amazon.
Our study did not require any fieldwork permission.

CO N FLI C T O F I NTE R E S T S TATE M E NT
The authors have no conflict of interest.

DATA AVAILABILITY STATEMENT
The analyses shown here are permanently archived in two Zenodo repositories. The data and scripts for the species distribution modelling are stored at https://doi.org/10.5281/zenodo.7861246. The

et al. (2019) and time calibrated by Vasconcelos et al. (2020). Amorim et al. (2019) used the external transcribed spacer (ETS) and internal transcribed spacer (ITS) of the nuclear ribosomal region and seven plastid markers (matK, ndhF, psbA-trnH, rpl16, rps16-trnQ, rpl32-trnL, and trnL-trnF) from 253 species to reconstruct the phylogenetic relationships within the genus Myrcia and its relatives (Blepharocalyx, Plinia, and outgroups from the tribes Syzygieae and Eucalypteae). They used both Maximum Likelihood (with the rapid bootstrap algorithm) and Bayesian Inference methods (Amorim et al., 2019). Node support was considered high when bootstrap support was ≥70% and posterior probability was ≥0.95 (Amorim et al., 2019). To time-calibrate the phylogenetic tree, Vasconcelos et al. (2020) used two calibration points based on Vasconcelos et al. (2017): one representing the Neotropical Myrteae clade, based on the oldest fossil records of Myrtaceidites verrucosus, and a secondary calibration point at the Myrcia crown node. The phylogeny is provided as supplementary data in

These parameters indicated that dispersal (d) was more important than area reduction (extinction, e) in the evolutionary history of the Myrcia lineages, with a chance of jump (j) dispersal occurring. This model estimated that the most likely region for the origin of Myrcia was evoregion B (Figure 3; Figure S4), as the reconstruction indicates that the earliest ancestors of the Myrcia lineages occupied only evoregion B. Between 20 and 10 Ma, some lineages dispersed to evoregion A and began to diversify there. For instance, the clades Eugeniopsis, Tomentosae, Gomidesia, Sympodiomyrcia, and Reticulosae had important diversification in evoregion A, whereas Aulomyrcia diversified almost completely within evoregion B, with only one clade (represented by two species in the phylogeny) diversifying in evoregion E.
In addition, evoregion C was important for the diversification of the section Myrcia and for a recent diversification in one clade of the section Aguava. The ancestral area reconstruction did not recover any diversification occurring within evoregion D.

FIGURE 1 Patterns of species richness of the genus Myrcia (a) and of the sections (clades) within the genus (b-k) in the Neotropics. (a) The map is based on all species for which we derived distributions; (b-k) the maps are based on the subset of species present in the phylogeny for each clade. Each map (a-k) has an independent richness scale, to highlight the pattern within each clade rather than the overall number of species.

FIGURE 2 Evoregions for the genus Myrcia (a) and their relationships (b) based on phylogenetic composition.

(Graham, 2011; Hughes et al., 2013). It seems that the modification of climate and the origin of novel ecosystems in South America, such as savannas, have contributed to the diversification and specialization of some Myrcia sections in evoregion A. Current climate indicates that the main distinction between evoregions A and B is that A is colder and B is drier. This difference in temperature and precipitation, together with the distinct phylogenetic composition, suggests that the niche characteristics of the main lineages differ between evoregions (Mittelbach & Schemske, 2015; Pyron

presented the distribution of Myrcia diversity and diversification patterns based on species richness and phylogenetic composition across the Neotropics. The Atlantic Forest domain contains the highest species richness, with contiguous highlands in Cerrado also presenting high species richness. The regionalization and biogeographical reconstructions showed that the southern block of the Neotropical region was the main diversification arena for Myrcia, which appears to have originated in a tropical climate. Moreover, the regionalization based on phylogenetic turnover allowed us to identify lineages that appear to have specialized into colder environments in the subtropics and the Brazilian highlands after the mid-Miocene.

ACKNOWLEDGEMENTS
We thank Gabriel Nakamura, Fabricio Villalobos, José Alexandre Diniz-Filho, Renan Maestri, Thaís Vasconcelos and one anonymous reviewer for discussions and suggested improvements to the manuscript. AVR was supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001 and by the Jane and Aatos Erkko Foundation. LD's research was supported by a CNPq Productivity Fellowship (grant 307527/2018-2). LD is a member of the National Institute for Science and Technology (INCT) in Ecology, Evolution and Biodiversity Conservation, supported by Ministério da Ciência, Tecnologia, Inovações e Comunicações/CNPq

FIGURE 4 Differences in annual mean temperature (a), precipitation of the driest quarter (b), and altitude (c) among the evoregions. Above each plot are shown the median F-statistic (±standard deviation) and the p-value for the median value from 1000 iterations of spatial model fitting, using 10% of the grid cells in each run. Equal symbols above the boxplots indicate that there is no difference between evoregions. The colour scheme for evoregions follows that of Figure 2. Note that the y-axes of panels b and c are presented on a log scale.
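The subsampling summary described in the Figure 4 caption (repeated model fits on 10% of the grid cells, summarised by the median and standard deviation of the F-statistic) can be illustrated with a minimal sketch. This is not the authors' code: it swaps the spatial model for a plain one-way ANOVA F-statistic, and the function names `one_way_f` and `subsampled_f`, the input layout, and the random seed are all assumptions made for illustration.

```python
import random
import statistics


def one_way_f(groups):
    """One-way ANOVA F-statistic for a list of groups (each a list of floats).

    Assumption: a plain ANOVA stands in for the spatial model of the paper.
    """
    means = [statistics.fmean(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    k = len(groups)
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def subsampled_f(cells, fraction=0.10, iterations=1000, seed=0):
    """Median and standard deviation of F over repeated random subsamples.

    cells: list of (evoregion_label, value) pairs, one per grid cell
    (hypothetical layout). Each iteration draws `fraction` of the cells
    without replacement and computes F across evoregions.
    """
    rng = random.Random(seed)
    size = max(1, round(fraction * len(cells)))
    f_values = []
    for _ in range(iterations):
        by_region = {}
        for region, value in rng.sample(cells, size):
            by_region.setdefault(region, []).append(value)
        # Keep only evoregions with at least two sampled cells.
        groups = [g for g in by_region.values() if len(g) >= 2]
        if len(groups) >= 2:
            f_values.append(one_way_f(groups))
    if len(f_values) < 2:
        raise ValueError("too few valid subsamples to summarise")
    return statistics.median(f_values), statistics.stdev(f_values)
```

Subsampling a fraction of the cells in each run is one common way to reduce the influence of spatial autocorrelation on the test; a faithful reproduction would replace `one_way_f` with the spatial model used in the archived scripts.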
reshaped the ecosystems in South America

TABLE 1 List of BioGeoBEARS models tested and parameter estimates. Models are ordered by AICc, with the best model at the top. Abbreviations: AICc, Akaike information criterion corrected for sample size; d, dispersal rate; e, extinction rate; j, jump dispersal parameter; N param, number of parameters; Neg. LL, negative log-likelihood; W. AICc, weighted AICc; ΔAICc, difference from the lowest AICc.
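The quantities abbreviated in Table 1 are related by standard formulas: AICc = 2k + 2·NegLL + 2k(k + 1)/(n − k − 1), ΔAICc is the difference from the lowest AICc, and the AICc weight of a model is exp(−ΔAICc/2) divided by the sum of that term over all models. The sketch below shows the arithmetic only. The function names are hypothetical, the sample size `n_obs` is an assumption (for example, the number of tips in the phylogeny), and any input values used with it are illustrative rather than the values reported in Table 1.

```python
import math


def aicc(neg_log_lik, n_params, n_obs):
    """AICc from a negative log-likelihood, parameter count and sample size."""
    aic = 2 * n_params + 2 * neg_log_lik
    correction = (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic + correction


def rank_models(models, n_obs):
    """Rank models by AICc.

    models: dict mapping a model name to (neg_log_lik, n_params).
    Returns a list of (name, AICc, delta_AICc, weight) sorted best-first.
    """
    scores = {name: aicc(nll, k, n_obs) for name, (nll, k) in models.items()}
    best = min(scores.values())
    deltas = {name: score - best for name, score in scores.items()}
    relative = {name: math.exp(-0.5 * delta) for name, delta in deltas.items()}
    total = sum(relative.values())
    rows = [(name, scores[name], deltas[name], relative[name] / total) for name in scores]
    return sorted(rows, key=lambda row: row[1])
```

Calling `rank_models({"DEC": (120.0, 2), "DEC+J": (110.0, 3)}, n_obs=100)` with these made-up likelihoods places the three-parameter model first, because its gain in likelihood outweighs the penalty for the extra parameter.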
FIGURE 3 Ancestral area reconstruction for the genus Myrcia based on evoregions. Only the most likely ancestral area for each node is shown. Letters from A to E indicate the evoregions; when combined, those letters indicate a species range occupying more than one evoregion. Colour codes for each range are presented in the figure legend; different colours over branches represent the most likely inherited range; colours over the tips of the phylogeny represent the current range of each species used in the ancestral reconstruction. Nodes with high support values (bootstrap support ≥70% and posterior probability ≥0.95) are identified with asterisks (following Amorim et al., 2019).
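The two selection rules stated in the Figure 3 caption (showing only the most likely range at each node, and marking nodes whose bootstrap support is ≥70% and posterior probability is ≥0.95) can be sketched as follows. The data layout, node identifiers and function names are assumptions made for illustration and do not reflect the format of BioGeoBEARS output.

```python
def most_likely_ranges(node_probs):
    """Pick the range with the highest marginal probability at each node.

    node_probs: dict mapping a node id to a dict of
    {range_label: probability} (hypothetical layout).
    """
    return {node: max(probs, key=probs.get) for node, probs in node_probs.items()}


def has_high_support(bootstrap, posterior, bs_min=70.0, pp_min=0.95):
    """True when both thresholds from the caption are met."""
    return bootstrap >= bs_min and posterior >= pp_min
```

For a node with probabilities `{"B": 0.80, "AB": 0.15, "A": 0.05}`, `most_likely_ranges` returns `"B"`; that node would carry an asterisk only if `has_high_support` is also true for its bootstrap and posterior values.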