Climate change is predicted to impact the global distribution and richness of pines (genus Pinus) by 2070

Climate change is altering habitat suitability for many organisms and modifying species ranges at a global scale. Here we explored the impact of climate change on 112 pine species (Pinus), fundamental elements of Northern terrestrial ecosystems.

Therefore, detailed knowledge of the climatic niche of plant and animal species is crucial for predicting their response to ongoing climate change.
Species distribution models (SDMs) are a widely used approach to estimate species niches by relating field observations to environmental predictor variables (Booth et al., 2014;Guisan et al., 2017).
This approach can benefit from the consideration of some aspects that might influence the reliability of current predictions and future projections (Araújo et al., 2019;Araújo & Rahbek, 2006;Lyu et al., 2022).For example, sampling bias, which may originate from uneven sampling efforts across space, time and/or taxa, is common in biodiversity databases used for fitting SDMs such as the Global Biodiversity Information Facility (GBIF) (Meyer et al., 2015) and can influence predictions of habitat suitability (Beck et al., 2014;Merow et al., 2014).Similarly, data quality issues, such as those arising from imprecise positional accuracies or from plantations, can cause uncertainty.Selecting the correct environmental predictors is also of crucial importance, and although there is often a tendency to include as many predictor variables as possible, the avoidance of a high number of highly correlated predictors can improve SDMs (Brun et al., 2020;Guisan et al., 2017).Simultaneously, the inclusion of non-climatic relevant dimensions of a species' niche can improve SDM performance.For instance, in the case of plants, the incorporation of predictors such as soil properties seems to enhance predictive power (Hageer et al., 2017).However, it is also important to keep in mind that establishing an association between biological occurrences and environmental predictors is complex and that results might vary across algorithms (Brun et al., 2020;Merow et al., 2014;Peterson et al., 2018).Forecasts can also be strongly influenced by uncertainty in the projections of future climate scenarios (IPCC scenarios and global circulation models [GCMs]) (Goberville et al., 2015;Thuiller et al., 2019).Therefore, SDM accuracy and reliability depend not only on their robustness to known sources of error but also on their ability to incorporate and manage multiple sources of uncertainty (Araújo et al., 2019).Then, the modelled outcomes can be assembled to illustrate both the mean trends as well as variation around these means as possible outcomes of projected futures.
SDMs frequently assume that species' niches tend to remain unchanged over time, leading to correlation between the niches of related species, namely niche conservatism (Pearman et al., 2008).This, in turn, could hamper the response of species to cope with environmental change (Wiens et al., 2010;Wiens & Graham, 2005).
However, the generality of this assumption is unclear.Whether niche conservatism limits an organism's capacity to respond to novel conditions will ultimately depend on the characteristics (e.g., breadth) of the conserved niche and their correspondence with future environments.Another common assumption underlying most SDM approaches is the existence of environmental equilibrium, i.e., species occupy all the environmental space suitable to them, while no unsuitable area is occupied (Jump & Peñuelas, 2005).However, species are rarely in equilibrium with the environment, particularly with climate.Therefore, mismatches are likely between the climatic conditions existing in the current distribution of a species and the potential range of climatic conditions that are suitable, yet not explored due to other biotic and abiotic (non-climatic) factors.In other words, the fundamental niche is not fully occupied (Bocsi et al., 2016;Booth, 2017;Booth et al., 1988Booth et al., , 2015;;Booth & McMurtrie, 1988;Hortal et al., 2012;Lobo et al., 2010;Perret et al., 2019).Therefore, accounting for the capacity of organisms to cope with conditions not present in their current ranges can improve forecasts of future range dynamics (Araújo et al., 2013;Early & Sax, 2014;Maiorano et al., 2013;Schurr et al., 2012).New methods have been developed to consider biological variance into SDMs (e.g., through phenotypic and genotypic diversity; Aguirre-Liguori et al., 2021;Bush et al., 2016;D'Amen et al., 2013;Pearman et al., 2010;Serra-Varela et al., 2015;Smith et al., 2019).In particular, the consideration of species' evolutionary trends might provide useful information to overcome this limitation.For example, Morales-Castilla et al. (2017) used phylogenetic information to improve the fit of SDMs.However, their approach requires community data for SDM calibration; thus, its applicability is hampered by presence-only or single-species presence-absence data, such as those contained in databases like GBIF.A methodology that combines phylogenetic information and SDMs using this sort of presence data could therefore be more generalizable.Here, we put forward an approach based on the phylogenetic reconstruction of niches encompassing ecological differences between related species.This approach uses recent niche evolution to deduce climatic conditions that are not present in current ranges of species but might be still included in their fundamental niche, thus being potentially relevant in their response to climate change.It only requires reliable species-level phylogenies and presence data, which makes it potentially applicable to a wide variety of organisms.
Pinus is an adequate system to work within a SDM framework.
It is a genus of tractable size (112 species) the members of which are important components of Holarctic and, to a lesser extent, Subtropical forests.As a result, they are for the most part very well-studied and there is abundant and precise available data on the distribution of most species (Lyu et al., 2022;Richardson, 2000).
Pine species span a wide variety of biomes, including arid (desert or climate change, data uncertainty, distribution shifts, edaphoclimatic niche, GBIF, habitat suitability, IPCC, niche evolution, species distribution models semi-desert), humid (tropical or subtropical forests) and cold environments (e.g., alpine or taiga), and are often found on the timberline, which makes them potentially very sensitive to any climatic shift (Allen & Breshears, 1998;Richardson, 2000).In addition, the evolution of this group is well-studied and both a rich fossil record reaching 100 My and well-resolved phylogenies are available (Alvin, 1960;Parks et al., 2012;Ryberg et al., 2012;Saladin et al., 2017).The abundance, diversity and reliability of data make Pinus a good system to apply SDM methodologies.
Additionally, understanding the response of pines to climate change is pressing, since several species are key elements in ecosystems that are particularly sensitive to climate change (Christensen et al., 2007;Giorgi & Lionello, 2008;Seager et al., 2007).Many species occupy and define boreal ecosystems where the expected increase of temperature could favour a shift of their Southern edge towards Northern latitudes (Birch et al., 2019;Lloyd & Bunn, 2007;Thuiller et al., 2005).Similarly, pines occupy subalpine ecosystems, which represent "cold islands" surrounded by warmer areas, and risk "falling upwards" without the possibility of latitudinal or altitudinal migration (Krajick, 2004;Sánchez-Salguero et al., 2012;Thuiller et al., 2005).Several others are bound within very small ranges, like the island endemics P. canariensis and P. luchuensis and even small environmental changes might drive them to extinction in their native ranges (Harter et al., 2015).Additionally, several pine species appear to occur in places that are already close to the arid tree line, like Mediterranean or SW North American pines.The performance of these species is compromised by intense summer droughts, which are already causing mortality events and even distribution shifts in dry ecotones (Allen et al., 2010;Allen & Breshears, 1998), and are predicted to increase in frequency and severity in the near future (Giorgi & Lionello, 2008;Seager et al., 2007).Aridity is also expected to increase in tropical areas of America and Asia (Christensen et al., 2007;Hulme & Viner, 1998), which could increase water stress and mortality risk for (sub)tropical pines (Allen et al., 2015).Given the ecological and economic importance of this group, projecting the future suitability of global environments for Pinus spp.presents an urgent scientific challenge.
Here, we project global distribution patterns under climate change across the genus Pinus.To do this, we apply a novel approach that incorporates multiple layers of biological and environmental data, along with evolutionary information into SDMs.This approach is flexible to different biogeographic sources, so we coupled GBIF occurrences along with expert knowledge and developed a sampling process to limit the uncertainty caused by the variability in data quantity and quality.Modelling was performed using climate and soil variables and combining different algorithms.We also considered the output from several climate models and climate change scenarios to project habitat suitability under future climate conditions.Moreover, we applied a phylogenetic approach to include recent niche evolution into areas for which SDMs showed conflicting predictions.In that way, we expect to expand our ability to predict habitat suitability beyond climatic conditions included in extant distributions.Finally, we performed an independent validation of our models for 12% of the species using naturalized occurrences of invasive pine species.
Our approach predicts marked changes in habitat suitability under climate change for most pine species, suggesting the possibility of significant range shifts.However, model projections differed widely among lineages and, especially, biogeographic regions.
Based on our results, range loss is to be expected in regions where climate change is likely to result in increased aridity, such as the Mediterranean Basin or South North America.Conversely, newly suitable habitat space might open up for pines in cold areas where both precipitation and temperature are expected to increase, such as Boreal areas of Eurasia and North America, or the Tibetan Plateau.
The addition of recent evolutionary history entailed substantial changes in SDM predictions for a few, but not all, species.For instance, pines whose ancestors were inferred to occupy niches colder than their extant habitat were predicted to have more chances of migrating latitudinally.The independent analyses performed on naturalized occurrences endorsed the models for most of the species included in the validation and supported that the phylogenetic correction may be useful in some cases.Our work constitutes the first attempt to predict the impact of climate change across pines at a global scale considering recent evolutionary history.The results provide useful information for the conservation and management of pines, but also a new approach to consider recent evolution into SDMs requiring only presence data and a phylogeny of the group.

| ME THODS
A detailed flowchart showing the main steps of our analyses can be found in Appendix S1.
Maps from Critchfield and Little (1966) were digitized and rasterized at a 50 × 50 km resolution for further analyses, considering them as a representation of the natural geographic range of pine species based on reliable ecological expertise.We created a 1° buffer around the natural distribution of all species in order to reduce the probability of not covering the whole natural range of each species (see Appendix S2 for further details).
Note that the historical distribution of many pine species has been influenced by human activities over millennia (Richardson et al., 2007).This, however, should not negatively influence our models as the goal is to predict areas with climatic conditions suitable for pine species.We focused on regions known to include self-sustained populations of each species.Therefore, we trained our models to predict areas with climatic conditions suitable for recruitment, independently of other factors like, for example, human influence on dispersal.See the independent validation of the models for an assessment of our approach's ability to achieve this goal.

| Species occurrences
Occurrence data for each pine species were downloaded from the Biodiversity Information Facility (GBIF; www.gbif.org) using gbif {dismo} (Hijmans et al., 2017;Raymond et al., 2022).We adjusted the taxonomy applied in GBIF to ensure that it matched that of Gallien et al. (2016) (Appendix S3).To consider the uncertainty regarding the location of occurrence records, we included positional accuracy in our analyses.We used the GBIF variable "coordinatePrecision" instead of "coordinateUncertinityInMeters" because the latter treated as high precision points some occurrences falling clearly outside of forest patches (see Appendix S4 for further details).We considered as high precision points those occurrences with coordinate precision between 1 and 25, and all others as low precision.This information was later used to weight occurrences in the models (see Modelling section for details).In addition, precision was considered during the process of occurrence resampling.We performed a resampling procedure to reduce the influence of spatial autocorrelation and sampling bias on model performance (Beck et al., 2014).We established a maximum of 3 occurrences per cell (50 × 50 km) inside the natural range, discarding those occurrences outside that range.We prioritized high-precision points, which were selected using altitude as stratification criterion if more than 3 per cell were present to cover the climatic variability within cells.We created a low-precision random occurrence in those cells of the natural range with no occurrence record (see Appendix S5 for further details).This sampling procedure enabled us to combine two complementary sources of data while accounting for their reliability: GBIF, which includes high and low precision points but may lack data for the complete range of a species, and expert knowledge, which lacks precision but describes the entire natural distribution of species.

| Pseudo-absences
Pseudo-absences (i.e., locations of hypothesized absence) were created inside a buffer of 22.5° around the mapped distribution ranges, excluding said distributions to ensure that occurrences and pseudoabsences did not overlap.Inside this buffer zone, we performed a proportional stratified sampling of the space to cover all environmental combinations using the {ecospat} R package (Broennimann et al., 2016).Pseudo-absences were distributed across environmental strata in a number proportional to the stratum size.We assigned ten times as many pseudo-absences as occurrences per species, i.e., the ratio of presences/pseudo-absences was 1/10.This is considered a good presence/absence ratio to maximize model accuracy (Barbet-Massin et al., 2012).In species with few occurrences, we increased the number of pseudo-absences in order to ensure full coverage of all environmental strata (see Appendix S6 for further details).During model calibration, pseudo-absences were given lower weight than presences such that the total weights of pseudo-absences equalled the total weight of presences (see Modelling section for more details).Therefore, our models were based on data representing environmental conditions of confirmed occurrences and likely absences (outside the known distribution range) while also quantifying the reliability of observed presences.We used a buffer around species' natural ranges to create absences because we intended to include the explicit climatic and edaphic conditions in the vicinity of the confirmed distribution of each taxon.Using this approach, we also minimized potential biases introduced by globally acting factors such as dispersal limitation or historical constraints, i.e., we limited the influence of absences caused by non-edaphoclimatic factors (Lobo et al., 2010;Svenning et al., 2011;J. C. Svenning & Skov, 2004;Thuiller et al., 2004;VanDerWal et al., 2009).

| Predictor variables
We chose the target resolution for modelling pine distributions to be 10x10 km.We discarded finer resolutions because of the spatial scale used in this study (i.e., 50 × 50 km cells of natural ranges).
In cases where variables were available at higher (finer) resolution, we aggregated cells to 10 × 10 km resolution using the bilinear interpolation option of resample {raster} (Hijmans et al., 2017), i.e., by averaging cell values.We used 19 bioclimatic variables calculated with biovars in {dismo} (Appendix S7).These bioclimatic variables are originally based on minimum and maximum monthly temperatures as well as monthly precipitation sums.Instead of precipitation, we used a moisture index calculated as the difference between annual precipitation and potential evapotranspiration, which in turn was calculated from mean monthly temperature and global radiation (Turc, 1961;Zimmermann & Kienast, 1999).The basic climatic data were downloaded from WorldClim version 1.4 except for global radiation, which was only available in version 2.0 (Fick & Hijmans, 2017;Hijmans et al., 2005).
In order to cover the projection uncertainty that differences in climate models and scenarios are likely to cause, we computed our suitability maps considering 28 different combinations and four representative concentration pathways (RCPs: RCP26, RCP45, RCP60, RCP80) (Hijmans et al., 2005;Stocker et al., 2013).
Bioclimatic variables under each of these scenarios were derived from WorldClim (Hijmans et al., 2005) following the same approach used for current bioclimatic variables.Soil characteristics can be highly relevant to define plant niches.Thus, in addition to climatic conditions, we included physical and chemical properties of the soil from SoilGrids (https:// soilg rids.org/ ; see Appendix S7 for further details about the environmental variables).Note, however, that we assumed soil variables to remain unaltered under climate change.This is a limitation of our analyses given that soil properties could be affected by changes in climate (e.g., through modifications of precipitation regimes).We attempted to reduce its impact by establishing the same ratio of climate/soil variables for all species (see below).Moreover, the ability of soil variables to predict current distributions was much lower compared to climatic variables (Table S1 Appendix S7); thus, the impact of this limitation should not be very important.
Environmental variables tend to co-vary, which can impact SDM projections (Brun et al., 2020;Williams et al., 2012).To avoid multicollinearity problems, we performed a selection of predictors.First, we clustered species into groups according to which climate and soil variables best explained their distribution (see Appendix S8 for further details).Then, we performed the selection of variables within each group, considering the corresponding environmental data.
Predictors were selected based on their predictive power (Guisan & Zimmermann, 2000) and signs of collinearity (Heiberger, 2016).In order to avoid biases related to the number of soil variables, which are predicted to remain unaltered under our climate change projections, we established a fixed ratio between the number of climatic and soil variables across clusters (3 and 2, respectively).In the case of species with few occurrences, we further removed variables according to their predictive power and the climate/soil ratio to ensure a minimum of 10 occurrences per variable (Brun et al., 2020;Guisan et al., 2017) (see Appendix S9 for further details about variable selection).

| Modelling
We used different modelling methods in order to consider the uncertainty related to algorithm selection.We employed three algorithms representing different approaches: parametric (GLM; glm {stats}) and semi-parametric regressions (Generalized Additive Models, GAM; gam {gam}) (Hastie, 2018;R Core Team, 2017), along with a tree-based method (Random Forest, RF; randomForest {randomFor-est}) (Liaw & Wiener, 2002).In all models, precise and imprecise presences had a weight of 1 and 0.5, respectively.In addition, the weights of pseudo-absences were set such that the sum of absence weights equalled that of all presences (Barbet-Massin et al., 2012).
We randomly partitioned the data 12 times into training and evaluation datasets (70 and 30%, respectively) totalling 36 predictions of habitat suitability per species (3 model types × 12 data partitions).Continuous predictions of habitat suitability from GLM and GAM were binarized to combine them with the binary predictions from RF. Predictions were binarized using the best True Skill Statistics (TSS), which was obtained from each evaluation dataset (ecospat.max.tss{ecospat}) (Allouche et al., 2006;Broennimann et al., 2016).This method has been shown to produce highly accurate predictions compared to other approaches (Jiménez-Valverde & Lobo, 2007).For each species, binary predictions coming from all model-partition combinations were assembled.For each cell, we calculated the percentage of binary predictions that assigned the cell as suitable for a given species.The final ensemble shows the certainty of habitat suitability across modelling choices: high certainty of high and low suitability for 100 and 0%, respectively, while intermediate certainty values represent discrepancies among modelling choices.
For model evaluation, Kappa and, in the case of continuous predictions (GLM and GAM), the area under ROC curve (AUC) were calculated in each evaluation dataset (evaluate {dismo}).Since these two metrics are affected by the lack of equal proportion of presences and pseudo-absences (Golicher et al., 2012;Liu et al., 2013), we also evaluated models using TSS (Allouche et al., 2006;Liu et al., 2013).
Models were then projected to future conditions using the 28 combinations of climate change data, totalling 1008 projections per species (36 models × 28 climate scenarios).Projections of GLMs and GAMs were binarized using the same TSS thresholds used for current conditions.We assembled these 1008 binary projections for each species into a single raster as explained above for current predictions.See Appendix S10 for further details about the modelling approach.

| Phylogenetic analyses
We accounted for recent evolution in ecological preferences of pines as a way to incorporate niche space not occupied in the extant distribution of species.This is especially relevant for Pinus, a group in which climatic disequilibrium seems to be frequent, especially for species with small ranges (Perret et al., 2019).Therefore, it is possible that not all suitable climatic conditions are represented by current distributions.
Given the existence of conservatism for the climatic niche of Pinus (see below), we assumed that the fundamental niche of current species might still lie within (in niche space) ancestral niches recently occupied by their lineage.We approximated unexplored regions of the fundamental niche by reconstructing realized bioclimatic niches across the phylogeny.We used for that the two climatic variables that best explain pine distributions, one for temperature and another for humidity (BIO4 and BIO17; see below).
For each variable, the "phylogenetic range" of extant lineages was defined as the values encompassed between the current climatic value and that of the most recent common ancestor (MRCA).
In other words, we considered the variation in niche conditions since the last evolutionary divergence event.We expected that the inclusion of evolution until the MRCA limited the consideration of ancestral niches that may no longer be included in the fundamental niches of extant species.In addition, this approach makes the method easily reproducible and applicable to other systems.In order to incorporate these unexplored regions of the fundamental niche in our predictions, we used the phylogenetic climatic range to correct areas for which SDMs yielded uncertain habitat suitability by 2070.This includes habitats projected to be suitable by only 25%-75% of model-data combinations.In these areas with uncertainty, we calculated the proportion of climate change scenarios for which the expected climate of a given cell fell inside the phylogenetic range.In each climate change scenario, we only considered cells that fell inside the phylogenetic range of both climatic variables We ensured in that way to only included areas suitable for the highest possible number of dimensions of the climatic niche (temperature and humidity).We then modified the predictions of suitability by giving a high predicted suitability to those cells within the phylogenetic range, effectively including them in the realized climatic niche of the species.We only applied this phylogenetic correction to areas with ambiguous results to avoid confounding robust projections (Schluther et al., 1997).
Moreover, we ensured that our estimations gave more weight to the most recent evolution.For that, we scaled projections according to the "position" of the expected value under climate change within the estimated phylogenetic range.Those cells with an expected climate closer to the ancestral state of the node (MRCA) were penalized, and those with values close to current conditions were favoured.The rationale for this linear scaling is that we assume pine species to be more likely to retain past niches that are closer to the current state of their niche.This is supported by the fact that Pinus shows evidence of niche conservatism, and hence, pine niches seem to exhibit temporal autocorrelation (see below).Therefore, it is reasonable to assume niches reconstructed in deeper evolutionary times to be more distant and hence less likely to be retained as part of current fundamental niches.Values equal to the ancestral state were considered as 0 (in effect placing them outside the phylogenetic range) whilst those values equal to the current niche value were considered to fall fully within the phylogenetic range (i.e., were considered as 1).Cells with climatic values between current and ancestral states were assigned a value of suitability proportional to the distance to the extremes (i.e., linear scaling).The results across climate change scenarios were assembled into one single raster per species and converted to a proportion, i.e., from 0 to 1: Values closer to 1 means that the expected future climate of a given cell falls inside the phylogenetic range for multiple scenarios and it is close to the current state; values closer to 0 means that the expected climate does not fall within the range in multiple scenarios and/or it is far away from the current state (see Appendix S11 for further details about the specific steps of the phylogenetic correction).Therefore, cells inside the phylogenetic range and with conditions not very far away from those represented by the current niche were considered as suitable even if their expected climate is not present in the current distribution.Even if SDMs predict these cells as unsuitable under climate change, their climatic conditions could be still included in the unexplored space of current fundamental niches.
Phylogenetic analyses were performed using the fossil-dated phylogeny FBDl of Saladin et al. (2017).The ancestral climate niche was reconstructed along this phylogeny using two climate variables, one representative of adaptations to temperature conditions and the other as a proxy of adaptations to varying humidity.The variables were selected according to their predictive power in SDMs (Wiens et al., 2010) across all pines (i.e., were highly ranked to predict the distribution for multiple species; see Appendix S7).The chosen variables were "Temperature Seasonality" (BIO4) and "Humidity of Driest Quarter" (BIO17).The current niche state of these variables was estimated as the median across the natural range of each species.In other words, we considered all climatic values falling within the current distribution.Following Guerrero et al. (2013), we compared four evolutionary models for ancestral reconstruction, namely: White noise, Brownian motion (BM), Lambda (λ) and Ornstein-Uhlenbeck (OU).We found that both climatic traits (i.e., BIO4 and BIO17) exhibited λ values significantly different from 0, along with a low signal of selection (BM and OU models were indistinguishable).These results suggest the existence of niche conservatism in Pinus and support the selection of the simpler (i.e., with less parameters) BM models to perform the ancestral reconstruction (see Appendix S12 for further details about the selection of niche evolution models).In these reconstructions, we estimated the ancestral state for each climatic variable as the most likely climatic value for the common ancestor of each group of sister species.This ancestral state was estimated using ace {ape}.Ancestral niches were therefore based on the extant climatic values of the relevant sister taxa and also on those of the rest of species considering their phylogenetic relationship to the node of interest.
Because methods based on ancestral state reconstruction are fraught with uncertainty stemming from the phylogenetic tree used (Schluther et al., 1997), we also reconstructed the ancestral state of the two climatic variables under BM and OU considering other phylogenetic hypotheses, specifically the node-dated phylogeny NDbl from Saladin et al. (2017).Ancestral states under the two phylogenies were quite similar both for BIO4 and BIO17 (ρ > 0.9 and p < 2.2e−16 in both cases; Appendix S13).

| Estimation of range loss and change
We estimated range loss for each species as the proportion of current suitable area predicted to be lost under future conditions.
Current suitable area was calculated as the number of cells with a certainty of habitat suitability ≥75% across all 36 current predictions (3 model types × 12 data partitions).As explained in the Modelling section, these suitability values were obtained from the ensemble of binary predictions generated for each combination of model and data partition (Appendix S14).We used a threshold of 75%, so as to focus our predictions on the size of the area with high confidence of suitability for pine species.In other words, we selected only cells considered as suitable across a wide range of sources of uncertainty in order to prioritize areas more likely to be suitable in the future, which should be useful for management purposes.Predictions were limited to a 12.5° buffer around species distributions to avoid unrealistic predictions like the inclusion of land masses too far outside the recognized historic range of a species (Zurell et al., 2018).Then, we calculated the number of cells predicted to remain suitable under future conditions within the 12.5° buffer.We considered cells suitable according to 75% or more of the future projections (3 types of model × 12 random data partitions × 28 climate scenarios = 1008 future projections; Appendix S14).Then, the number of suitable cells under future conditions was divided by the number of current suitable cells to obtain a metric of range loss.Finally, the same calculation was performed also considering the cells that our phylogenetic correction rendered suitable under future conditions.However, we only considered cells with a phylo-corrected suitability ≥0.1 to avoid areas with expected climates outside the phylogenetic range in multiple scenarios and/or too close to the MRCA niche.In this way, we selected climatic conditions not very far away from the current state, thus being more likely present in the fundamental niche of current species.
Range change was estimated as the difference between the projected suitable area under future conditions and the current suitable area, divided by the latter (all areas are estimated as number of cells).
In this case, projections of future suitability were not limited to areas that are currently predicted to be suitable.Instead, we considered all areas included in the 12.5° buffer around species distributions.In that way, we considered not only the loss of current suitable areas but also potential increases of suitability in previously unsuitable areas.As in the case of range loss, suitable areas were assigned according to a threshold of 75% in suitability certainty and 0.1 of phylogenetic suitability (see Appendixes S15 and S16 for further details about range change/loss calculations).

| Changes in global species richness
Predictions under current and future conditions for all species were combined into global maps.First, the ensembles of projections under current and future conditions of each species were binarized using a threshold of 75% within the 12.5° buffer mentioned above.Binarized ensembles across the whole genus were then summed to obtain a prediction for the number of species in each cell, i.e., predicted pine richness under current and future conditions, respectively.In the case of future projections, this was repeated also considering as suitable those cells of each species with phylo-suitability ≥0.1.Finally, we calculated the difference in predicted pine richness between current and future conditions (see Appendix S17 for further details).
As previously explained, we considered as suitable those cells with a suitability certainty ≥75%.In this way, we focused on regions with a higher probability of being suitable in the future, which could be more relevant for prioritization in forest management.This, however, could have the side effect of discarding an excessive number of potentially suitable regions if the selected threshold is too high.Therefore, we explored the impact of threshold selection on range loss/change and pine richness predictions.We calculated these metrics across 101 thresholds (from 0 to 100).Results showed that only thresholds above 75% tended to produce extreme reductions in suitable area, being likely overconservative.This suggests that we have found a good compromise between selecting regions with a high certainty of suitability, which could be relevant for forest management, without discarding an excessive number of potentially suitable regions.See Appendix S18 for further details.

| Independent validation of the models
We performed an independent validation of our models (with and without phylogenetic information) by assessing their performance against previously unseen, independent data.We used an independent dataset with naturalized occurrences of invasive Pinus species across 5 continents (Perret et al., 2019).They included only data from self-sustained populations and outside the natural ranges.The use of naturalized occurrences is usually posited as useful to characterize the full range of climatic conditions a species can tolerate (Booth, 1991(Booth, , 2023)).Therefore, we could obtain relevant information about the ability of our models to capture that range of conditions through the evaluation of their performance beyond natural ranges.We applied the same processing of the occurrences as in the data used for the main analyses.This resulted in a cleaned dataset with naturalized occurrences outside the current ranges of 14 pine species.We used these occurrences to evaluate the previously fitted models of each species, partition and algorithm, totalling to 504 models (3 algorithms × 12 data partitions and 14 species).We performed the evaluation using the continuous Boyce index, a metric that has been shown to reliably assess the performance of presenceonly models (Boyce et al., 2002;Hirzel et al., 2006).We evaluated how well each model was able to predict as suitable those areas showing a higher proportion of naturalized occurrences.This metric ranges from −1 to 1, being 0 the random expectation, i.e., no correlation between suitability predictions of the model and the proportion of presences.We obtained a value of the Boyce index for each model using the function Boyce{modEvA} (Barbosa et al., 2013).We then tested whether the median Boyce index value across the 12 partitions of a given species and algorithm combination was significantly higher than 0 and 0.5 using a Wilcoxon signed rank test.The corresponding p-values were corrected for multiple comparison using the False Discovery Rate (FDR) and considering a FDR value of 0.05 as threshold (Benjamini & Hochberg, 1995).Finally, we repeated the calculation of the Boyce index, but also considered as suitable those cells within the phylogenetic range for a given species (phylo-suitability ≥0.1).We then tested for significant differences in the Boyce index with and without the phylogenetic correction using a paired Wilcoxon test.See Appendix S19 for further details about the independent model validation.

| Predictions under current conditions
In general, ensembles of predicted habitat suitability under current conditions fit species ranges, suggesting a good accuracy of SDMs.This is supported by all metrics of model evaluation.Kappa gave a mean value of 0.75 ± 0.12, 0.79 ± 0.11 and 0.74 ± 0.14 for GLM, GAM and RF, respectively (mean ± standard deviation across species).TSS showed even higher values with a mean value of 0.90 ± 0.06, 0.91 ± 0.06 and 0.91 ± 0.04 for GLM, GAM and RF, respectively.Finally, mean values of AUC were 0.98 ± 0.02 and 0.98 ± 0.01 for GLM and GAM, respectively.See Appendix S14 for the visualization of uncertainty in projections and evaluation metrics of all pine species.Model performance was not optimal in all cases.Some species with small ranges like those inhabiting islands had few independent points, thus the number of variables that could be used for analyses was lower than four, with as few as one in P. amamiana (a narrow Japanese endemism).In these cases, models were likely not accounting for all important niche dimensions, which could explain a comparatively lower performance.For example, P. amamiana showed kappa values of 0.39 ± 0.01, 0.42 ± 0.14 and 0.12 ± 0.03 for GLM, GAM and RF, respectively (Appendix S14).The models for some pines with broad ranges also showed only moderate performance.For instance, the models of P. contorta, one of the most widespread pines and native to North America, yielded kappa values of 0.53 ± 0.01, 0.61 ± 0.01 and 0.75 ± 0.01 for GLM, GAM and RF, respectively (Appendix S14).In these cases, the broad range of environmental conditions under which these species occur might have compromised model calibration.

| Range loss and change under future conditions
For most species, SDMs suggest a decrease of suitable area relative to current predicted suitability, i.e., part of their current range is predicted to become unsuitable (range loss; Figure 1).More than half of pine species (58%) are predicted to experience suitability reductions in more than 10% of their current predicted range.Similarly, 36% of species are predicted to suffer range losses of more than 20%.However, when the analyses are not limited to areas that are currently predicted to be suitable, and hence the possibility of range shift and expansion is considered (range change), only 21% of pine species (24/112) are predicted to experience a decrease ≥20% of total suitable area.This reduction in the number of affected species is likely due to increases of suitability in previously unsuitable areas, which can offset the range losses for some species (Figure 1).Around 21% of pine species are predicted to experience an increase in total suitable area, up to 40% for some species.Phylogenetic corrections moderately reduced the risk of complete range loss and even increased the probability of range expansion (i.e., positive values of range change; Figure 1).Under these conditions, the number of species with a predicted range loss of more than 20% of total suitable area was reduced from 36% to 27% (median range loss across species of 10.87 (15.07) % and 14.06 (18.22) % with and without the phylogenetic correction, respectively; variability expressed as Interquartile Range, IQR).Similarly, the amount of pine species that are predicted to experience an increase in their total suitable area increased from 21% to 42% (median range change across species of −1.16 (17.03) % and −6.56 (17.56) % with and without the phylogenetic correction, respectively; variability expressed as IQR).For instance, models incorporating the phylogenetic correction predicted a loss of suitable area for P. clausa in its native range of Southeast North America at 40%.This represents a decrease of approx.50%, since without the phylogenetic correction this species is predicted to lose 90% of its total suitable area (Figure 2; Appendix S15).Similarly, without the phylogenetic correction, one of the Taiwanese endemic pines (P.morrisonicola) is predicted to suffer a negative range change (−12%, i.e., net loss of total suitable area); in contrast, suitable area is predicted to increase 69% with the phylogenetic correction (Figure 2; Appendix S15).
See Appendix S14 for predictions under climate change for all species and Appendix S15 for a numeric comparison of range loss and change with and without the phylogenetic correction.In addition, Appendix S16 (Table S1 Appendix S16) shows results for the phylogenetic correction without applying the linear scaling.We simply set the phylogenetic suitability to 1 for all cells with an expected climate inside the phylogenetic range (i.e., not considering its relative position respect to the current and ancestral states).As explained in Methods, the linear scaling was applied to reduce the influence of ancestral pine niches closer to the MRCA.Therefore, the comparison of both approaches can give us information about the relevance of these ancient pine niches in our results.We found a great correlation in the predictions of range loss and range change obtained with and without the linear scaling (ρ ≥ 0.996; p < 2.2e-16; Figure S1 Appendix S16).This suggests that our approach is not strongly influenced by very ancient pine niches, given that a reduction of their importance did not change the results.Therefore, the phylogenetic correction is possibly targeting recent pine niches that are more likely to be retained in current fundamental niches and hence could be useful in their response to climate change.While no species facing a significant reduction in its range can be anticipated to overcome this threat based solely on unexplored space of its fundamental niche, this can represent a relevant buffer for certain taxa.

| Predicted variation in pine richness
The patterns influencing pine richness varied across continents and biomes (sensu Olson et al., 2001).Pine species around the Mediterranean Basin are projected to undergo remarkable reductions of suitable habitats, and consequently, species richness is predicted to decrease in this area.Conversely, mountain and temperate forest habitats across Central and Northern Europe are expected to undergo moderate suitability reductions for pines and the number of species (i.e., pine richness) could even increase due to northward shift of pines currently restricted to the South of the region (Figure 3; Appendix S14).Similarly, projections for the Eurasian taiga suggest reduced habitat suitability at the southern edge of the range of Siberian pines, but also the possibility of increased suitability at the north, which could potentially lead to an increase in pine richness at higher latitudes.A substantial reduction in suitable area is predicted for eastern Asian pines in (sub)tropical forests.In contrast, models suggest increases of suitable areas throughout the Tibetan Plateau for pines inhabiting around the Himalayan range.
The projections under climate change for North American pines of temperate forests suggest reductions of suitable habitats.In contrast, models project that conditions by 2070 might facilitate increases in pine richness at higher latitudes for some eastern and western pines, suggesting the possibility of expansion through the Canadian Shield and the Western Coast of Canada, respectively.

At lower latitudes of temperate zones, mountain pines around the
Chihuahuan Desert are projected to experience marked losses in suitable area.Conversely, our models projected lower reductions and even increases of suitable areas for pines of (sub)tropical forests in the South of Mexico and Central America (Figure 3; Appendix S14).
Although general qualitative patterns were not affected by the phylogenetic correction (Appendix S17), this approach had quantitative influence in several regions (Figure 4).For example, it increased the probability of range shift or expansion for some European and Asian pines like P. nigra, P. mugo, P. cembra or P. squamata, which could further contribute to an increase in pine richness in northern latitudes.Similarly, our phylogenetic correction indicated that pines of temperate forests in the southeastern USA such us P. glabra F I G U R E 1 Predicted range loss and change with and without phylogenetic correction.Percentages of range loss and change for all pine species are shown in a right-closed histogram.Each bar shows the number of species with a projected percentage of reduction in habitat suitability (i.e., range loss) and change (i.e., range change) within the corresponding interval range.In both cases, data results are summarized as a proportion of the total suitable area under current conditions.For example, 10% of range loss means that 10% of current suitable area is predicted to be lost for the number of species indicated in the ordinate axis, whilst 10% of range change indicates that the suitable area is predicted to increase 10% relative to its current size for the corresponding number of species.Results with and without considering the phylogenetic correction are shown with bars of different colours (light and dark grey, respectively).and P. elliottii that were predicted to suffer remarkable reductions of habitat suitability could be less negatively affected by climate change.In the same region, P. strobus is predicted to increase its suitability, especially if phylogenetic information is taken into account.
This resulted in higher pine richness in this region compared to the predictions of regular SDMs (Figure 4).

| Independent validation of the models
The independent validation showed that, overall, our models tend to have a good performance when predicting suitability beyond the natural ranges and the environmental conditions considered during training and evaluation.From the 14 species analysed, 13 showed a performance significantly better compared to the random expectation for at least one model (Boyce index significantly above 0; Table S1 Appendix S19).Further, 10 of these species showed a relatively high performance, with a Boyce index significantly above 0.5 for at least one model.Therefore, our SDMs tended to predict naturalized occurrences relatively well for most of the species analysed in the independent validation.Despite the fact that the models without phylogenetic information already exhibited a good performance, the application of the phylogenetic correction significantly improved performance in 5 out 14 species (Table S1 Appendix S19).For 4 of these species (e.g., P. nigra or P. sylvestris), performance slightly improved with increases of the Boyce index ranging from 0.0005 to 0.021 (values for the difference between phylo-and non-phylo Boyce index; note this metric F I G U R E 2 Ensembled projections of suitability by 2070 for P. clausa (upper plots; Southeast North America) and P. morrisonicola (lower plots; Taiwan).In all cases, maps show predictions inside the 12.5° buffer used for estimating range loss and change (areas outside that buffer are shown in black).Colours indicate certainty in suitability projections: White and dark green indicate full agreement, i.e., 0% and 100% of suitability, respectively.Intermediate values (25%-75%), i.e., lighter green, indicate variable suitability predictions across modelling choices.Dotted lines delimit the areas currently occupied by each species.Right panels show suitability predictions after the phylogenetic correction.Note that this correction increased the suitability prediction for certain areas (e.g., North of the Mississippi delta for P. clausa and northern Hainan for P. morrisonicola).See text for details.ranges from −1 to 1).Only one case showed a significant decrease of performance, GAM for P. sylvestris, but the model still exhibited a high performance (>0.9).In the case of P. strobus, performance improved more sharply.For this species, the Boyce index of GAM increased from −0.238 to −0.133, while GLM increased from −0.444 to −0.168.Despite the sharp increases, models for this species still showed a worse performance compared to the random expectation.
The limited, although significant, impact of the phylogenetic correction on model performance could be caused by the conservative approach used, as we limited its application to areas where SDMs showed uncertainty (see Methods).Therefore, we performed an initial exploration of a more liberal implementation of the phylogenetic correction by applying it across the whole study area, independently of SDM uncertainty (see Appendix S19 for further details).Using this approach, the Boyce index for P. sylvestris showed more marked increases ranging from 0.007 to 0.058 (values for the difference between phylo-and non-phylo Boyce index).Note that GAM showed a slight decrease of performance for this species that was completely reverted using the liberal approach.Finally, P. strobus showed dramatic increases of performance ranging from 0.53 to 0.791 (difference between phylo and non-phylo models).As previously stated, the Boyce index ranges from −1 to 1. Therefore, F I G U R E 3 Global patterns of predicted pine richness.Top: pine richness according to SDMs under current conditions.Bottom: differences in predicted pine richness between current and future conditions, resulting from changes in habitat suitability that could ultimately lead to variation in the number of species in each geographic unit (see text for details).
this kind of improvement made poor-performing models (i.e., Boyce index below 0) to perform above the random expectation and even reach a high performance in one case (>0.5).See Appendix S19 for more detailed results of the independent validation.

| DISCUSS ION
We present a global forecast for the distribution of Pinus in the face of climate change.Our projections were established using a novel SDM approach that considers multiple layers of biological and environmental data.We applied a new approach to incorporate evolutionary information into SDMs requiring only presence data and a phylogeny of the group under study.With this pipeline, we predict reductions of suitable areas for 58% of pine species by 2070.The combination of SDMs and evolutionary information modified our predictions about the impact of climate change on pine diversity in some regions, reducing the rate of suitable habitat loss in several cases.These results provide a quantitative insight on pine response to climate change at a global scale in the near future.
According to our models, habitat suitability for the genus Pinus at global scale might be compromised.Range loss is predicted for many species, although these patterns differ among regions and taxa.
Suitability is projected to decrease drastically in areas expected to encounter marked increases of aridity, such as the Mediterranean Basin and southern North America.This might have a negative effect on pine richness in these areas.Conversely, the opposite trend can be important in areas that are expected to undergo increases of temperature coupled with higher humidity.For example, Boreal regions of Eurasia and America, the Himalayan range, and (sub) tropical regions of America.The projected changes might ultimately lead to range shifts and even ecotone displacements, like the southward movement of the ecotone between taiga and tundra in the Taymyr peninsula (northern Russia).These projections are congruent with events of forest mortality and forecasts of global climate and vegetation patterns (Allen et al., 2015;Christensen et al., 2007;Gonzalez et al., 2010;Thuiller et al., 2005;Venevskaia et al., 2013).
Note, however, that suitability does not necessarily correlate with occupation.Whether a species is present in a given location will be ultimately determined by a combination of environmental conditions, dispersal limitations and other factors.
Phylogenetic analyses suggested the existence of climate-niche conservatism in Pinus spp., illustrated by the strong phylogenetic signal and the adequacy of Brownian motion models of niche evolution (Pearman et al., 2008;Wiens et al., 2010).Niche conservatism has been described in a variety of organisms (Wiens et al., 2010;Wiens & Graham, 2005), although it is unclear if it can be assumed to be a general pattern (Pearman et al., 2008).In the particular case of pines, some studies have found support for niche conservatism at the genus level (Jin et al., 2021;Perret et al., 2019).However, other studies have found some evidence of niche divergence, but only when considering infraspecific differentiation or within reduced groups of related pines (Moreno-Letelier et al., 2013;Ortiz-Medrano et al., 2016;Rehfeldt et al., 1999).This sort of discrepancy is not surprising, given that the phylogenetic signal of the niche varies depending on the biological diversity and habitat scale considered (Cavender-Bares et al., 2006;Holt, 2009).In our case, we found instances in which taxa sharing a common ancestor diverged widely ecologically (e.g., boreal P. banksiana and subtropical P. clausa).
Therefore, transgression of the ancestral niche is certainly possible, but the pattern across Pinus remains one of niche conservatism.
It is often posited that niche conservatism might limit the response of species to rapid environmental changes like global warming (Wiens et al., 2010;Wiens & Graham, 2005).However, our results suggest that for several Pinus species, retaining ancestral climatic niches could be beneficial in the face of climate change.For instance, SDMs predict a decrease of pine richness in southern Europe, where summer aridity is already a limiting factor and summer temperature is expected to increase (Boisvenue & Running, 2006;Giorgi & Lionello, 2008).However, this regional decrease in diversity might be F I G U R E 4 Increase of pine richness after applying the phylogenetic correction.This map shows the difference in predicted pine richness under future conditions when the phylogenetic correction is applied.Accounting for the recent evolution of the climatic niche increases the pine richness predicted for some regions (see Figure 3 and text for details).
offset at a continental scale by an increase in species richness in central and northern Europe, where low temperatures now represent a limiting factor for pines, as rising temperatures might enable more species to occupy this area (Boisvenue & Running, 2006;Christensen et al., 2007).The phylogenetic correction provides even more support for high suitability at northern latitudes under climate change.
Species that retain an ancestral niche characterized by cooler conditions might have an advantage at higher latitudes.If they migrate northward, niche conservatism might enable these species to cope successfully with still-cold winter temperatures at higher latitudes and to escape the aridification of lower latitudes.Indeed, the comparisons between native and naturalized niches conducted by Perret et al. (2019) show that pine species tend to occupy colder and wetter habitats in non-native locations, suggesting that pines might be able to cope with colder conditions than those experienced in their extant ranges.Similarly, SDMs predict an increase of pine richness around the Himalayan range, a scenario rendered more likely by the phylogenetic correction.This suggests that the expected increase of temperature in the Tibetan Plateau could facilitate migration, but that this could be more significant for species with colder ancestral niches because they might cope successfully with still-cold habitats (Boisvenue & Running, 2006;Christensen et al., 2007).The phylogenetic correction also increased suitability for pines in the southeastern USA.SDM projections without the phylogenetic correction suggest remarkable range losses for some of these species due to the expected increase of temperature and decrease of precipitation in that region (Christensen et al., 2007).However, several species seem to descend from ancestors that occupied drier environments, thus they might retain part of this ancestral niche and be able to cope with higher aridity than that present in their current range.These results suggest that niche conservatism may not be always detrimental in the face of environmental change.
Although our niche calculations are based on a series of assumptions, it is important to note that the approach used for including evolutionary information into SDMs was conservative overall.First, we limited the range of reconstructed ancestral values used.We only considered the ancestral values between the current and the MRCA state for the phylogenetic range.In other words, we did not account for conditions not encountered in recent evolutionary time that, although less likely, might still be included within the fundamental niche of current species.In addition, we only considered regions that were inside the phylogenetic range of both climatic variables simultaneously, therefore excluding situations in which only one of the two variables fell within the ancestral niche (i.e., of partial overlap).Finally, we only applied the correction in regions showing high uncertainty, leaving out regions with high consistency (certainty) across models and climate scenarios.In other words, we made a deliberate effort to reduce the influence of phylogenetic uncertainty in robust SDM projections.
Admittedly, this approach could have reduced the inferential power of the phylogenetic correction.Therefore, the lack of influence of the phylogenetic correction for multiple species cannot be regarded as support for a non-positive effect (be it deleterious or null) of niche conservatism.Some species with high certainty in their predictions could still see their response to climate change positively influenced by recent evolutionary trends, but our approach is underpowered to detect it.Even so, this approach seems to lead to quite different predictions for specific groups of pines, suggesting that it might provide useful information for species responses to environmental change under certain circumstances.
The consideration of climatic conditions only within the natural range of a species can be seen as a limitation for the development of SDMs.It neglects the possibility that the species can cope with conditions beyond those present in its distribution (Booth, 1991(Booth, , 2023)).Consequently, we assessed the ability of our models to predict suitability beyond natural distributions, using an independent dataset of invasive pine species that includes occurrences across the globe (Perret et al., 2019).We found a relatively good performance of our models for predicting new, naturalized occurrences in most of the species analysed.We could not perform a validation of the whole Pinus genus due to limited data availability (only 14 were included; 12.5% of the genus), but given we were able to obtain reasonable predictions for most of the species considered, it is plausible that our models could be useful to project suitability for many other pine species.Moreover, this independent validation found support (at least partially) for some of the predictions of the phylogenetic correction.This is the case for the increased probability of range expansion in northern and central Europe mediated by the phylogenetic correction.For example, one of the species having this pattern (P.nigra) showed a significant improvement of model performance due to the phylogenetic correction in the validation.Other European species with the same pattern of expansion, P. mugo, showed a tendency of phylogenetic-mediated improvement of performance, but without reaching significance (Table S1 Appendix S19).In the same vein, P. elliottii, a species for which the phylogenetic correction greatly improved the predicted range loss in southeastern USA, showed a tendency of improvement with the correction in the validation analysis, but without reaching significance (Table S1 Appendix S19).One of the cases with the largest increase of future suitability mediated by the phylogenetic correction, i.e., the expansion of P. strobus through southeastern USA, is strongly supported by the marked increases of performance caused by the correction.Although significant, the increases of performance with the phylogenetic information shown by the independent validation were limited for many species.This could be caused by the conservative approach followed to implement the phylogenetic correction (see above).Indeed, we explored an implementation not limited to areas where SDMs show uncertainty and found more marked and even dramatic improvements in performance.We did not further explore this liberal implementation due to the risk of overfitting as we should not use a validation dataset to tuning the models.In any case, these exploratory analyses provided an useful insight about the potential of this approach, pointing it out as a promising tool to improve SDMs (see Appendix S19 for further details).This validation supports that, overall, our models could provide reliable predictions of pine suitability even beyond the climatic conditions present in current ranges.Therefore, these models are potentially relevant to predict the response of pines to climate change