Predicting plant species distributions using climate‐based model ensembles with corresponding measures of congruence and uncertainty

The increasing availability of regional and global climate data presents an opportunity to build better ecological models; however, it is not always clear which climate dataset is most appropriate. The aim of this study was to better understand the impacts that alternative climate datasets have on the modelled distribution of plant species, and to develop systematic approaches to enhancing their use in species distribution models (SDMs).


| INTRODUC TI ON
Species distribution models (SDMs) are among the most widely used tools in ecology. These models link the presence, abundance or vital rates (e.g. growth rate and fecundity) of species to environmental conditions. The conceptual and practical issues surrounding SDMs have been explored from a wide variety of perspectives; however, the underlying uncertainty in climatic predictors is often overlooked (Baker et al., 2016;Baker et al., 2017;Bedia et al., 2013;Jiménez-Valverde et al., 2021;Stoklosa et al., 2015;Suggitt et al., 2017).
Recent decades have seen a rapid increase in the availability of highresolution spatial datasets, often derived from remotely sensed data that characterize environmental conditions at broad scales. While many of these datasets provide new opportunities to directly estimate environmental gradients such as vegetation cover and vertical structure (e.g. Fedrigo et al., 2018;Fedrigo et al., 2019) there have also been ongoing efforts to enhance climate surfaces (e.g. Fick & Hijmans, 2017;Karger et al., 2017;Stewart et al., 2017;Stewart & Nitschke, 2017e) which are critical inputs for a wide range of ecological models.
The development of new methods for representing climatic conditions and weather at fine scales has often been driven by the need to better understand vegetation distributions and ecological systems (Ashcroft & Gollan, 2012;Booth et al., 2014;Hijmans et al., 2005;Karger et al., 2017;Kearney et al., 2014). Advances in computing and spline interpolation techniques (Hutchinson & Bischof, 1983;Wahba & Wendelberger, 1980) were crucial to the development of BIOCLIM in the late 1980s and early 1990s (Booth et al., 2014) and continue to provide effective methods for generating climate surfaces using weather station observations (Fick & Hijmans, 2017;Hijmans et al., 2005;Hutchinson et al., 2009;Jeffrey et al., 2001;Stewart et al., 2017;Stewart & Nitschke, 2017e). The WorldClim datasets (Fick & Hijmans, 2017;Hijmans et al., 2005) represent global climate and bioclimatic indices (see Booth et al., 2014Booth et al., ) between 1950Booth et al., and 2000 at high resolution (approx. 1 km) and have recently been updated to utilize additional covariates and remote sensing variables.
Several studies have investigated alternative climate datasets representing current conditions and their impact on SDMs. These analyses have been conducted at a range of spatial resolutions (50 m to 50 km), with various taxonomic groups, study regions, modelling algorithms and datasets, with and without consideration of climate change scenarios. Climate datasets that better characterize regional temperature and precipitation have been shown to improve predictions relative to WorldClim when modelling the distributions of European beech (Fagus sylvatica) in Northern Iberia (Bedia et al., 2013), plant species in Switzerland (Karger et al., 2017) and Sweden (Meineri & Hylander, 2017), and several bird and fern species in the humid montane forests of Bolivia (Soria-Auza et al., 2010). Bobrowski et al. (2021) found that CHELSA was more effective than WorldClim in predicting the distribution of Himalayan Birch (Betula utilis) in Nepal. In their study of 48 woody plant species in Iberia, Jiménez-Valverde et al. (2021) found that WorldClim and the regional Iberian Climate Atlas produced differences in modelled predictions, statistical performance and variable importance. Distribution models for 451 plant species calibrated using six combinations of global climate data across central Africa, western India and the Amazon basin were found to perform best when land surface temperature or precipitation derived from earth observation data were independently used as model covariates (Deblauwe et al., 2016). Baker et al. (2016) compared six alternative global climate datasets in their study of over 800 African bird species and found that the choice of climate data used in the baseline was at least as important, but could be much more important, as the choice of General Circulation Model (GCM) used for evaluating potential climate change. Lembrechts et al. (2019) showed that distribution models for forbs, graminoids and dwarf shrubs in Scandinavia performed significantly better using insitu soil temperature in comparison to free-air or land-surface temperature. Each of these studies highlight the sensitivity of ecological models to climate dataset selection and demonstrate the need to develop robust approaches to quantifying uncertainty.
Part of the challenge is that there are many potential compounding sources of uncertainty (e.g. input data, algorithm selection and parameterization) that can affect both SDMs and climate surfaces, and few SDM studies have directly considered how uncertainty in climate variables influence prediction. Stoklosa et al. (2015) showed how the statistical performance of SDMs can improve when demonstrate that climate dataset selection can be a comparably significant source of uncertainty.

K E Y W O R D S
Cantelli's inequality, climate, congruence, ensemble, species distribution modelling, uncertainty accounting for the underlying uncertainty in climate variables. While effective, their approach may not be practical in many cases as it requires that uncertainty in climate variables is both well characterized and made available. Furthermore, the ways in which uncertainty in climate surfaces can be represented is dependent upon factors such as algorithm selection, model calibration and observation density.
Morales-Barbero and Vega-Álvarez (2019) proposed a method for characterizing uncertainty in the spatial predictions of species distribution models by fitting multiple global climate datasets in conjunction with a statistical measure of their congruence. Their approach highlights differences between the alternative climate datasets; however, it does not consider the relative importance of climate response variables.
A more general approach to constraining prediction error which drives the success of some machine learning algorithms (e.g. random forests; Breiman, 2001) and which has been commonly applied in climatology (Knutti et al., 2010;Krishnamurti et al., 1999) and SDMs (Kindt, 2018;Marmion et al., 2009;Thuiller et al., 2009), is to calculate the ensemble mean of competing models.
Ensemble predictions will perform best when individual models have low bias and covariance, but high variance (Dormann et al., 2018); however, the potential for signal loss from specific predictions will in some cases favour the use of single well-tuned model over an ensemble (Crimmins et al., 2013;Hao et al., 2020;Knutti et al., 2010). SDM ensembles, commonly developed using alternative algorithms, can be evaluated in several ways (e.g. mean, variability, consensus above or below a threshold, as implemented in the R package biomod2; Thuiller et al., 2020); however, results from studies using these methods are not always presented with sufficient supporting Information to evaluate the benefits of ensemble techniques over individual models (Hao et al., 2019).
Describing ensemble model performance using non-parametric methods such as cross-validation can provide a reliable mechanism for evaluating uncertainty (Dormann et al., 2018), but spatially continuous measures of ensemble behaviour are likely to be particularly useful in the case of climate dataset evaluation, as embedded biases in model covariates may not always be detected in statistical performance metrics (Bedia et al., 2013).
The objectives of this research were to assess the impact of alternative climate datasets on modelled species distributions by comparing the statistical performance and spatial distribution of predictions both individually and as ensembles. It is important to understand how alternative climate baselines influence predictions for two primary reasons. Firstly, there are an increasing number of datasets becoming available and it may not always be clear which dataset is most suitable for a given application. Secondly, ecological models rely heavily on climate variables, and when they are projected into new space such as under climate change scenarios, they adopt the biases present in the selected baseline conditions. Several datasets developed specifically for southeast Australia (Stewart & Nitschke, 2017e) and Bhutan (Stewart et al., 2017; Appendix S1) provide an opportunity to evaluate the influence that alternative climate interpolation techniques have on SDMs and how this compares to global products. These datasets were generated using identical station observations and widely adopted spline-based approaches (full dependence on latitude, longitude and elevation), with additional models having partial dependence on topographic indices to characterize cold air pooling and drainage regimes, and remotely sensed land surface temperature (LST) data that can capture mesoscale temperature gradients. We also included both WorldClim datasets (including the updated version that uses coastal proximity and LST to support the interpolation procedure) and CHELSA, which have been widely used and are well suited to ecological modelling. We first directly compare the statistical performance of models fitted using alternative climate datasets. Our analyses then consider how ensembles compare against individual models. We propose and demonstrate two non-parametric, spatially continuous measures of model congruence and uncertainty which complement and aid the interpretation of ensemble predictions.

| Study area and species data
Plant species distributions were modelled across two study regions; (1) Victoria (140.9°-150.0°E, 34.0°-39.2°S), southeast Australia and Bhutan is a land-locked country located on the southern slopes of the eastern Himalayas. Strong gradients in elevation mean that climatic conditions in Bhutan are highly variable, with elevation varying by over 7 km (100-7500 m asl) across a distance of only 170 km.
All species data used in this study were presence and absence records obtained from systematic field studies within forested and woodland ecosystems. Species records were collated from multiple field campaigns to maximize the pool of available data points, including 1804 sites in Victoria (see Stewart, Elith, et al., 2021) and 349 sites in Bhutan (see Choden et al., 2021). The most prevalent canopy and understory plant species in each study region (38 in Victoria and 12 in Bhutan) were selected for modelling. We focused on longer-lived, woody species, as climate plays a more important role in their turnover than edaphic properties, which vary over finer scales and play a larger role in the distribution of non-woody species with shallower root systems (Kasel et al., 2017). For Victoria, we also included two prevalent tree fern species (Dicksonia antarctica, Cyathea australis) that are associated with rainforest ecosystems and two prevalent grass-tree species (Xanthorrhoea australis and Xanthorrhoea minor) with distinctive climatic niches. Forbs, grasses and small shrubs not capable of reaching 3 m in height were not considered due to the importance of microclimate and small-scale variability in near-surface temperature for these growth forms (Lembrechts et al., 2019). The sampling domain represented a broad range of climatic and topographic gradients across each study region.

| Climate variables
Spatial climate variables were generated using maximum and minimum air temperature (1.2-2 m above ground), and precipitation data from each of the datasets included in the analysis (Table 1). We focused on temperature and precipitation as they were the most commonly available variables across data sources. Both monthly mean (MM) and time-series (TS) data were considered where available.
Summaries of temporal variability (e.g. standard deviation, the observed frequency of extremes) add another dimension to the climatic niche and can improve the predictive performance of SDMs (Karger et al., 2021;Perez-Navarro et al., 2021;Stewart, Elith, et al., 2021;Zimmermann et al., 2009), therefore the TS data were used to provide an additional contrast against the MM products. Three alternative temperature datasets and one precipitation dataset developed specifically for ecological modelling in Victoria (see Fedrigo et al., 2019;Stewart, Elith, et al., 2021;Stewart & Nitschke, 2017e) were included. These were developed at high spatial (approx. 250 m) and temporal (daily, Jan 1981 to Dec 2019) resolution to capture variability in topographically complex landscapes, permitting the calculation of derived variables summarizing the time-series in ecologically relevant ways (e.g. inter-annual distribution of extremes). An analogous set of climate data products were developed for Bhutan to use in this study (see Appendix S1 for methods, statistical performance and comparison against other contemporary products), building on previous interpolations for the region using the local weather station network (Stewart et al., 2017). All interpolations were performed using ANSPLIN 4.4 (Hutchinson & Xu, 2013). We included the two WorldClim datasets (Fick & Hijmans, 2017;Hijmans et al., 2005) given their widespread use in species distribution modelling and global coverage, noting that they only represent long-term mean conditions. CHELSA data representing the period between 1980 and 2018, developed by statistical downscaling the ERA5 reanalysis (Hersbach et al., 2020), was included as an alternative global product capable of representing variability over time. We initially considered the Australian national SILO (Jeffrey et al., 2001) and AWAP (Jones et al., 2009;Raupach et al., 2012) datasets for Victoria; however, they were both excluded because they are comparatively coarse in spatial grain (0.05°) and therefore unlikely to capture fine-scale temperature gradients that occur across the topographically variable regions where many of our study species are found. No comparable fine-grain analogues were available for Bhutan (but see Appendix S1 for a comparison of the regional products with CHELSA and CRU TS 4.05; Harris et al., 2020). Except for CHELSA, each of the datasets selected use thin plate smoothing splines to interpolate climate records, with various combinations of latitude, longitude, elevation, topographic indices, coastal proximity and thermal remote sensing data as either spline variables or independent linear covariates (Table 1).
Each dataset was used to generate a series of climate predictor variables for species distribution modelling ( Figure 1). Monthly precipitation was converted to a quarterly (3 months) rolling sum to characterize cumulative seasonal moisture availability. Annual mean minimum temperature, maximum temperature and quarterly precipitation were calculated for each MM and TS dataset. Maximum temperature of the hottest month (BIO5), minimum temperature of the coldest month (BIO6) and precipitation of the driest quarter (BIO17) were calculated as corresponding measures of seasonality to characterize common climatic stressors that plant communities are exposed to in a typical year. Seasonality variables were calculated annually for the TS datasets. The median was selected as an analogue to the MM seasonality variables, and 1-in-15-year values (i.e. 93.3 rd or 6.6 th percentile) were also included to represent rarer, but more extreme stressors that may influence plant demography.
This recurrence period was selected as it had performed best overall in a previous study of climate extremes in Victoria (Stewart, Elith, et al., 2021). Each of the seasonality and extreme variables were expressed relative to corresponding annual mean and seasonality variables, respectively, to minimize the effects of collinearity. The distribution of climate variables across each presence and absence site and each source dataset are illustrated in Appendix S2.

| Species distribution modelling and performance
Species distribution models were fitted in R with the package biomod2 (Thuiller et al., 2020) using each of the climate variables described above and seven different algorithms: artificial neural networks (ANN), classification tree analysis (CTA), flexible discriminant analysis (FDA), generalized additive models (GAM), gradient boosting models (GBM), multivariate adaptive regression splines (MARS) and random forests (RF). We applied the default parameters for all models, in line with typical usage of biomod2 (Hao et al., 2019). Model performance was evaluated using the AUC, True Skill Statistic, Kappa, and overall accuracy. For brevity, we have focused on discrimination ability with AUC and provided additional calibration and classification statistics in the supplementary materials (Appendix S3). Each were multiplied by 100 (valid range 0-100) to improve readability of the results. A spatially blocked k-fold (approx. 50 × 50 km blocks, 6-10 folds) cross-validation design was used to support conservative estimates of error (Roberts et al., 2017). Spatial blocks were selected using the blockCV R package (Valavi et al., 2019), and each set of folds were held constant across all models for each species to ensure robust comparisons of performance.
Model ensembles were computed as the mean probability of SDMs fitted using alternative combinations of algorithms and climate datasets. Each selected model was assigned an equal weight in the ensemble to avoid the introduction of unknown biases (Claeskens et al., 2016). Three separate sets of SDM ensembles were first calculated: (1) ensembles averaging over each available algorithm, (2) ensembles averaging over each available MM climate dataset and (3) ensembles averaging over each available algorithm and MM climate dataset. These were compared to evaluate the changes in statistical performance that could be achieved using climate-based ensembles as opposed to multiple algorithms, the latter which has been acknowledged as one of the greatest sources of uncertainty in SDM (Thuiller et al., 2019). All subsequent analyses were conducted using an ensemble of all algorithms to control for modelling technique when making comparisons of climate datasets, the key focus of this study. Pairwise comparisons of model performance were calculated for all combinations of SDMs using alternative climate datasets using two-tailed t-tests. Individual model performance was also evaluated against ensemble means of cross-validated model predictions calculated for alternative groups of models (e.g. regional datasets, global datasets, MM and TS variants). Ensemble means were compared against individual models using two-sided pairwise t-tests based on the hypothesis that they would provide an improvement in model performance.

| Model congruence and variability
Multi-model ensembles can contain a large amount of information that is lost when calculating a mean value. Statistical approaches to quantifying potential variability were used in developing two new metrics, proposed to describe contributing model predictions and support the interpretation of ensembles. These metrics were designed using a non-standard application of the threshold that optimally classifies predictions into binary categories (i.e. suitable or unsuitable in SDM).
Predictions are typically only classified into binary surfaces in certain situations due to the loss of information that occurs and therefore these thresholds are not always useful. The proposed metrics instead rely upon the distance from this optimal threshold to characterize contributing model behaviour. The statistical properties of this distance may be used to describe agreement of individual predictions around the threshold value and therefore indicate where the ensemble may be most sensitive to disparate predictions. Both variables are derived using the one-sided Chebyshev-Cantelli inequality (Cantelli, 1911), which is distribution free (e.g. no requirement for normality).
Continuous, pixel-level measures of model congruence and uncertainty were derived for each species as a function of the probabilities predicted by each model in the ensemble.
The optimal classification threshold T was first obtained by calculating the mean threshold that minimized the difference between sensitivity (true positive rate) and specificity (true negative rate) using spatially blocked cross-validated predictions from each of the contributing models. This approach to identifying the optimal threshold was selected based on the recommendations of Jiménez-Valverde and Lobo (2007)  Note: All datasets except for CHELSA are generated using weather station observations interpolated using thin plate spline models. Variables within ƒ() are independent spline variables, followed by linear covariates for partial spline models. When directly compared against one another, there were no significant differences in performance between SDM ensembles compiled using alternative MM climate variables or algorithms.
Highly significant (p < 0.01) differences in AUC were only found for 14 comparisons in Victoria and none in Bhutan (Table 2) when evaluating models fitted with distinct climate datasets against one another. Pairwise comparisons of models fitted with MM climate datasets showed fewer differences. VECD performed significantly better than either WC1 or CHELSA in Victoria and was also a significant improvement upon VTCD and VTCD + M for MM models.
The mean magnitudes of difference in AUC were typically higher in Bhutan; however, they were not significant due to variability in species-specific performance. Models fitted with TS variables also frequently achieved significantly higher AUC than MM models in Victoria. Each of the regionally calibrated datasets (VECD, VTCD and VTCD + M) performed significantly better than CHELSA for TS models.
Ensembles were formed using different combinations of models ( or 1 (suitable), and highlights areas of potential disagreement where the ensemble mean falls near the optimal classification threshold.
The TAI acts like a binary classification but retains numerically continuous information on the level of congruence between competing models (Figures 4c, 5c, 6a, c). The TSD provides a measure of uncertainty which is most sensitive to regions close to the optimal classification threshold (Figures 4d, 5d, 6b, d). The standard deviation is penalized as the STS increases, and therefore acts to filter out regions with the highest level of congruence in modelled predictions.
(1) can be achieved empirically (e.g. Karger et al., 2017;Kearney et al., 2020) or with the support of remote sensing data (Fick & Hijmans, 2017;Stewart & Nitschke, 2017e), the performance characteristics of these datasets does not consistently correspond to better SDMs. This is not to say that better climate data products are not worth pursuit, but that a direct correspondence with improved SDM performance cannot be assumed. The variability of spatial predictions when using alternative contemporary climate datasets that has been previously reported in SDM studies (Baker et al., 2016;Jiménez-Valverde et al., 2021) is consistent with our finding that none of the candidate datasets consistently performed best. In the context of this study, ensembles provide a mechanism to trade-off the inherent uncertainties of different data sources while characterizing the congruence and variability of contributing model predictions.
The improvement in statistical performance achieved with climate-based SDM ensembles was comparable with what was achieved by using different algorithms. This finding is significant, as algorithm selection is known to be a large source of uncertainty for SDMs (Araújo & New, 2007;Elith et al., 2006;Thuiller et al., 2019). Furthermore, the combination of alternative climate datasets and algorithms was significantly better than either approach alone. This suggests that additive gains can be achieved by viewing spatial surfaces (e.g. digital elevation models, land surface temperature and vegetation cover) and any post-processing that is applied (e.g. microclimate modelling). This means that identifying the 'best' dataset can be difficult, and the important differences between products may not be clear when evaluating SDM performance at species sites. These differences can be further compounded when applying models outside of their calibration domain (e.g. climate change scenarios), yet may not be detectable without considering alternative baselines (see Bedia et al., 2013). Such challenges cannot be addressed by algorithm-based ensembles (Thuiller et al., 2020) or instances where a single, well-tuned model is preferred (Hao et al., 2020;Valavi et al., 2021), as they do not consider the underlying uncertainty in spatial covariates. A key advantage of ensemble modelling is that elements of predictive uncertainty can be quantified spatially, highlighting specific areas where results converge or diverge.
We expected regional datasets which were calibrated with the support of MODIS LST (VTCD + M and BTCD + M) to considerably outperform those that only used topography (VECD; VTCD, BECD and BTCD) for interpolation. There was no significant difference in either study region, despite up to 6%-39% and 16%-23% reductions in root mean squared error of cross-validated temperature interpolations across Victoria and Bhutan, respectively, when supported with the use of MODIS LST (Stewart et al., 2017;Stewart & Nitschke, 2017e). This is consistent with Lembrechts et al. (2019), who found that there were no significant differences in plant SDM performance when modelled using air temperature variables from different sources, though used different datasets and were focused predominantly on grass and forb species that responded best to soil temperature. There are several potential reasons for our results, TA B L E 2 Mean pairwise differences in the cross-validated area under the receiver operating characteristic (AUC) across plant species distribution models fitted using discrete monthly mean (MM) or time-series (TS) climate datasets and its sensitivity to soil moisture, limited number of species assessed in Bhutan, available extent of regional data products, and the exclusion of other important environmental gradients that are also be important for defining the niche but can be difficult to accurately quantify without in-situ measurements (e.g. edaphic properties; Bennett et al., 2020).  Zimmermann et al., 2009). Importantly, this improvement was also found when using contrasting methodologies to produce temporal variation. The local interpolations use an anomaly plus climatology technique to generate time-series data (Stewart, Elith, et al., 2021;Stewart & Nitschke, 2017e), whereas CHELSA uses statistical downscaling techniques applied to 3-hourly ERA5 reanalysis data.
Although the monthly mean surfaces are likely less prone to error accumulation as they are often interpolated directly, the ability to characterize interannual variability can improve predictive performance while allowing for more flexibility in the design and selection of climate predictors. Even though the time-series models often performed better than the monthly mean alternatives, further improvements were achieved using ensembles.

F I G U R E 4
Spatial distribution of the ensemble mean (a), ensemble standard deviation (b), threshold agreement index (TAI; c) and threshold-scaled standard deviation (TSD; d) from distribution models fitted for Eucalyptus obliqua using 10 alternative sets of climate variables across Victoria, southeast Australia The proposed TAI and TSD metrics characterize congruence and variability across any number of predictions, supporting the ensemble mean by ensuring that information describing the contributing models is retained. The TAI is computed directly from modelled probabilities rather than the input climate surfaces, meaning that it embeds species-specific responses to variables and can be applied to machine learning models that can fit non-linear trends and variable interactions. The TAI could also be applied in a wide range of analyses where potential niche overlap is of interest, such as the evaluation of climate change scenarios. When congruence is low or the ensemble prediction is close to the optimal classification threshold (i.e. TAI ≅ 0 ), the standard deviation of the predictions is assigned the highest weight for calculating the TSD. This means that the TSD describes regions that have the highest level of predictive uncertainty relative to the optimal classification threshold. The TSD can therefore be indicative of a poorly performing model or strong disagreement among models. This may indicate that additional sampling is required to better characterize the species-response, or that a subset of the models are producing erroneous predictions and should be considered for  & New, 2007;Marmion et al., 2009), or assuming equal weights  to mitigate the potential for introducing unknown biases (Claeskens et al., 2016). The TAI indicates the level of agreement in niche suitability across models, relative to the ensemble threshold. Conversely, the TSD will indicate areas where the largest material differences (i.e. high variability close to the optimal classification threshold) in niche suitability are predicted.
7. Evaluate variability across model predictions using the TSD.
This step will help to identify areas that are indicative of uncertainty due to divergent model predictions and may require additional sampling to improve model calibration. Where F I G U R E 6 Ensemble mean probability relative to the threshold agreement index (TAI) and threshold-scaled standard deviation (TSD) for Eucalyptus obliqua (Victoria, southeast Australia) and Abies densa (Bhutan). Dotted red lines correspond to the threshold which minimizes the cross-validated difference between sensitivity and specificity additional samples can be obtained and included in these regions, return to step 4.
8. Calculate the ensemble mean of model predictions, retaining both the TAI and TSD to support the interpretation of the ensemble.

| CON CLUS ION
The results of this study demonstrate that ensemble modelling is an advantageous strategy when faced with uncertainty in the selection of competing climate datasets for predicting species distributions.
The ongoing development of new and better techniques for generating, downscaling, and modifying climate surfaces, and need to evaluate the response of ecosystems to projected climate change, means that these uncertainties are likely to become even greater over time.
The procedure described can minimize the risk of unknowingly se-

ACK N OWLED G EM ENTS
We would like to thank the National Center for Hydrology and Meteorology (NCHM), Ministry of Economic Affairs, Bhutan for providing weather station observations that were essential in developing the climate surfaces used in this study. We thank Jane Elith from the University of Melbourne, Chris Ware and Sam Andrew from the CSIRO, and each of the anonymous reviewers for providing valuable and constructive feedback.

CO N FLI C T O F I NTE R E S T
The authors have no relevant financial or non-financial interests to disclose.

PEER R E V I E W
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/ddi.13515.

DATA AVA I L A B I L I T Y S TAT E M E N T
The datasets developed as part of this study are available on the CSIRO Data Access Portal (https://data.csiro.au). See Stewart et al. (2021a) for the climate variables, calibration records, modelled species distributions, ensembles, metrics and code used to calculate the TAI and TSD. See Stewart et al. (2021b) for the monthly and monthly mean climate datasets interpolated for Bhutan.