Ecological and biological indicators of the accuracy of species distribution models: lessons from European bryophytes

The predictive power of species distribution models (SDMs) varies substantially among species depending on their ecological and life-history traits, but which of these traits are the most relevant and how they influence species ‘predictability’ remains an area of debate. Here, we address these questions in bryophytes. SDMs employing macroclimatic, topographic and edaphic predictors were calibrated for 411 species in Europe and externally evaluated using an independent dataset. Regression models were implemented to determine whether species characteristics, including life-history traits, ecological preference and niche breadth, determine the accuracy of SDMs. Variation in SDM accuracy among species was significantly explained by species characteristics, supporting the hypothesis that the strength of species–environment correlations is affected by characteristics of the species themselves. The percent variance of SDM accuracy explained by species traits, however, substantially varied between 9 and 57% depending on the evaluation metrics used. The lower correlation observed between species traits and MaxKappa and the Boyce index than with area under the curve (AUC) and MaxTSS suggests that the former are less suitable than the latter for determining species ‘predictability’ based on their traits. SDM accuracy decreased from species restricted to pristine habitats to species thriving in eutrophic habitats with high levels of human disturbance. The widespread distribution of man-made habitats in fact opens the door for the spread of now ubiquitous species, even in environments that would primarily not be suitable for them. Such species, likely to occur anywhere, reach very high to full occupancy rates, thereby decreasing the accuracy of models aiming at predicting their distributions. The fact that AUC and MaxTSS were higher for species from pristine habitats is important in a conservation context, as ubiquitous species from eutrophic, disturbed environments are precisely the ones of lower conservation relevance


Introduction
Species distribution models (SDMs) have been intensively used for assessing species niches, projecting them onto past, present or future environmental conditions and making predictions of suitable environments for species in space and time , Araújo et al. 2019). Although SDMs have had many successful applications in ecology, evolutionary biology and conservation biology, their accuracy and transferability vary depending on species characteristics , Guisan et al. 2007, Poyry et al. 2008, Hanspach et al. 2010, Yates et al. 2018, McCune et al. 2020, Tessarolo et al. 2021. Several ecological traits are known to affect SDM results, including body size (França andCabral 2016, Zamorano et al. 2019), life span (Hanspach et al. 2010, McCune et al. 2020, growth rate and successional status (Guisan et al. 2007), habitat specialization (Guisan and Hofer 2003, Marshall et al. 2015, Regos et al. 2019 or dispersal ability (McCune et al. 2020). Whether such traits positively or negatively impact model accuracy remains, however, an area of debate. For instance, conflicting assessments on the role of niche breadth and dispersal capacity on SDM accuracy have been reported (Guisan and Hofer 2003, Seoane et al. 2005, Hernandez et al. 2006, Guisan et al. 2007, McPherson and Jetz 2007, Poyry et al. 2008, Newbold et al. 2009, Marshall et al. 2015, Connor et al. 2018, Matutini et al. 2021, Tessarolo et al. 2021. Although good dispersers have the ability to successfully fill their niche, leading to a low proportion of false positives, and hence high model specificity, poor SDM performance was reported in species with higher mobility, broader niche breadth and wide distribution ranges (Hortal et al. 2008, Newbold et al. 2009, Tessarolo et al. 2014, Guo et al. 2015. Altogether, our ability to accurately capture species niches depends on combinations of traits, which may correlate positively or negatively with model accuracy. For example, a species might be a good disperser filling its niche, leading potentially to a low proportion of false positives, and hence to high model specificity; but it might also be a generalist, whose niche may be more difficult to characterize (McCune et al. 2020).
Investigating how trait variation across species impacts our capacity to effectively model species distributions involves a reliable assessment of SDM accuracy, raising three issues.
First, although the performance of different evaluation metrics to measure SDM accuracy has been assessed (Hirzel et al. 2006, Norberg et al. 2019, Jimenez and Soberón 2020, the impact of the choice of a given statistic on the correlation with species traits has been less investigated. McCune et al. (2020) and Tessarolo et al. (2021) reported a substantial variation of the correlation coefficient of different SDM evaluation metrics and species traits, as well as large differences in the best combinations of species traits used to predict SDM accuracy. How to interpret such differences, and which SDM evaluation metrics can be recommended when searching for the best combination of species traits to predict SDM accuracy remains, however, an open question. Second, the most common approach of model evaluation involves the partitioning of the data, calibrating the model with part of it and evaluating it against the remainder (Fielding and Bell 1997). A problem with data-partitioning approaches is that if the same bias in the species data is present in all partitions, then the model may be biased and the estimate of model accuracy inflated (Chatfield 1995, Newbold et al. 2010, Petitpierre et al. 2017, Lou et al. 2019, Mothes et al. 2019). Based on a review of studies implementing SDM evaluation against independent datapoints, Lee-Yaw et al. (2022) concluded that cross-validated area under the curve (AUC) is an unreliable indicator of model performance, and confirmed previous recommendations about external model validation with independent data (James et al. 2013.
Third, while all the ecologically meaningful variables should theoretically be integrated in SDMs (Austin and Van Niels 2011), what is known about the ecological requirements of species is, in practice, not necessarily what is used for modelling (Mod et al. 2016, Scherrer and. This bias in variable selection leads to the omission of crucial information that could impact SDM predictions. In fact, most SDM studies use only bioclimatic predictors (Araújo and Peterson 2012). A review of SDMs in plants at a scale up to 1 km 2 showed that approximately one-third of the studies included only climatic variables, a trend that increases at coarser resolutions and over recent time (Mod et al. 2016). Thus, although many studies have stressed the importance of environmental coverage (i.e. covering all dimensions of species niches) to obtain accurate models of species distributions (Thuiller et al. 2004, Hortal et al. 2008, Lobo et al. 2010, its effects on SDM performance have seldom been evaluated (Chauvier et al. 2021a, De Castro Oliveira et al. 2021, Tessarolo et al. 2021.
The main goal of the present study was to assess whether the accuracy of SDMs varies depending on species ecological and biological traits, taking European bryophytes as a case study. We first assessed the accuracy of SDMs calibrated from available online bryophyte distribution data in Europe and climatic predictors using an independent test set. We further determined whether SDM accuracy can be improved using additional, easily derived environmental predictors. Finally, we assessed whether species characteristics, including life-history traits, ecological preference and niche breadth, play a significant role in the variation of SDM accuracy, and whether the choice of different SDM evaluation metrics affects this relationship.

Species distribution data and environmental predictors
Species selection was based on the independent test set used to evaluate the models calibrated at the European scale. This test set consists in an atlas of distribution of 662 bryophyte species in southern Belgium (Sotiaux and Vanderpoorten 2015, updated in www.biogeonet.ulg.ac.be). This test set was selected because it was assembled from a systematic survey of all of the 1182 4 × 4 km grid-squares in southern Belgium. Each grid-square was visited at least twice between 1980 and 2014, each time during a full day, by the same team of recorders. Thus, although further floristic work would (and does) yield new records, the systematic survey, on which this atlas is based, provides a reliable framework to record both species presences and absences. The study was based on Jimenez-Valverde (2020), who recommended that at least 15 occurrence points are needed for model evaluation from presence-absence data based on AUC, and so we only retained species with a total of > 15 occurrences in the independent test set, decreasing the number of species to 438.
For each of these 438 species, distribution data in Europe (excluding southern Belgium) were automatically downloaded from GBIF.org (https://doi.org/10.15468/dl.7tf2jy; https:// doi.org/10.15468/dl.texnhk; https://doi.org/10.15468/ dl.2q5ea4) via the 'rgbif ' package (Chamberlain et al. 2021) in R ver. 4.0.2 (www.r-project.org). Because of the low accuracy that often characterizes old records, only coordinates recorded from 1980 onwards were kept. We selected 1980 as a limit because data from the test set in southern Belgium started to accumulate in 1980 and because that date corresponds to the beginning of 'present time' in climatic maps. Data with a location accuracy > 1 km were also filtered-out. To avoid pseudo-replication, species presences were mapped on a 30-arc-second-resolution grid, and a single occurrence was kept per 30-arc-second pixel using the 'dismo' package (Hijmans et al. 2021). Finally, only species with ≥ 100 occurrences for model calibration, as recommended in Fernandes et al. (2019), were kept. This resulted in a final dataset of 411 species, with the number of occurrences for model calibration ranging between 100 and 40 332.
We downloaded 19 bioclimatic variables, as well as relative humidity, solar radiation, evapotranspiration (whose annual maximum, minimum, mean and range were computed using the 'raster' package (Hijmans et al. 2022)) and 13 bioclim+ variables from CHELSA ver. 1.2 (Karger et al. 2017(Karger et al. , 2018 at 30-arc-second resolution. We considered another 30 environmental variables based on previous analyses of the factors driving bryophyte species distributions at regional scales, emphasizing the primary role of topography (Vanderpoorten et al. 2005, Åström et al. 2007, Hespanhol et al. 2010 and references therein), soil pH and nitrogen content (Bates 2000). Fifteen topographic variables, which were derived from 250-m resolution digital elevation maps of the GMTED2010 (Danielson and Gesch 2011), were downloaded at 30-arc-second resolution from Amatulli et al. (2018). pH and nitrogen content of the upper soil layer (0-5 cm) were downloaded from SoilGrids.org at a resolution of ~ 250 m and averaged over 30-arc-second pixels with the 'raster' package (Hijmans et al. 2022). Thirteen other soil variables were also downloaded at 1 km resolution from the European Soil Data Centre (ESDAC, esdac.jrc.ec.europa. eu) (Panagos et al. 2012, Hiederer 2013a. See Supporting information for all preselected variables. To select the variables used for modelling, and to avoid multicollinearity, we assessed the correlation among each pair of variables. We randomly selected 250 000 background points across Europe (extent: xmin = −25°, xmax = 45°, ymin: 32°, ymax = 84°; in the projection system EPSG: 4326), extracted environmental values at those locations, and computed Pearson's correlation coefficients among all variables. To avoid multicollinearity, we kept only one variable in a pair with a correlation coefficient > 0.65 (a little more stringent than the value of 0.7 recommended by Dormann et al. 2013), leaving five climatic, two topographic and four soil variables (Table 1). We focused on variables expected to play a major role in bryophyte species distributions, based on our knowledge of their ecophysiology. For example, slope and northness were selected, as bryophytes tend to exhibit higher species richness on north-facing slopes owing to their poikilohydric condition (Åström et al. 2007). Although a specific variable selection could have been performed for each species individually (Austin and Van Niel 2011), the same set of predictors was employed here to enable the comparison of model performance among species.

Species distribution models
SDMs were calibrated for each species using two different sets of predictors: climate-only, which contains the five selected climatic variables (hereafter called 'climate-only models'); and all types of variables, including the 11 selected variables (hereafter 'all-variable models'; Table 1). They were generated from the occurrence data at the European scale in order to capture  Amatulli et al. (2018) the entire species niche and avoid issues of niche truncation encountered in local models (Chevalier et al. 2021, 2022, Scherrer et al. 2021. We randomly sampled, for each species, 10 000 background points across Europe via the 'sp' package Bivand 2005, Bivand et al. 2013). We generated SDMs with the R package 'biomod2' (www.r-project. org, Thuiller et al. 2021) using four algorithms: generalised linear models (GLM; Nelder and Wedderburn 1972), generalized additive models (GAM; Hastie and Tibshirani 1987), gradient boosting machine (GBM; Friedman 2001) and MaxEnt (Phillips et al. 2022) using the default parameters. Prevalence in the models was set to 0.5, so that background and occurrence points were equally weighted. To evaluate the models internally, ten replicates were run using 70% of the data to calibrate the models and 30% to evaluate them using the AUC, specificity, sensitivity, maximum of the true skill statistic (MaxTSS), maximum of kappa (MaxKappa) and the Boyce index . We computed ensemble models (Araújo and New 2007), wherein all individual models with a MaxTSS ≤ 0 were removed, and wherein the other models contributed proportionally to their MaxTSS.
To evaluate the models externally, we first assessed whether the training and test sets experienced analogous climates using a multivariate analysis based on Mahalanobis distances (Mesgaran et al. 2014) with the function ecospat.climan from the 'ecospat' package (Broennimann et al. 2022). This method estimates the similarity between the European (reference area) and southern Belgian (projected area) domains by considering the deviation from the mean and the correlation between environmental variables, giving for each pixel at 30-arc-second resolution in southern Belgium a value between ± infinite. A value ranging from 0 to 1 corresponds to an analogous climate, while values below 0 correspond to novel conditions at the univariate level, and above 1 to novel conditions at the multivariate level (Mesgaran et al. 2014). The analysis revealed totally analogous conditions between Europe and southern Belgium, with pixel values ranging from 0.01 to 0.28 (Fig. 1). The ensemble models were then projected in southern Belgium at 30-arc-second resolution.
To determine whether a species is likely to occur within a 4 × 4 km pixel, we took the maximum suitability value across its 30-arc-second constitutive pixels. A weighted mean of the suitability values across its constituting 30-arc-second pixels was also applied, with weights corresponding to the fraction of each 30-arc-second pixel inside the 4 × 4 km pixel. Specificity and sensitivity were computed using the threshold that maximises the TSS in the southern Belgian dataset using the 'ecospat' package (Broennimann et al. 2022). AUC was computed via the 'dismo' package (Hijmans et al. 2021) and the Boyce index, MaxTSS and MaxKappa via the 'ecospat' package (Broennimann et al. 2022).

Variation in model accuracy depending on species traits
A series of ecological traits describing ecological preferences, niche breadth and life-history traits, derived from van Zuijlen et al. (2023), were tested to determine whether they could help in interpreting variation in model accuracy among species. Ecological preference variables included Ellenberg's values for light (L = 1 deep shade to 9 full light), moisture (F = 1 extreme dryness to 9 wet-site indicator), pH (R = 1 extreme acidity to 9 high pH) and nitrogen (N = 1 extremely oligotrophic to 9 extremely eutrophic). Although species ecological preferences may vary from one area to another due to, for instance, ecotypic differentiation or competition, Ellenberg's indicator values have been generally proven to be robust (Diekmann 2003, Carpenter and Goodenough 2014, Descombes et al. 2020. They remain widely used to characterize the habitat requirements of bryophyte species (Pakeman et al. 2022) which, in contrast to the vast majority of seed plants, do not tend to develop ecotypes but rather display an inherent broad ability to cope with environmental variation (reviewed by Patiño et al. 2014), and wherein the role of competition has been questioned (Rydin 2009). We also added an index of 'hemeroby' (Hill et al. 2002), which characterizes whether species occur in pristine habitats or in habitats with high levels of human disturbance (nine classes ranging between human impact absent to very strong).
Life-history traits included average shoot length, sexual condition (monoicous, dioicous, both), sporophyte frequency (four classes from absent in Europe to frequent), mean spore diameter, capacity of producing gemmae (absent and present); capacity of producing rhizoidal tubers (absent and present) and life strategy. The latter describes the extent to which species are characterized by traits involved in efficient dispersal capacities versus traits promoting long persistence and local adaptation, from species characterized by early reproduction, short life expectancy and high dispersal capacities to late reproduction, long life expectancy and poor dispersal capacities. Ten life-history categories were considered: 1) annual shuttle; 2) colonist; 3) ephemeral colonist; 4) pioneer colonist; 5) fugitive; 6) long-lived shuttle; 7) perennial; 8) competitive perennial; 9) stress-tolerant perennial; and 10) short-lived shuttle (During 1992). We also included the factor 'taxon' (moss versus liverwort) to determine whether moss-and liverwort-specific traits could impact model accuracy.
To determine whether such traits determine the accuracy of SDMs, linear models were performed for each evaluation metric using all the above predictors and testing polynomial effects for all numerical variables. Variable selection was implemented using the lasso procedure (alpha = 1; Tibshirani 1996, Tibshirani et al. 2012 for each model to select the best structure via the 'glmnet' package (Friedman et al. 2010). To measure the importance of each of the selected variables in the model, each variable was successively randomized 100 times, generating 100 shuffled datasets with biomod2 (Thuiller et al. 2021). Evaluation metrics were computed for each shuffled dataset and correlated with the evaluation metrics obtained from the initial model. The importance of a variable in a model was computed as the average of 1 minus the correlation coefficient across the 100 replicates. To make comparisons among variables of a same model, importance values were rescaled between 0 and 1, dividing the importance values by the maximal value.

Results
Based on the internal evaluation of the dataset used to calibrate the models (European data), climate-only ensemble models exhibited high median AUC, MaxKappa, MaxTSS, Boyce index, sensitivity and specificity across species of 0.98, 0.80, 0.87, 0.98, 0.95 and 0.93, respectively (Fig. 2). The same trends were observed with individual models (Supporting information). These statistics dropped significantly and substantially (median AUC, MaxKappa, MaxTSS, Boyce index, sensitivity and specificity across species of 0.62, 0.11, 0.27, 0.38, 0.79 and 0.60, respectively) when the models were evaluated externally with the independent test set (southern Belgium), taking the maximum habitat suitability value of the 30-arc-second pixels that compose the 4 × 4 km pixels used for floristic recording in southern Belgium (Fig. 2). While all-variable models performed similarly to climate-only models when they were internally evaluated, the performances of all-variable models were significantly better using external evaluation, with median AUC, MaxKappa, MaxTSS, Boyce index, sensitivity and specificity of the ensemble models across species reaching 0.71, 0.18, 0.37 and 0.75, 0.83 and 0.65, respectively (Fig. 2, see Supporting information for the accuracy of individual models). Similar results were obtained when the suitability of the 4 × 4 km pixels was derived from a weighted mean of suitability values across their constitutive Figure 1. Environmental analogy between the area used to calibrate (Europe, in black) and to evaluate (southern Belgium, in red) species distribution models in bryophytes. Bio1: Annual mean temperature. Bio7: Annual range of temperature. Bio12: Annual amount of precipitation. RH: Mean relative humidity of the driest month. Nitrogen: Soil nitrogen content at 0-5 cm depth. pH: Soil pH at 0-5 cm depth. srad: Annual range of solar radiation. Gravel: Coarse fragment content in the topsoil (0-30 cm). TAWC: Total available water content from pedotransfer rules in the topsoil (0-30 cm).
The 'predictability' of SDM accuracy as a function of species traits was globally higher for all-variable models than for climate-only models, and substantially varied among accuracy metrics. Thus, adjusted R 2 values were 0.56, 0.56, 0.43, 0.23, 0.00 and 0.28 for all-variable models, but dropped to 0.41, 0.42, 0.46, 0.20, 0.00, and 0.23 for climate-only models for MaxTSS, AUC, MaxKappa, Boyce index, sensitivity, and specificity, respectively (Supporting information). The slope of the selected species traits and their importance in the models are presented in Fig. 3 (see Supporting information for values). In climate-only models, only the pH index was selected for all the accuracy metrics. The N index, and the index of 'hemeroby', which characterizes whether species occur in pristine habitats or in habitats with high levels of human disturbance, were also included in the case of MaxTSS.
In all-variable models, the predictors of the variation in MaxTSS, AUC, MaxKappa, Boyce index, specificity and sensitivity, and their contribution to the model, substantially varied among the six metrics used to assess SDM accuracy. The pH index was the trait that had the highest importance for all accuracy metrics (except for sensitivity, which was explained by no trait), and had a negative slope. The second-best predictor, also with a negative slope, was the index of 'hemeroby'. It was selected in models for AUC, MaxTSS, MaxKappa and the Boyce index, but was more important in models for the two former than the two latter metrics. MaxKappa was the metric correlated with the highest number of species traits, but each trait individually (except the pH index) had a relatively low importance in the model. MaxKappa was also the only metric correlated with the number of occurrences used to calibrate the model, with a slope initially increasing with the number of occurrences but decreasing when prevalence was higher than 0.68. Traits associated with dispersal capacities were selected for only two accuracy metrics, namely Figure 2. Comparison of the accuracy (AUC, Boyce index, MaxKappa, MaxTSS, specificity and sensitivity) of species distribution models using only climatic predictors (Only climate) and using all types of predictors (All variables) for 411 bryophyte species calibrated from GBIF data in Europe internally evaluated using the same data after partitioning (70% for calibration and 30% for evaluation) and externally evaluated using an independent test set (southern Belgium), taking the maximum habitat suitability value of the 30-arc-second pixels inside the 4 km pixel. AUC, area under the curve. tuber production for specificity and sexual condition for MaxKappa, with a low importance in the models in both cases. Life strategy was also selected only for MaxKappa, but with a low importance in that model. Traits associated with niche breadth were only selected for one accuracy metric, namely the number of substrates for MaxKappa, but with a very low importance.

Internal versus external model validation
A substantial drop in predictive power was observed when validating the models using external instead of internal test data (from a median of 0.98 to 0.62, 0.80 to 0.11, 0.87 to 0.27, 0.98 to 0.38, 0.95 to 0.79 and 0.93 to 0.60 for AUC, MaxKappa, MaxTSS, Boyce index, sensitivity and specificity, respectively). This drop is consistent with recent studies raising concerns about the use of internal model cross-validation (Yates et al. 2018, Lee-Yaw et al. 2022 and references therein) and is even globally higher than the average drop of 22 and 13% between internally and externally validated models, respectively, reported by Newbold et al. (2010) and computed from McCune (2016). In a review of studies using external model validation, 64% of studies reported AUC > 0.70 for at least 75% of species considered after external validation (Lee-Yaw et al. 2022).
External model validation using independent data provides a less biased assessment of model accuracy , except if the link between species and the environment changes across space (Roberts et al. 2017, Santini et al. 2021, which can happen if the independent evaluation dataset comes from a region with a non-analogous environment McGill 2013, Petitpierre et al. 2017). In the present study, the test set was geographically deeply nested within the training set, and a multivariate environmental similarity surface analysis (Elith et al. 2010) confirmed the complete analogy between the environments of the two sets.
We suggest that the extent of the drop of model accuracy between internal and external validation depends on the presence of a bias in the data (Yates et al. 2018). Studies Figure 3. Importance of species traits selected in the models predicting the accuracy (AUC, Boyce index, MaxKappa, MaxTSS, sensitivity and specificity) of bryophyte species distribution models in southern Belgium using macroclimatic predictors only ('Only climate') and using macroclimatic, soil and topographic predictors ('All variables'). Colour gradient ranges from black, the most important variable, to light yellow, the least important variable. '+', '−' and '±' in the boxes represent positive, negative and hump-shaped relationships, respectively, between traits and accuracy metrics. Tubers: ability to produce rhizoidal tubers; Gem: ability to produce gemmae; Taxon: Liverworts versus mosses; N substrate: number of substrates occupied; Spore diameter: average spore diameter; Size: average shoot length; Sex: sexual condition; NOcc: Number of occurrences; Life strategy: species life strategy; L, F, R, N: Ellenberg index for light, humidity, pH and nitrogen; Hemeroby: level of human disturbance; N habitat: number of habitats occupied; AUC, area under the curve.
involving data evenly sampled for both model training and evaluation within the same area reported robust evaluation metrics (West et al. 2016) or even no significant difference between internal and external validation (Lou et al. 2019). Nevertheless, substantial drops in model accuracy, comparable to those reported here, were reported when models are calibrated from data sampled from databases and evaluated using other independent data , Beck et al. 2014. In fact, all distributional databases are spatially biased due to uneven effort of sampling, data storage and mobilization (Graham et al. 2004, Meyer et al. 2015, Anderson et al. 2016. Such bias is particularly pronounced in GBIF, where nationwide differences in funding and data sharing lead to huge differences in contribution to GBIF (Brown et al. 2020, Boyd et al. 2021, Zizka et al. 2021. Methods, such as spatial or environmental disaggregation of the data by subsampling Guisan 2006, Chauvier et al. 2021b) as implemented here, or the use of a target-group background sampling that targets the probability of background environmental samples to regions that have actually been wellsampled (Phillips et al. 2009, Barber et al. 2022, have been developed to address the issue of biased sampling. In the presence of a strong bias, however, these methods do not prevent misleading internal model evaluation as compared to external validation, so that large differences in model accuracy occur when models are evaluated internally and externally , Beck et al. 2014. In this case, a robust external validation by spatially independent samples is required (Santini et al. 2021).

Contribution of non-climatic abiotic predictors to species distribution models
The use of additional variables of soil and topographic conditions complementing climatic predictors allowed for an increase of median model accuracy of 15, 64, 37 and 97% for AUC, MaxKappa, MaxTSS and Boyce index, respectively. Bryophyte species distributions are in fact primarily driven at the landscape scale by gradients of pH reflecting major geological variations of bedrock types (Bates 2000). Northness also appears as a key factor in poikilohydric organisms like bryophytes, whose preference for constantly humid habitats explains their preference for northern exposures; while topography, which reflects habitat heterogeneity, is among the main drivers of species richness (Vanderpoorten et al. 2005). Inclusion of such factors therefore increased model accuracy, as suggested in previous studies (Lassueur et al. 2006). However, the increase in accuracy reported here is substantially larger than the marginal increase of Kappa from 0.74 to 0.75 reported by De Castro Oliveira et al. (2021) and of AUC and MaxTSS from 0.82 to 0.85 and 0.55 to 0.58, respectively, by Hageer et al. (2017) after the inclusion of soil data in vascular plant studies at a similar resolution (about 5 km) to the present study. The accuracy increase reported here is more similar to that reported at much finer resolution (25 m), for example a MaxTSS increase ranging from 45 to 56% among vascular plant species, thanks to the addition of soil water-holding capacity (Cianfrani et al. 2019) or a median MaxKappa increase of 11% after the addition of a combination of soil features (Buri et al. 2020).
Recent studies evidenced the crucial role of additional variables to climatic data for modelling bryophyte species distribution and richness at a similar resolution to the present study. For instance, in models of bryophyte species richness, vegetation (as derived from remote sensing) and climatic variables appeared among the most important predictors, with a similar contribution (Cerrejón et al. 2020). Climatic variation had a two-to threefold lower contribution than land-use variables in models predicting the distribution of two invasive moss species (Dyderski et al. 2022). Altogether, these results emphasize the crucial role that variables other than climatic ones may play in bryophyte species distribution models at regional scales. In the context of increasing availability of environmental databases (Tóth et al. 2017, Amatulli et al. 2018, Poggio et al. 2021, we therefore suggest, together with Mod et al. (2016) and Scherrer et al. (2019), that a choice of environmental predictors beyond commonly used climatic variables can be easily tested and likely result in substantial improvements in model accuracy.

Variation of the accuracy of species distribution models as a function of species traits
SDM accuracy varied substantially among species. In all-variable models, AUC, MaxTSS, MaxKappa, Boyce index, sensitivity and specificity ranged between 0.30-0.98, 0.00-0.74, 0.00-0.94, −1.00-1.00, 0.00-1.00 and 0.00-1.00, respectively. This variation was significantly explained by species traits, supporting the hypothesis that the strength of speciesenvironment correlations is affected by the characteristics of the species themselves (Guisan et al. 2007, McCune et al. 2020. In this context, we interpret the lack of relationship between model accuracy and rare bryophyte species traits previously reported (Cerrejón et al. 2022) in terms of the low number of occurrences available to accurately assess model performance as well as the small number of traits scored. In particular, key traits such as Ellenberg values for species ecological preferences, which were among the best predictors of model accuracy in the present study (below), are currently available for European species only and should be developed elsewhere.
However, and as McCune et al. (2020) and Tessarolo et al. (2021) already noticed, the correlation between SDM accuracy and species traits, and the combinations of traits involved, vary substantially depending on the evaluation metrics used. Higher explained variance between SDM accuracy and species traits were obtained with AUC and MaxTSS than with MaxKappa, followed by the Boyce index, specificity and sensitivity.
The utmost importance of the number of occurrences used for SDM calibration in the models for MaxKappa reported here is in line with the high dependency of this statistic on prevalence (Allouche et al. 2006). In agreement with Meynard and Quinn (2007) and Sor et al. (2017), but in contrast with Syfert et al. (2013) and van Proosdij et al. (2016), MaxKappa increased with the number of occurrences used to calibrate models, but decreased when prevalence was higher than 0.68. Although variation in MaxKappa was correlated with many traits, this correlation was substantially lower than with AUC and MaxTSS, suggesting that the former is less suitable than AUC and MaxTSS for determining species 'predictability' based on their traits.
The Boyce index, which was specifically designed for presence-only data (Hirzel et al. 2006), also returned a lower correlation with variation in species traits than AUC and MaxTSS. We tentatively interpret this in terms of the information carried by absence data and their correlation with species traits. Our observations thus suggest that establishing the relationship between species 'predictability' and traits requires presence-absence data, undermining the utility of the Boyce index in this context and pointing to the use of statistics taking advantage of true absence data, such as AUC and MaxTSS.
Species ecological preferences, and in particular the pH and 'hemeroby' indices, as well as -to a lesser extent -the N index, were the variables contributing most to the model linking SDM accuracy and species traits. SDM accuracy decreased from species characteristic for acidic and oligotrophic substrates to species preferring basic and eutrophic substrates, pointing at first sight to the role of edaphic specialization for species 'predictability'. This would contrast with the findings of McCune et al. (2020), who found no impact of edaphic specialization on SDM accuracy in spermatophytes. In fact, in a relatively densely populated area such as southern Belgium, exposed to fast-increasing eutrophication due to atmospheric deposition of nitrogen and direct inputs for agriculture, the distribution of species of acidic and oligotrophic habitats tends to be more restricted than that of a more widespread species thriving in eutrophic environments such as Brachythecium rutabulum. Similarly, the widespread distribution of manmade, basic habitats (e.g. concrete) opens the door for the spread of now ubiquitous calcicolous species such as Tortula muralis, even in environments that would primarily not be suitable for them. Such species, likely to occur anywhere, reach very high to full occupancy rates, thereby decreasing the accuracy of models aiming at predicting their distributions. This is why, in line with previous reports for species with wide distribution ranges (Hortal et al. 2008, Newbold et al. 2009, Tessarolo et al. 2014, Guo et al. 2015, SDM accuracy decreased from species restricted to pristine habitats to species thriving in habitats with high levels of human disturbance. The fact that AUC and MaxTSS values were higher for species from pristine habitats is important in a conservation context, as ubiquitous species from eutrophic, disturbed environments are precisely those of lower conservation relevance. Dispersal traits were selected as predictors of MaxKappa only. The results suggested that MaxKappa was higher in monoicous than in dioicous species, and in species with a short life-span and a high investment in sexual reproduction than in long-lived, often sterile species, thus pointing to better SDMs in species with high dispersal capacities. This matches predictions, confirmed in spermatophytes (McEuen andCurran 2004, McCune et al. 2020), according to which species with high dispersal capacities should be able to quickly colonize suitable habitats. In this way they exhibit distributions that more closely match environmental conditions, and decrease the commission rates, as compared to species with more limited dispersal capacities that fail to occupy all potentially suitable habitats (McCune et al. 2020). Nevertheless, dispersal traits globally played a minimal role in explaining SDM accuracy in bryophytes. We suggest that two mechanisms may explain why this is the case. First, most bryophyte species intrinsically exhibit high dispersal capacities, enabling them to colonize habitats as soon as they become available. Within a few decades, newly available substrates such as slag heaps are eventually colonized by a range of bryophyte species, even those characterized by low sporophyte and gemmae production (Hutsemékers et al. 2008, but see Zanatta et al. 2020). Second, the idea that traits such as the sexual condition -which itself drives sporophyte frequency -actually impact species dispersal capacities, and hence distribution ranges, has been questioned (Laenen et al. 2016).
In conclusion, our results show that SDM accuracy can be significantly predicted from species traits in bryophytes. The percent variance of approximately 56% of SDM accuracy explained by species traits was, however, somewhat lower than that reported in previous studies (e.g. 58% in plants, Guisan et al. 2007, 69% in beetles, Tessarolo et al. 2021, 88% in spermatophytes, McCune et al. 2020). While such a difference may reflect the global characteristics of the group investigated, and in particular their dispersal capacity, they may also point to the need to acquire additional information on species traits in bryophytes. Bryophyte trait databases have increasingly become available (van Zuijlen et al. 2023 and references therein), but how bryophyte functional traits and trait relationships vary with environmental variation is not well known (Wang et al. 2022). Most traits used in bryophytes have been derived from empirically derived Ellenberg values for species ecological preferences (van Zuijlen et al. 2023). Revised values for N preference have been provided for Central Europe (Simmel et al. 2021), but a critical reassessment of all Ellenberg values based on quantitative data, fuelled by the increasing availability of environmental data across large scales (Panagos et al. 2012, Amatulli et al. 2018, Lembrechts et al. 2020, would be desirable. Furthermore, functional-trait relationships for bryophytes can differ between geographical locations (Wang et al. 2022), as shown, for example, by significant trait differences between island and continental populations (Patiño et al. 2013), calling for the development of regional databases extended to areas located beyond Europe.
Funding -FC was funded by a grant from the Swiss National Science Foundation (SNSF) awarded to AG and AV (grant no. 197777). AV was funded by the Fund for Scientific Research (F.R.S.-FNRS).

Transparent peer review
The peer review history for this article is available at https:// publons.com/publon/10.1111/ecog.06721.

Supporting information
The Supporting information associated with this article is available with the online version.