Novel methods to correct for observer and sampling bias in presence- only species distribution models

Aim: While species distribution models (SDMs) are standard tools to predict species distributions, they can suffer from observation and sampling biases, particularly presence-only SDMs, which often rely on species observations from non- standardized sampling efforts. To address this issue, sampling background points with a target- group strategy is commonly used, although more robust strategies and refinements could be implemented. Here, we exploited a dataset of plant species from the

Over the last two decades, SDMs have gained increasing popularity among researchers, with studies investigating and comparing SDM methods and their specific assets, as well as evaluating the reliability of their predictions compared to field data Elith et al., 2006;Guisan & Thuiller, 2005;Guisan et al., 2017).
Currently, SDMs based on presence-only data enjoy high popularity, due to the rapid growth of online databases of species observations from scientific, naturalist and citizen-science initiatives (Samy et al., 2013;Wüest et al., 2020). There are several types of presence-only SDMs, and one of the most well-known methods is Maxent (Phillips et al., 2006), a special case of point-process model (PPM; Warton & Shepherd, 2010). PPMs are common tools to model presence-only data in other fields (e.g. seismology, epidemiology, neurology and economics), and they have been recently introduced in ecology as a type of presence-only SDM. Being proportional to Maxent, but with many additional advantages (Renner et al., 2015;, this method is becoming one of the tools of choice for presence-only models. Presence-only SDMs are an appealing tool, but their implementation necessitates following good modelling practice (Araújo et al., 2019), with correcting for observer bias often being regarded as the most important part (Graham et al., 2004;Phillips et al., 2009).
Observer or sampling bias of species records results from excessive effort in distinct geographic regions of the study area; for example along roads, coasts, rivers, towards low elevations, near towns or biological field stations (Chauvier et al., 2021;Fithian et al., 2015;Graham et al., 2004;Reddy & Dávalos, 2003). Moreover, bias of observational datasets in geographic space generally results in bias in environmental space (Bystriakova et al., 2012;Graham et al., 2004;Phillips et al., 2009). This may lead to non-representative ecological preferences of species, and, in turn, distorted SDM fits and predictive outputs, particularly if a part of the species environmental niche is greatly over-sampled (Hastie & Fithian, 2013). However, various methods of bias correction have been suggested and combined in the literature. For example, spatial thinning of species observations according to the study area's resolution (Aiello-Lammens et al., 2015;Kiedrzyński et al., 2017;Steen et al., 2020), model-based corrections (Komori et al., 2020;Stolar & Nielsen, 2015), or combining species observations with large survey data Fletcher et al., 2016) have been recommended. Yet overall, sampling background data with the target-group strategy-that is, using similar sampling design/bias to sample background points as observed species presence points-remains currently the most popular approach (Botella et al., 2020;Hertzog et al., 2014;Kramer-Schadt et al., 2013;Phillips et al., 2009;Righetti et al., 2019). While this approach seems to improve model predictive performance, limitations regarding the number of background points to sample, and how to define target groups remain (Cerasoli et al., 2017;Warton et al., 2013), and might lead to lack of observer bias correction (Hanberry et al., 2012;Hertzog et al., 2014;Iturbide et al., 2015).
In response to these limitations, new corrective strategies based on bias covariate correction (BCC) have recently been implemented for presence-only models (Merow et al., 2016;Warton et al., 2013).
During the model calibration and on top of the environmental variables, one or several bias covariates are added to account for imperfect sampling. These covariates are then kept constant (mean or zero) when predictions are made. By this, the parameters estimated during the model inference are corrected for the observer bias.
Distance to/density of roads and cities (here called Classic-BCC; Bonnet-Lebrun et al., 2020;El-Gabbas & Dormann, 2018a, 2018bWarton et al., 2013) have been shown to be particularly useful bias covariates that can improve predictive outputs. These improvements, however, strongly depend on whether the chosen bias covariates manage to capture the geographic bias existing in the data.
Efficient bias covariate correction is therefore strongly dependent on preliminary spatial diagnostics to clearly uncover the cause of geographic bias (Albert et al., 2010;Amano & Sutherland, 2013;Warton et al., 2013). This need could be overcome if the selected bias covariates were to directly summarize the patterns of effort bias in the data, allowing for automated BCC implementation.
Here, as a first step, we attempt to integrate the whole density of observations in the study area (i.e. target-group observation density; Figure 1a) as bias covariate within PPMs, alone (referred to as Target-BCC), and in combination with Classic-BCC (referred to as Mixed-BCC).
As already mentioned, correcting observer bias in presenceonly models is increasingly done with BCC. Still, this approach does not address the potential environmental bias that may occur in the sampling design of observational datasets. Before data collection, appropriate sampling design should be environmentally stratified (Albert et al., 2010;Austin & Heyligers, 1989;Hirzel & Guisan, 2002;Mohler, 1983). Indeed, sampling frequencies in environmental space may still remain skewed if species observations are not initially sampled according to an environmental stratification. Moreover, BCC effectiveness is generally known to decay proportionally to the correlation between observer bias and environmental predictors . For example, if a bias covariate is strongly correlated with temperature, correcting predictions for the calibrated observer bias would also correct predictions for the calibrated temperature, thus skewing the species response. As a second step, we thus address this limitation within a novel corrective method based on random stratified sampling (referred to as environmental bias correction or EBC). Correcting sampling bias by filtering, before model calibrations, single-species observations based on environmental conditions has previously been addressed, with contrasting results (Fourcade et al., 2014;Varela et al., 2014). Our method extends such developments by correcting potential environmental bias by applying an artificial environmental stratification to the whole observational dataset. More specifically, environmental clusters that represent the main environmental conditions are generated across the study area. Within each cluster, the original observations of species are then concurrently subsampled without altering their ecological preferences. Depending on the research question or system studied, the type of random stratified sampling selected may be equal (sampling size equal across clusters) or proportional (sampling size proportionate to each cluster's area) across the study region (Boschetti et al., 2018;Hirzel & Guisan, 2002;Williams & Brown, 2019).
In this study, we tested the potential of the two above-mentioned EBC strategies as well as the Target-BCC approach to correct for observer and sampling bias in presence-only SDMs. More precisely, we investigated the effectiveness of 'EBC equal-stratified' (EBCe) and 'EBC proportional-stratified' (EBCp) strategies, along with different BCC strategies considering Classic-, Target-and Mixed-BCC, to model the distribution of 1,900 plant species across the European Alps. To provide a valuable comparative analysis, we evaluated the SDMs with both spatial block split-sampling (BSS) and an independent test dataset (Flora Alpina; Aeschimann et al., 2004). From the model projections, we generated diversity maps for each corrective strategy and compared them to a map of expert-based plant diversity of the European Alps.

| Study area
The study area covered the European Alps, as defined by an enlarged version of the official Alpine Convention perimeter (Permanent Secretariat of the Alpine Convention, 2009). The enlargement consisted of adding Switzerland entirely, as well as two French departments, Ain and Bouches-du-Rhône, for which we had extremely well-documented species observations. The size of the total study area is 294,994 km 2 , and includes a wide range of climatic and topographic conditions.

| Species observations
The observational dataset used in this study was compiled from more In total, the original observational dataset included 3,560 species ( Figure 1a). This set was further filtered according to the prevalence F I G U R E 1 Distribution of species observation densities across the extended European Alps. (a) shows the whole target group observation densities (i.e. Target group v1.0 layer). It includes 6,523,980 observations for 3,560 species. (b) shows observation densities used to model species distributions. It represents 5,369,442 observations for 1,900 species. Distribution of species densities was aggregated at 3-km resolution for better visual representation and log transformed of each species (or proportion of 1-km pixels occupied); that is, species occurring in fewer than 100 pixels across the study area were removed. In total, the refined observational dataset included 1,900 species (Figure 1b) for model calibration (see Supporting Information Appendix S1: Table S1 for information on data distribution ranges).
It is important to note that for species with >10,000 observations, we sampled randomly without replacement a subset of 10,000 observations for better computation efficiency (following Thuiller et al., 2018).
Additionally, an independent and unbiased test dataset, reporting the empirical distributional range of our 1,900 plant species over the European Alps, was constructed from expert-based information available in the Flora Alpina (FA) for a total of 54 political units (Aeschimann et al., 2004). For each species, we generated binary FA presence/absence rasters at a 1-km resolution by intersecting the polygons of its expert-based native political units with its expert-based elevation preference (see Supporting Information Appendix S1: Table S2, Figure S1 for more information). The 1,900 species binary layers were used for independent model evaluation, and to construct a map of expert-based plant diversity of the European Alps.

| Environmental data
We extracted all 19 bioclimatic variables from the Climatologies at High resolution for the Earth's Land Surface Areas (CHELSA) portal (Karger et al., 2017, http://chels a-clima te.org/), available for the time period 1979-2013. All predictors were kept at a 1-km spatial resolution and projected to the standard Lambert azimuthal equal area projection for Europe (EPSG:3035). A principal component analysis (PCA) was computed among all 19 variables to obtain PCA axes summarizing the main climatic patterns. We kept the first five PCA axes that cumulatively explained >90% of the total variance.

| Model calibration
We used point-process models (PPMs), where model output represents the intensity of the expected number of species occurrences per unit area, which is modelled as a log-linear function of the environmental covariates (Renner et al., 2015). PPMs were calibrated as a 'down-weighted Poisson regression' (DWPR, following Renner et al., 2015), using a generalized linear model (McCullagh, 1984) with second-order polynomials for all covariates (see Supporting Information Appendix S1: Text S1 for more details). Quadrature points (commonly referred to as background data) were here used to estimate the model log likelihood (see Supporting Information Text S1), and sampled randomly without replacement across the study area over a 1-km regular mesh. We estimated for each PPM the appropriate number of quadrature points by running 10 repeated series of DWPR while gradually increasing the number of randomly sampled points from 500 to 800,000, as explained in Renner et al. (2015) (in Figure 2b). The number of quadrature points at which the log likelihood converged was then automatically kept (see Supporting Information Appendix S1: Figure S2 and Text S1).

| Bias covariate correction (BCC)
In total, we generated six bias covariates approximating survey effort. For Classic-BCC, four covariates were based on roads and cities. Distance to roads (a) and to cities (b) were generated based on OpenStreetMap (https://www.opens treet map.org). All roads and cities of our study area were extracted from this source and converted into two binary 100-m grids. Distances to roads and cities were, respectively, calculated with the Geospatial Data Abstraction Library (GDAL; https://gdal.org/) and aggregated by sum to 1-km res-  Implementing BCC in PPMs allows for modelling species observation intensities both as a function of environmental predictors, and of an assumed observer bias Warton et al., 2013, see Supporting Information Appendix S1: Text S2 for formula). Once calibrated, all models were spatially projected, with bias covariates being set to a constant value of 0 for all cells to correct for the fitted observer bias (following Warton et al., 2013). It is important to note that all bias covariates were weakly correlated with all climate PCA predictors; that is, Pearson's |r| < .33 (see Supporting Information Appendix S1: Figure S4 for more details). Environmental effects were therefore hardly masked by observer-bias effects during model calibration .

| Environmental bias correction (EBC)
The model design described so far was repeated for all 1,900 species with and without an additional, novel corrective method based on random stratified sampling (EBC). Stratified sampling design of species observations may be equal or proportional (Boschetti et al., 2018;Hirzel & Guisan, 2002;Williams & Brown, 2019). We therefore accounted within our models for three different strategies:

| Model evaluation
We assessed the predictive performance of each PPM under fivefold spatial block split-sampling tests (BSS; Roberts et al., 2017).
This approach requires preliminarily delineating independent spatial blocks to partition observations in geographic space. Here we evenly partitioned each set of species observations and quadrature points into 10 blocks and combined them into five folds (two blocks per fold were combined so that differences in number of observations across folds were minimal). To this end, we first ran a partitioning around medoids clustering on the observation coordinates and, in a second step, k-nearest neighbour classification to assign quadrature points to observation blocks (see Brun et al., 2020 for more details). Model evaluation was also performed against the independent Flora Alpina dataset. For each PPM, the same number of Flora Alpina presences/absences as quadrature points were sampled, to ensure balanced repartitioning among folds. To make the F I G U R E 2 Model framework of the study. (a) shows the pipeline applied for our analysis in order to generate 684,000 projections for 1,900 species. For each species, 12 types of bias covariate correction (BCC) were applied, and each model was calibrated with three types of environmental bias correction (EBC); no (No-EBC), proportional-stratified (EBCp) and equal-stratified EBC (EBCe). Two types of model evaluation were performed: with fivefold spatial block split-sampling (5-fold Block SS) and the Flora Alpina independent dataset (FA). Each projection was converted to binary presences/absences using two different estimates of the maximum true skill statistic threshold (maxTSS); that is, for each evaluation. (b) describes the 12 BCC strategies used along five climate principal component analysis (PCA) axes to calibrate the point-process models (PPMs) two evaluation strategies comparable, the same block structure was applied to the Flora Alpina presences/absences (see Supporting Information Appendix S1: Figure S7 for spatial block split-sampling details). PPM performance was thus concurrently calculated using presences/quadrature points, and Flora Alpina presences/absences of the left-out fold, for spatial block split-sampling and Flora Alpina evaluations, respectively.
Model predictions were evaluated with the true skill statistic (TSS; Allouche et al., 2006). For each fold, projected observation intensities were converted to a binary presence/absence raster using two different estimates of the maximum TSS threshold (maxTSS); that is, one per evaluation type. For each species, EBC-BCC strategy, and evaluation, binary rasters of each fold were summed, and presences were assigned to pixels when at least half of the folds predicted so (Araújo & New, 2007;Guisan et al., 2017;Thuiller et al., 2009). In total, we generated and stacked 1,900 binary rasters of species presence per EBC-BCC strategy (n = 36) and evaluation type (n = 2), obtaining 72 maps of modelled species diversity over the European Alps that we could compare to an expert-based map.
Finally, we used nonparametric Friedman tests to detect differences in model predictive performances (Friedman X 2 ) between all BCC treatments tested in our study, when EBCp/e applies or not.

| RE SULTS
When No-EBC was considered, models including No-BCC showed higher TSS than models including Target-BCC for the split-sampling evaluation (Friedman X 2 = 546.1, p-value < .001, Figure 4b), but lower TSS for the independent Flora Alpina evaluation (Friedman X 2 = 4,313.4, p-value < .001, Figure 4e). Classic-BCC had overall the same performance as No-BCC models, while Mixed-BCC models showed similar performance to that of Target-BCC models. All results were confirmed with the Area Under the ROC Curve (AUC) and Boyce index tests (see Supporting Information Appendix S1: Figure S8, Table S3.

F I G U R E 3 Methodology framework of the environmental bias correction (EBC). (a)
summarizes with a 10-colour palette the 100 environmental clusters obtained using the clustering large applications (CLARA) method for better visualization. (b) defines the EBC methodologies that were applied in the study for 1,900 species. Each cluster j (matrix columns) summaries per species m (matrix rows) the original number of observations (yellow box). Equal-stratified EBC (EBCe) here multiplies and divides each cluster j, by the total number of observations across all species in cluster no. 31 and j, respectively (black terms in Equation 1). Proportional-stratified EBC (EBCp) additionally multiplies each cluster j by its area in pixels (added green term in Equation 1). Results indicate a new number of observations (EBCe or EBCp) per cluster and species (blue box), which are subsampled with replacement from the original observational dataset. (c) summarizes per cluster the observation densities in the different datasets, for uncorrected (yellow shade), EBCp (green shade) and EBCe (blue shade) corrections, relative to the cluster distribution across the whole study area (grey shade). For better visualization, 200,000 and 180,000 cluster values were sampled randomly without replacements over the study area and for the three sets of observations (uncorrected, EBCp and EBCe corrected). For more clarity, an interpolation spline between palette counts was applied Projected species diversity for No-BCC models was more biased towards Switzerland under split-sampling than independent evaluation (Figure 4a,d). Target-BCC decreased this spatial bias for both evaluation strategies, although more strongly for the independent evaluation (BCCtg1: Figure 4c,f; BCCtg2: see Supporting Information Appendix S1: Figure S9a). Indeed, Target-BCC projected diversity displayed stronger correlations with the expert-based diversity for independent evaluation (r (BCCtg1) = .32, r (BCCtg2) = .35) rather than under split-sampling (r (BCCtg1) = −.12, r (BCCtg2) = −.11; see Supporting Information Appendix S1: Figure S10).
When EBC was considered, the quality of model predictions and correlations with expert-based diversity increased, and differences in model performances between all models decreased. For split-sampling evaluations, models including Target-BCC performed similarly to models including No-BCC (EBCp: Friedman X 2 = 372.2, pvalue < .001, Figure 4h; EBCe: Friedman X 2 = 257.3, p-value < .001, see Supporting Information Appendix S1: Figure S11b Figure S10).

| DISCUSSION
In this paper, we compared different strategies to correct for observer and sampling bias in presence-only SDMs, and demonstrated that model predictions of plant species distributions in the European Alps are considerably improved when EBC is implemented (Figures 4   and 5). While BCC focuses on the correction of observer bias via covariate adjustment, EBC implements a complementary correction based on random stratified sampling (Austin & Heyligers, 1989;D'Antraccoli et al., 2020;Hirzel & Guisan, 2002;Mohler, 1983).
Environmentally imbalanced sampling designs in observational datasets is a recurrent issue in ecological studies (Albert et al., 2010;Hirzel & Guisan, 2002). EBC corrects for this environmental bias by applying an artificial environmental stratification to the observational dataset through two design variants (EBCp, EBCe). This stratification corrects over all species observations the excessive sampling effort detected in specific environments of the study area, allowing known limitations regarding BCC to be overcome. Bias covariate correction works generally well to address geographic observer bias (Bonnet-Lebrun et al., 2020;El-Gabbas & Dormann, 2018a;Warton et al., 2013). However, this correction is only meaningful (a) if the observer bias covariate remains weakly to moderately correlated with the environmental predictors used in the model , and (b) if it adequately describes the origin of the bias in the data. EBC overcomes both limitations and may therefore be implemented independently to BCC.
Nevertheless, EBC should (a) only be applied if environmental bias is apparent, (b) ideally implemented for specific species, and (c) preserve the influence of the environmental cluster in which the species was originally most sampled. (a) Environmental bias in observational data is usually a consequence of spatial bias (Bystriakova et al., 2012;Phillips et al., 2009). The observational dataset must therefore be carefully explored in geographic and environmental space prior to analysis to detect potential environmental biases. Here, this bias was geographically ( Figure 1) and environmentally (Figure 3) detected, with disproportionately high sampling within the administrative border of Switzerland. (b) Subsampling species observations in environmental space should not be applied to species whose environmental bias strongly differs from the overall one (i.e. if the number of original species observations per cluster is weakly correlated with that of the full dataset). In our analysis, this correlation was strong for Abies alba (r = .84), but weak for Pinus cembra (r = .13; Figure 6). Therefore, while EBC corrected the environmental bias of the former, it distorted the originally well-distributed observations of Pinus cembra in the latter, correcting for a non-existent environmental bias. (c) After implementing EBC, the cluster in which the species was originally most sampled no longer necessarily contains the highest number of subsampled observations. Subsampled observations may indeed be relatively more frequent in other clusters with originally stronger under-sampling. Although such corrective behaviour is pursued, one could require per species the cluster with the most original observations to be as representative as the cluster with the most corrected observations, especially if EBCp is applied (see arrows and Abies alba, Datura stramonium; Figure 6). Such optional adjustment could be particularly beneficial for SDM studies seeking to preserve the environmental influence of heavily sampled regions. In respect of (b) and (c), further descriptions and codes addressing such limitations are detailed in Supporting Information Appendix S2.
Although EBC may be implemented independently, we found that combining it with bias covariate correction yields better predictive performance and better agreement with our independent diversity map (Figures 4 and 5; as predicted in Table 1). Unless strong correlations between the bias and environmental covariates are found, BCC should be implemented together with EBC, as bias covariate correction remains strongly relevant if EBC does not initially succeed in correcting for the environmental bias (see Pulmonaria obscura; Figure 6). Particularly, Target-BCC improved model performance/ predictions compared to Classic-BCC, with BCCtg1/2 producing very similar results (Supporting Information Figures S9-S11); that is, an approximate density map of observations (Target group v2.0) is as efficient as the pixel-by-pixel target group observation density of the study area (Target group v1.0) when applying Target-BCC. While this correction can improve and simplify the implementation of BCC, we advise Mixed-BCC to be used by default, as Classic-and Target-BCC could perform differently depending on the observational dataset, and their combination may correct more robustly. In our study, Target-BCC likely performed best because the survey effort across our study area showed strong large-scale variations, with peaks in Switzerland and southern France. Although these regional or administrative biases are quite common across observational datasets, superimposing biases related to accessibility (e.g. roads, coasts, elevation or cities) are also frequent, and often species-specific.
Including concurrently covariates targeting different types of bias should therefore make BCC more adaptive and performant.
Employing semi-dependent (spatial split block sampling) and independent (Flora Alpina dataset) evaluation strategies, as well as corresponding estimates of the maxTSS, revealed an 'evaluation F I G U R E 4 True skill statistic (TSS) of species distribution models when 12 bias covariate corrections (BCC), no environmental bias correction (No-EBC) and proportional-stratified EBC (EBCp) apply, for both block split-sampling (BSS) and Flora Alpina (FA) evaluations (b, e, h, k). Related species diversity (1,900 species) of No-BCC and BCC Target group v1.0 (BCCtg1) are shown on the left and right panels respectively. Projected diversity is described here for No-EBC (a, c, d, f) and EBCp (g, i, j, l). The 12 BCC are categorized as follows: No-BCC (orange), Classic-BCC (red), Mixed-BCC (blue) and Target-BCC (green). Friedman tests were here applied for each of the four centre panels (***p-value < .001 for all): Friedman X 2 = 546. 1, 4,313.4, 372.2 and 2,144.2 (b, e, h and k, respectively). All pairwise comparisons were run with post-hoc Nemenyi tests and displayed following a letter-based representation (p-value > .05) paradox', which is conditional on whether the test data are truly independent. In agreement with other studies, we confirm that non-corrected model predictions seem to perform better than corrected ones, if the observations used for testing and calibrating originate from the same spatially biased observational dataset (Smith et al., 2021;Stolar & Nielsen, 2015;Warton et al., 2013). This paradox was particularly apparent under split-sampling evaluation, and when comparing No-BCC with Target-BCC models (Figure 4a-f).
This paradox may be quite detrimental if SDMs are selected based on TSS, or other evaluation metrics, calculated from test data lacking independence. This may lead to inappropriate model selection, and consequently to erroneous spatial predictions of species distributions. Interestingly, when EBC is applied, the amplitude of the paradox strongly diminishes, and similar evaluation performances are obtained for split sampling and independent tests when predicting plant diversities (Figure 4g-l). This means that implementing EBC decreases the risk of inappropriate evaluations and therefore predictions, regardless of the evaluation procedure.
However, the correlations we found between predicted and expert-based diversities ( Figure 5) remained moderate, and this may have different reasons: (a) the expert-based diversity is only based on filtered coarse political units, which may not represent F I G U R E 5 Comparison between expert-based species diversity (Flora Alpina) and Target-BCC (BCC Target group v1.0; BCCtg1) projected and binarized (maximum true skill statistic threshold, maxTSS) diversity, when proportional-stratified EBC (EBCp) is applied for split sampling and Flora Alpina evaluation. (a) represents the expert-based plant diversity constructed from the independent test dataset. (b) and (d) represent projected diversity maps found in Figure 4i and 4l, respectively. (c) and (e) show the statistical relationship between expert-based diversity and projected diversity, for block split-sampling (BSS) and Flora Alpina (FA) evaluation. For this analysis, 10,000 values were spatially sampled randomly without replacement per diversity layer. Both panels display Spearman's rank correlation coefficient r, and the explained deviance D 2 from a fitted Poisson-based generalized linear model (GLM) with first-order polynomials the true and detailed shape of plant diversity in the European Alps.
To simplify model comparisons and computations, (b) our study did not remove species models with comparably low TSS, and (c) only climate predictors were used, whereas soil and land cover have been shown to be essential for predicting more accurately plant species distribution (Chauvier et al., 2021). Finally, it is essential that sampling design is based on prior knowledge of the system studied (Albert et al., 2010;Arneill et al., 2019;D'Antraccoli et al., 2020). If available, EBC can include other prior variables (than climate predictors) that are thought to have an influence on the species distribution (e.g. longitude, latitude, elevation or habitat suitability maps). No consensus exists on the best random stratified sampling design; however, not all designs should be based on the environmental space of the study area. The same problem arises when ecological studies need to choose between a proportional or equal-stratified design when inventorying or collecting species samples (Hirzel & Guisan, 2002). Accordingly, EBCp and EBCe may be both implemented via our R function (R Core Team, 2020; see Supporting Information Appendix S2).
To conclude, we propose that combining both EBC and bias covariate correction might be the best choice in presence-only SDMs when predicting current and future species distributions. Given the F I G U R E 6 Summary of observation densities per cluster for four selected species, using uncorrected (yellow shade), proportional (EBCp; green shade; first column) and equal-stratified EBC (EBCe; blue shade; second column) corrected data. The data are shown relative to the cluster distribution across the study area (grey shade). Two thousand cluster values were sampled randomly without replacements within the study area and for the three sets of observations (uncorrected, EBCp and EBCe corrected). For more clarity, an interpolation spline between palette counts was applied. Arrows indicate the difference between EBC corrected observations of the densest original cluster (with the highest number of original observations) and the densest corrected cluster (with the highest number of corrected observations). The third column displays Pearson's correlations between the number of original observations per cluster for Abies alba, Pinus cembra, Pulmonaria obscura and Datura stramonium (first, second, third and fourth row, respectively) and that of the full dataset (all; see Figure 3c). CLARA = clustering large applications growing interest in using citizen-science data to follow and predict invasive species, to guide the reintroduction of species, or to protect biodiversity (Grünig et al., 2020;Hunter-Ayad et al., 2020;Lehtomäki et al., 2019), using these corrections may prove to be strongly beneficial for future ecological studies, which will increasingly implement presence-only models. Interestingly, all methods presented here may be extended independently to presence-absence data not subject to the same observer bias, and might turn out to be useful for further ecological applications. ANR-10-LAB-56). Furthermore, we thank all various contributors for sharing numerous datasets of species observations, and for making possible the large scope of this study. We also thank P.

ACK N OWLE D G M E NT S
Descombes, C. Botella and S. Schiess for their useful inputs in data analysis and discussion.

AUTH O R CO NTR I B UTI O N S
Y.C., N.E.Z. and W.T. conceived the general idea and designed the study with the help of all authors; Y.C., G.P., D.B. and W.T. developed the methodology; Y.C. performed the analysis and led the writing of the manuscript. All authors interpreted results, significantly contributed to writing and editing, and gave final approval for publication.

DATA AVA I L A B I LIT Y S TATE M E NT
All non-copyright data and scripts supporting the findings of this study are available on the EnviDat repository (https://www.envid at.ch/datas et/corre ct-obser ver-bias-only-sdms Target-group bias covariate correction (Target-BCC) uses the full observation density across the study area as bias covariate (see Figure 1a). Environmental bias correction (EBC) uses a pre-processed and balanced sampling of all species observations within the clustered environmental space of the study area.

S U PP O RTI N G I N FO R M ATI O N
Additional supporting information may be found online in the Supporting Information section. Thuille and Philipp Brun he is involved in the 'OriginAlps' project that aims at improving the understanding of historical and contemporary processes that drive patterns of plant biodiversity in the European Alps.