Spatially explicit species distribution models: A missed opportunity in conservation planning?

Systematic conservation planning is vital for allocating protected areas given the spatial distribution of conservation features, such as species. Due to incomplete species inventories, species distribution models (SDMs) are often used for predicting species’ habitat suitability and species’ probability of occurrence. Currently, SDMs mostly ignore spatial dependencies in species and predictor data. Here, we provide a comparative evaluation of how accounting for spatial dependencies, that is, autocorrelation, affects the delineation of optimized protected areas.


| INTRODUC TI ON
In the light of the ongoing decline in global biodiversity (Pimm et al., 2014), the implementation of protected areas in the terrestrial, marine and freshwater realms is yet the most widely used conservation approach to reduce the loss of biodiversity. Consequently, how to delineate protected areas so that they produce optimal outcomes for the targeted conservation features in a cost-effective way has been widely covered in the systematic conservation planning literature (Barnes, Glew, Wyborn, & Craigie, 2018;Margules & Pressey, 2000).
In the past, conservation goals mainly reflected habitat or species representation. Only recently the focus has shifted towards considering environmental and ecological processes, which are essential for securing species persistence (e.g., Klein et al., 2009). Such processes shape the distribution and abundance of species (Pressey, Cabeza, Watts, Cowling, & Wilson, 2007) with connectivity playing a paramount and distinct role in terrestrial (Lockwood, 2010), marine (Carr et al., 2003), and freshwater ecosystems (Hermoso, Filipe, Segurado, & Beja, 2018). Incorporating spatial connectivity in the planning process has, therefore, important implications for designing protected areas (Daigle et al., 2018;van Teeffelen, Cabeza, & Moilanen, 2006;Weeks, 2017). This fact is also reflected in the software that is used in conservation planning, such as marxan (Ball, Possingham, & Watts, 2009) or zonation (Lehtomäki & Moilanen, 2013). All of them base, among other parameters, the selection of potential planning units on algorithms that account for their spatial connectivity.
When accounting for spatial connectivity, conservation plans inherently build the protected areas based on the spatial dependencies in the planning units as well. For example, it is vital to account for the spatial structure of the environment around a given planning unit for assessing its importance as part of a protected area (Daigle et al., 2018;Weeks, 2017). However, this key characteristic of spatial dependency is rarely applied in the underlying conservation features themselves, which provide the basis for the conservation planning.
The most widely used conservation feature in conservation planning is the geographical distribution of multiple species, as species distributions are often best known compared to other biodiversity facets (such as functional or phylogenetic characteristics of species; McGill, Dornelas, Gotelli, & Magurran, 2015). However, species point occurrence data alone are not useful in systematic conservation planning, which requires range-wide and seamless data of the targeted conservation features (Tulloch et al., 2016). This is likely one of the major challenges for conservation planners-to overcome the Wallacean shortfall, that is, to know the full geographical distribution of species (Bini, Diniz-Filho, Rangel, Bastos, & Pinto, 2006;Meyer, Weigelt, & Kreft, 2016).
Currently, species distribution models (SDMs) are the main tools used to produce such range-wide species distribution data (Guisan & Thuiller, 2005). SDMs assess habitat preferences in an environmental envelope given the species occurrences and the environmental predictors at the respective locations, and project a probabilistic habitat suitability index across the study area (Elith & Leathwick, 2009). Species distribution models are considered useful in conservation planning when used within a "structured and transparent decision-making process" (Guisan et al., 2013). In reality, however, most conservation plans are still based on species' surrogates such as habitat maps, expert-derived species distributions, or connectivity surfaces (Tulloch et al., 2016). As reviewed by Tulloch et al. (2016), SDMs are not yet widely used in conservation planning because of several constraints: (a) poor availability of species occurrence data and (b) respective predictor data, (c) uncertainties associated with the data and (d) the wish to understand processes rather than patterns, requiring alternative and additional data. Here, (a-c) are clearly methodological issues that need to be assessed with care, while (e) goes beyond the expectations towards SDMs, since per definition, they do not deliver information on processes. Hence, it is key to distinguish between constraints that can be tackled (Dormann, 2007;Fourcade, Besnard, & Secondi, 2018;Pearson & Dawson, 2003), and those that are beyond the actual purpose of SDMs (Araújo & Peterson, 2012).
Species distribution models require rigorous testing for methodological issues and statistical shortcomings (Record, Fitzpatrick, Finley, Veloz, & Ellison, 2013;Tulloch et al., 2016). Given that spatial conservation plans incorporate spatial information among planning units, it is obvious that SDMs that yield the conservation feature data should make use of the same information. However, the few studies that modelled and used species distributions for conservation planning are typically not accounting for such spatial dependencies (e.g., in the terrestrial (Rondinini et al., 2011), freshwater (Esselman & Allan, 2011) and marine realms (McGowan et al., 2013), but see White, Schroeger, Drake, and Edwards (2014)). The disadvantage of such spatially implicit SDMs (hereafter referred to as "non-spatial SDMs") is that they do not account for spatial dependence or irregular sampling intensity (e.g., Latimer, Wu, Gelfand, & Silander, 2006), and frequently violate the assumption of independent samples and spatial units (Araujo, Pearson, Thuiller, & Erhard, 2005;Hampe, 2004). Assuming independence of spatial units means that the model is not aware of the predicted habitat suitability of a given species in neighbouring planning units. In contrast, the optimization of the spatial conservation plan, that is, the selection of planning units, does account for the spatial relation among them, favouring neighbouring ones over those that are far apart.
In contrast, spatially explicit SDMs ("spatial SDMs") account for the proximity and mobility (i.e., connectivity) in species populations. Hence, they provide more powerful inference about species distributions and niche relations (Latimer et al., 2006;De Marco, Diniz-Filho, & Bini, 2008). Incorporating the assumption of spatial dependencies in the data has, therefore, the potential to provide more robust SDM predictions (Guisan & Thuiller, 2005;Record et al., 2013). Most importantly from a conservation planning perspective, spatial SDMs make use of the spatial information and dependencies of the species and environment in the planning units. Habitat suitability predictions from spatial SDMs are generally more contiguous (Domisch, Wilson, & Jetz, 2016) and less patchy than those derived from non-spatial SDMs. The question remains, whether such contiguous predictions have the potential to maximize the management efficiency of spatial plans.
Species distribution models can be made spatially explicit in numerous ways (see Dormann, 2007, for a review of various methods).
Spatial aspects are included in the SDM given the spatial autocorrelation of species distributions (Besag, 1974;Record et al., 2013). Spatial autocorrelation is a common phenomenon in ecology, as nearby locations (given species occurrences and environmental predictors) tend to be more similar compared to those located farther apart (Legendre, 1993;Record et al., 2013). One possible method to add the spatial dimension in SDMs can be, for instance, through spatial random effects (Dormann, 2007). Spatial random effects can be computationally demanding. However, their use may be warrantedor even required-when the task is to limit the false positive and false negative predictions of probability of occurrence within a spatial unit (Record et al., 2013). Limiting false positive/negative predictions is crucial in conservation planning, since the over-or underestimation of suitable habitats (and therefore species occurrences) impacts species protection itself and produces unnecessary costs, that is, for establishing protected areas in erroneous locations where the target species do not actually occur.
Here, we analyse whether accounting for spatial dependencies in both steps, the spatial prioritization process and in the underlying conservation features (here: species), influences the arrangement of potential protected areas. Using species survey data, we build nonspatial and spatial SDMs and compare the resulting mapped spatial conservation plans across a range of conservation targets, that is, proportions of potential species distribution areas. We hypothesize that (a) spatial SDMs would outperform non-spatial SDMs in terms of model evaluation scores, since spatial SDMs account for the influence of proximity in species populations; (b) this effect cascades through to the spatial conservation plans, and that those derived from spatial SDMs would differ significantly from the ones based on non-spatial SDMs. We test these hypotheses in three case studies covering terrestrial, marine and the freshwater realms. We do so, because protected areas in each realm are typically planned using F I G U R E 1 Spatial arrangement of potential protected areas derived from non-spatial and spatial SDMs, as well as their overlap in the terrestrial (a), marine (b) and freshwater realms (c), each using a specific planning unit shape (grids, hexagons, subcatchments). Spatial plans were defined for a 20% conservation target within a 10% gap to optimality. different shapes of spatial units (grids, hexagons and subcatchments, respectively), and the planning unit shape has shown to impact the spatial pattern and effectiveness of protected areas (Nhancale & Smith, 2011).

| General workflow of the analysis
Our aim is to test whether, and if so, how the addition of spatial information in the SDMs leads to different species habitat suitability predictions, and hence to different spatial conservation plans. The analysis is divided into five stages: (a) preparing the data and aggregating it into spatial units; (b) testing the degree of spatial autocorrelation in the species data to see whether adding spatial random effects in the SDMs is warranted; (c) running two SDMs for each species, one that does not use the spatial information (non-spatial SDM) and another one that takes advantages of the spatial information via the spatial random effects (spatial SDM); (d) using the outputs from the spatial and the non-spatial SDMs as separate inputs to calculate spatially optimized conservation plans with varying conservation targets; and, finally, (e) comparing the spatial conservation plans derived from the non-spatial and the spatial SDMs. As the nonspatial and spatial SDMs are, except for the spatial random effects, identical, this allows us to single out the effect that the spatial information in SDMs has on the subsequent spatial conservation plans.

| Study areas and species data
The terrestrial case study ( Figure 1a) comprised survey data of 33 Eucalyptus species in Eastern Australia derived from Fithian, Elith, Hastie, and Keith (2015). This dataset encompasses species detections/non-detections sampled annually from 1970 to 2013 as well as environmental data covering climate and topography and derivatives thereof (Fithian et al. 2015). Environmental data layers were re-sampled from the original 250-m spatial resolution by factor 20 to obtain a grid with the individual squares sizing 0.05° (approximately 25 km 2 ), using the raster package in R ( Environmental data on ocean topography, currents, nutrients and light were obtained from the Bio-Oracle (approx. 9 km spatial grain at the Equator, Tyberghein et al., 2012;Assis et al., 2018) and marsPeC datasets (approx. 1 km, Sbrocco & Barber, 2013).
All environmental data were harmonized to 7,123 hexagonal grids of approx. 19 km 2 , and the species data were summarized for each hexagonal grid cell (created using the R- The freshwater case study ( Figure 1c) was based on detection/non-detection data of 85 fish species across the Danube river basin. Fish survey data were derived from the EFI+ and the BioFresh project databases, sampled during 1955-2007 (Schinegger, Pletterbauer, Melcher, & Schmutz, 2016;Zupancic, 2015). We used the HydroBASIN dataset (Lehner & Grill, 2013) and selected all level 12 subcatchments draining into the Black Sea, comprising of 7,376 subcatchments with an average size of 108 km 2 . We then extracted climatic (Hijmans, Cameron, Parra, Jones, & Jarvis, 2005), topographic (Amatulli et al., 2018) and land cover variables (Tuanmu & Jetz, 2014, all at ~1 km spatial grain), as well as the number of dams (Lehner et al., 2011) for each subcatchment, the latter representing a proxy for human impact besides land use.
Species data, including species detections, non-detections and sampling dates, were aggregated to the planning units. Within each planning unit, environmental data were aggregated using various techniques (e.g., average, standard deviation; see Supporting Information Table S1). In the freshwater realm, we summed the precipitation across each subcatchment and routed the upstream accumulated precipitation along the hydrography to mimic run-off (Domisch, Amatulli, & Jetz, 2015).
From a large set of layers per case study, we selected those that are meaningful from an ecological perspective to describe the distri- for all predictors used). All (continuous) predictors were centred (so all predictors have a mean of 0) and scaled by dividing by their standard deviations. The final number of predictors was 13, 11 and 8 for SDMs in the terrestrial, marine and freshwater realms, respectively.

| Defining spatial connectivity
Regarding spatial SDMs, the connectivity among planning units for the terrestrial and marine realms was computed from a polygon shapefile using the R-package sPdeP (using the queen's move; Bivand, Hauke, & Kossowski, 2013;Bivand & Piras, 2015). This procedure identifies all first-order neighbours of each focal planning unit, that is, those planning units that are directly adjacent and connected.
In the freshwater realm, it is crucial to account for the longitudinal connectivity among planning units (Abell, Allan, & Lehner, 2007;Hermoso, Linke, Prenda, & Possingham, 2011). Hence, the spatial connectivity was defined as all upstream subcatchments connected within a 100 km (as-the-fish-swims) distance of a given subcatchment. This distance was chosen as a trade-off between hydrological connectivity (which can exceed 100 km) and the computational demand and time for calculating SDMs and spatial plans. Note that with this information, full connectivity matrices (range-wide and up/ downstream) are built within SDMs and the spatial prioritization. In other words, each planning unit "knows" which planning units are connected to it.

| Detecting spatial autocorrelation
We first assessed the degree of spatial autocorrelation in species occurrences among planning units (where occurrences were aggregated within planning units). Here, we used the Monte Carlo simulation method and calculated Moran's I and the associated p-value for each species (at α = 0.05), using the R-package sPdeP (with 999 permutations). Spatial weights were kept identical to the connectivity definition of planning unit above, specifying proportional weights to the connected planning unit (row standardization).

| Species distribution modelling
We ran Bayesian occupancy models for predicting the habitat suitability of species in each realm. Our aim was to minimize the methodological shortcomings in SDMs that could be ported to the spatial prioritization.
Hence, we refrained from artificial pseudo-absences to draw species non-detections, but used survey data with repeated visits to account for the detection probability of species at given sites. Specifically, we ran zero-inflated binomial SDMs in a Bayesian framework using the hsdm R-package (Vieilledent et al., 2014) with "hSDM.ZIB" and "hSDM. ZIB.iCAR" functions for the non-spatial and spatial SDMs, respectively.
These functions integrate two processes, a Bernoulli suitability and a Binomial observability process, into a hierarchical zero-inflated binomial model. The Bernoulli suitability process uses species point records and environmental predictors as the response and explanatory variables, respectively (Vieilledent et al., 2014): where z i is a random variable describing the binary habitat suitability at planning unit i , which follows a Bernoulli distribution of parameter θ i ; that is, the probability that the habitat is suitable in the planning unit i . θ i is expressed as a linear model combining the matrix of environmental predictors X i and parameters β using a logit link function.
In the spatial models, Equation 2 is extended by an intrinsic conditional autoregressive model (iCAR) in the suitability process: where ρ i is the spatial random effect of the planning unit i . ρ i accounts for the spatial autocorrelation of the presence probabilities variability in suitability that is not explained by the environmental variables: where μ i is the mean of ρ i in the neighbourhood of i, V ρ is the variance of the spatial random effects, and n i is the number of neighbours for the planning unit i (see also Latimer et al., 2006, for the formal description of the iCAR). Thus, we assume that in addition to the environmental characteristics within the planning units, the species' occurrence probabilities also depend on the occurrence probabilities of the neighbouring planning unit (Vieilledent et al., 2014). A planning unit with "poor" environmental conditions next to planning units with "good" environmental conditions will result in higher estimated suitability, than if its neighbours also show "poor" environmental conditions (Domisch et al., 2016).
Regarding the observability process, we counted the number of repeat surveys within each planning unit; that is, we aggregated the surveys over time, yielding the information on how often each planning unit was visited. The model then estimates the probability of observation given the species presence in a planning unit, where we assume that, if the species was observed at least once during multiple visits in a given planning unit, the habitat is suitable and the absence of the species during other visits in the planning unit is due to imperfect detection. For instance, a given species was observed once in planning units A and B. Planning unit A was visited once, yielding a detection probability of 1, whereas B was visited 10 times (yielding a detection probability of 1/10 = 0.1).
In combination with the suitability process (i.e., environmental conditions) and the spatial autocorrelation (i.e., the spatial neighbourhood information), the observability information yields more robust estimates in unsampled locations, that is, whether a species actually occurs there but was not detected, or if it is actually not present (and hence not detected). The probability of observing the species (δ ρ ) was specified as: where y i is a vector of the total number of observed presences in planning unit i . z i is the binary habitat suitability in planning unit i from the suitability process (Equation 1), and t i denotes the total number of visits within the planning unit i , including the non-detections as described above. y i follows a binomial distribution of the combination of the habitat suitability z i at planning unit i and the parameters δ i and t i . In other words, δ i is the probability of observing a species in a location, if it was actually present. We estimate only the intercept (γ 0 ) using a logit link function and assume a spatially constant detection probability (due to the relatively small occurrence datasets and limited spatial replication needed to understand the role of covariates in variable detection probabilities). (1) We ran three Markov chain Monte Carlo (MCMC) simulations with 200,000 iterations each, a burn-in phase of 50,000 iterations and a thinning interval of 10. Model convergence was assessed by the multivariate potential scale reduction factor (MPSRF ;Brooks & Gelman, 1998). For the suitability process, we used the coefficients from an initial, non-spatial generalized linear model (GLM) as initial values, and both suitability and observability processes used uninformative priors centred at zero with a fixed large variance of 100 (Domisch et al., 2016). The prior distribution for the variance of the spatial random effects followed a uniform distribution, that is, a flat prior where the upper bound of the variance is set to 10.
Species data were split into training (70%) and validation (30%) sets, and model performance for each species was evaluated using the area under curve (AUC), true skill statistic (TSS), sensitivity and specificity (true positive and negative predictions, respectively) and the deviance information criterion (DIC). Note that the non-spatial SDMs did not have any pre-defined spatial configuration during the SDM calibration and prediction; that is, each planning unit was considered independent of each other in the SDMs, while all other data were kept constant (i.e., identical random seed in the SDMs, and identical predictors and species data including the subsets for validation). The spatial prioritization was undertaken using the realmspecific spatial connectivity as described above, for predictions derived from both non-spatial and spatial SDMs.

| Spatial prioritization
We transformed the predictive posterior mean probability maps from SDMs into a semi-binary scheme using TSS as a threshold (Allouche, Tsoar, & Kadmon, 2006), where all values below the threshold were We first calibrated the boundary length modifier (BLM) for our analyses. The BLM is dimensionless and balances the spatial aggregation and patchiness of spatial plans. For the BLM calibration, we used the software marxan v.2.43v (Ball et al., 2009) within the R-package marxan (Hanson & Watts, 2015). We created all necessary input files in base R, except the "input.dat" file, where for convenience, we used the marxan R-package. We then calibrated the BLM following the recommendations by Ardron, Possingham, and Klein (2008). First, we ran marxan with a fixed species penalty factor of 10, and BLM values set to 0, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1,000, 10,000, 100,000 to approximate the range of the optimal BLM value. We plotted the connectivity against the BLM (Supporting Information Figure S6) and identified the elbow, that is, the point from where an increase in the compactness (higher BLM values) has no major effect on the connectivity in spatial plans anymore. Similarly, we also mapped the spatial plans to visually confirm the increasing compactness derived from increasing BLM values. In a second step, we repeated the previous analyses and maps with setting the BLM ranging within the approximate BLM value (e.g., between 10 and 100).
Again, we ran sensitivity runs within increments of 10 and identified the BLM to 35, 20, 0.15 for the terrestrial, marine and freshwater realms, respectively (Supporting Information Figure S6).
We then used the gurobi oPtimizer 7.5 software (Gurobi Optimization, 2017) to find optimal conservation planning solutions based on integer linear programming (ILP) within the R-package Prioritizr (Hanson et al., 2017). ILP has shown to out-compete traditional simulated annealing tools (e.g., Marxan, Ball et al., 2009) in terms of time and accuracy (Beyer, Dujardin, Watts, & Possingham, 2016), using the same input files as marxan. We varied the conservation target for all species in the spatial plans, ranging from 10% to 50% in 5% increments. For instance, a conservation target of 20% means that for each species 20% of its suitable habitat (as given by the SDM predictions) are required in this specific spatial plan.
For each run, we allowed a 10% gap to optimality in the spatial plans (as trade-off between optimality, and the time the optimizer takes to converge). To rule out the effect of the planning unit itself (cost, i.e., area of the planning unit), we set a constant value of 1 as the cost of each planning unit. Likewise, we set no weights for single species. This means that only the conservation features, that is, the probabilistic information on species habitat suitability and the connectivity penalty (set by the BLM), were decisive to the objective function that was set to minimize the total number of planning units to be part of the selected spatial plan (Game & Grantham, 2008). Note that opposed to, for example, marxan, gurobi provides one optimal solution and hence requires that all species meet the conservation target.

| Statistical analyses
For each realm and model type, we compared model accuracy given the model evaluation scores, the estimated detection probability, the range size estimates of model outputs considering all planning units that had a probability value above the threshold (equal to binary predictions), and the summed probability of habitat suitability across all species within each planning unit as a proxy for species richness (Mateo, Felicísimo, Pottier, Guisan, & Muñoz, 2012). For each conservation target run (10%-50%) derived from spatial and non-spatial SDMs, we (a) assessed whether species were covered by the spatial plans, (b) counted the relative number of planning units needed to fulfil the given target of the spatial plan and (c) compared the spatial overlap of spatial plans by assessing the percentage of overlapping spatial units.

| Spatial autocorrelation
Across the terrestrial, marine and freshwater realms, all but three of the 171 modelled species showed a significantly positive spatial autocorrelation (α > 0.05 for the freshwater fish species

| Model performance
The SDM performance indicators AUC, TSS, sensitivity and specificity were on average consistently higher for the spatial SDMs than for the counter-parts (Figure 2). DIC was on average lower for the spatial SDMs in the terrestrial and freshwater realm, but higher in the marine realm (marine non-spatial vs. spatial SDMs; DIC: 515 ± 485 vs. 534 ± 596, mean ± SD). All chains of the SDMs converged successfully yielding a MPSRF value of 0.999. The detection probability across realms was on average consistently higher in the spatial than in the non-spatial SDMs (Figure 2f,l,r).

| Range size estimates
Spatial SDMs yielded generally more compact and less dispersed habitat suitability estimates (see exemplary maps in Supporting Information Figure S1), and this effect is also mirrored in the range size estimates. Range size estimates derived from spatial SDMs were significantly lower in the freshwater and terrestrial realms than in the non-spatial models. This means that the predictions derived from spatial SDMs were more restrictive and species' suitable habitats were predicted to occur in less planning units than in the non-spatial SDMs (see exemplary maps in Supporting Information Figure S1; non-spatial vs. spatial SDMs in the terrestrial realm: 1,278 ± 750 vs. 1,057 ± 690 planning units, paired t-test: t = 2.95, df = 32, p = 0.006; freshwater: 2,874 ± 1,491 and 2,532 ± 1,409 planning units, mean ± SD, paired t-test: t = 2.0493, df = 84, p = 0.044). In the marine realm, no significant difference in range size estimates was observed (non-spatial vs. spatial SDMs: 2,698 ± 1,564 and 2,628 ± 1,500 planning units, paired t-test: t = 0.783, df = 52, p = 0.436).
F I G U R E 2 Model evaluation scores representing AUC, TSS, sensitivity, specificity and DIC, as well as the estimated detection probability, summarized across 33 terrestrial (a-f), 53 marine (g-l) and 85 freshwater species (m-r) derived from non-spatial (blue) and spatial SDMs (green). Bars represent median values and boxes the 1st and 3rd quartiles, respectively

| Summed probabilities of habitat suitability
Across realms, spatial SDMs produced on average a lower summed habitat suitability across planning units (i.e., a proxy for species richness), than predicted by non-spatial SDMs (non-spatial vs. spatial SDMs in the terrestrial realm: 3.00 ± 1.28 vs. 2.37 ± 1.00 plan-  Figure S2 for a spatial representation of the differences in summed probabilities among nonspatial and spatial SDMs.

| Spatial similarity of spatial plans
All species targets across SDM types and spatial conservation plans were met in all spatial prioritization runs (data shown in the Pangaea repository). The degree of spatial overlap of potential protected areas derived from spatial and non-spatial SDMs was targetdependent. The overlap was lowest for the smallest conservation target (10%) with the per cent overlap ranging from 1% to 2%, and increased up to a maximum of 30% to 39% for a conservation target of 50% (Figure 3a, and Supporting Information Figures S3-S5).
The number of required planning units increased linearly with increasing conservation targets: the higher the conservation target, the higher the required amount of planning units to meet the given target ( Figure 3a).
The relative difference in the number of planning units needed for a given solution between spatial and non-spatial SDMs was within a margin of 5% (Figure 3b). No significant differences in the number of planning units between spatial and non-spatial models across conservation targets could be observed.

| D ISCUSS I ON
Incorporating connectivity has been successfully adopted in systematic conservation planning for building species migration and movement corridors (Margules & Pressey, 2000). Yet, this concept has been largely neglected in the underlying conservation features (e.g., species) that are used in the spatial planning. Our study pro- causing spatial structuring in the response, that is, the modelled species' habitat suitability (Besag, 1974). In addition, Crase, Liedloff, Vesk, Fukuda, and Wintle (2014) showed that if such induced spatial autocorrelation in species distributions is not accounted for, model predictions are likely to remain inaccurate. Contrary to Thibaud, Petitpierre, Broennimann, Davison, and Guisan (2014), who found that spatial autocorrelation itself has only a minor relative effect on model prediction accuracy (compared to the choice of predictors, sample sizes, sampling designs and modelling technique), our analyses reveal that accounting for spatial autocorrelation leads to a substantial improvement of model outputs (given that all other data were kept constant).
This effect cascades into the resulting spatial plans: Those derived from spatial SDM predictions are likely to provide a better representation of areas needed to potentially protect, than spatial conservation plans derived from non-spatial SDM predictions. In addition, spatial SDMs yielded, on average, a slightly higher detection probability and thus reduced the potential false absences caused by the imperfect detection of species (MacKenzie et al., 2017). Reducing the risk of overlooking species is therefore crucial for an effective conservation planning (Petitot, Manceau, Geniez, & Besnard, 2014).
We found only a minor overlap of potential protected areas based on spatial and non-spatial SDMs for the commonly applied conservation targets of 10%-20% (e.g., the global target of 17% set by the Convention on Biological Diversity, CBD, 2010;Veach, Minin, Pouzols, & Moilanen, 2017). Interestingly, spatial plans, regardless of the modelling approach, required a similar amount of planning units. On average, spatial SDMs predicted a lower range size per species and, therefore, a lower number of species predicted to have a suitable habitat per planning unit. Hence, spatial plans using predictions from spatial SDMs require-relatively speaking-more planning units (e.g., assuming that non-spatial and spatial SDMs would predict a suitable habitat of a species into 100 and 90 planning units, respectively, the spatial plans should cover theoretically 20 and 18 planning units given a target of 20%). This is likely to derive due to the, on average, lower summed probability of suitable habitats per planning unit in spatial SDM predictions (i.e., species richness sensu Mateo et al., 2012). Consequently, the optimizer (gurobi, in our case) had to choose from a smaller pool of available planning units. Hence, the optimizer is forced to seek additional planning units that lead to a remarkably different spatial arrangement of spatial conservation plans (Figures 1 and 3a). While we did not explicitly test for differences among realms, this pattern was similar across different landscape configurations and planning unit shapes. Broadening the perspective, non-spatial SDMs tend to create artificial species "hot spots" by predicting a higher species richness per planning unit, and the derived spatial plans would take advantage of the umbrella effect, that is, where the conservation of a species confers protection to a larger number of co-occurring species (Roberge & Angelstam, 2004). In our study, however, this pattern does not depict the best-possible outcome given the lower model evaluation scores. In turn, spatially explicit SDMs have the potential, though not reducing the quantity of protected areas, to increase their quality (Barnes et al., 2018).
We acknowledge that a 10% gap to optimality, as we used it here, can introduce noise in the spatial arrangement of potential protected areas (e.g., Figure 3a). Even with this noise, however, differences in spatial plans were distinct in our study. While a more stringent gap (e.g., 0.5%, Beyer et al., 2016) could yield more accurate spatial plans, we intended to balance uncertainties in the spatial plans with those most probably apparent in the underlying data (note that due to computational reasons, we only extracted the mean posterior probability of SDMs). In other words, species and environmental data, as well as the aggregation of data into planning units, potentially lead to a certain level of uncertainty in the habitat suitability predictions in our study (Beale & Lennon, 2012). Hence, spatial plans should be given flexibility to accommodate such uncertainties in terms of the selection of spatial units during the spatial prioritization.
We singled out the effect of spatial autocorrelation in conservation features to explore the effect of the connectivity among planning units on the subsequent spatial conservation planning. We encourage future studies to further explore these effects using "real-world" exercises, where the current and already established protected areas could be locked-in during the spatial prioritization along with various costs for the planning units. This would decrease the flexibility in the selection of protected areas, and given the low overlap between nonspatial and spatial conservation features, the impact on the outcome remains to be seen. Similarly, the degree of fragmentation (Cabeza, 2003) and the minimum patch size depend on the available area and financial resources (as a large amount of small patches are more expensive to manage (Smith, Minin, Linke, Segan, & Possingham, 2010)).
Likewise, the implementation of various cost measures, such as area, land price or human influence, could be added in the analyses-a topic that was however beyond the scope of our study. Hence, testing spatial plans and with these additions and interactions has the potential to further reveal differences stemming from non-spatial and spatially explicit modelled conservation features. In our study, the species data were contingent on publicly available survey data across the three realms, and hence, the species can be considered generalist species. Rare species are considered particularly important for spatial planning; however, such data were not publicly available for our study. It remains therefore to be seen whether SDMs of rare species would yield similar differences in spatial plans as shown in our study.
We note that by accounting for species' observability by employing a hierarchical SDM, we aimed to limit the shortcomings of SDMs that else do not account for the fact that species might be overlooked during the sampling. Regarding the addition of the spatial information in SDMs, we want to highlight that instead of running hierarchical Bayesian models, other SDM techniques could be equally used, such as Generalized Linear Mixed Models using spatial random effects, or other SDMs algorithms that use spatial eigenvectors as a covariate containing information about the spatial structure in the study area (see Dormann, 2007, for a review on various methods).

| CON CLUS IONS
Employing SDMs that account for spatial dependencies in conservation features provide a promising way forward to increase the quality of protected areas, however without increasing the area (and possibly costs) needed. They yield fundamentally different spatial conservation plans given more accurate species habitat suitability predictions, compared to SDMs that ignore such spatial dependencies. This highlights the importance of using best-possible modelling practices sensu Dormann (2007) and Guisan et al. (2013), which is needed for a wider acceptance of SDMs in systematic conservation planning (Tulloch et al., 2016). Hence, we encourage modellers and practitioners to carefully assess the possibilities in adding spatial SDMs to their systematic conservation planning to derive sufficient options for choosing the optimal spatial arrangement of protected areas.

ACK N OWLED G EM ENTS
We wish to thank the following data providers for making their data LN1320A). We wish to thank Gwen Iacona and two anonymous referees for their constructive comments on an earlier version of the manuscript.

CO N FLI C T O F I NTE R E S T
The authors declare no competing financial interests.

DATA ACCE SS I B I LIT Y
All results supporting the study are available at https://doi.pangaea.  Fithian et al. (2015). Species data are available for download in the following repositories: Supporting Information of Fithian et al.

B I OS K E TCH
Sami Domisch is an aquatic ecologist who is interested in freshwater biodiversity and biogeography. His research is based on spatial freshwater biodiversity science and biodiversity informatics, and how these tools can be further developed to yield robust estimates regarding freshwater biodiversity and conservation planning. The authors consist of a multidisciplinary team with a wide range of research interests, such as freshwater biodiversity, spatial modelling, river restoration, floodplain ecology and systematic conservation planning.
Author contributions: SD, MF, SCJ and SDL designed the study; SD performed the modelling and spatial prioritization; and SD, MF and SDL analysed output data. TH, FP and AW collected and prepared data. SD, MF and SDL wrote the first draft of the manuscript, and all authors contributed substantially to revisions.

S U PP O RTI N G I N FO R M ATI O N
Additional supporting information may be found online in the Supporting Information section at the end of the article.