### Introduction

Correlative species distribution models (SDMs) are used to predict how changes in climate will affect the spatial configuration of suitable habitat for a given species, allowing projections to be made under a variety of scenarios (Elith & Leathwick, 2009). SDMs are increasingly used for conservation planning and climate adaptation applications such as assisted migration and identifying locations suitable for reserves (Pearce & Lindenmayer, 1998; Araújo *et al*., 2004; Vitt *et al*., 2009; Carroll *et al*., 2010). Sound decisions require careful consideration of the uncertainties inherent in these projections (Burgman *et al*., 2005; Rocchini *et al*., 2011), yet the uncertainty associated with SDM projections, although acknowledged to be large, is poorly understood and rarely considered in applications (Elith *et al*., 2002; Dormann, 2007a). Reasons for this include methodological issues and a lack of temporally independent data for projection validation (Dobrowski *et al*., 2011). Repeated calls have been made for maps of uncertainty to be presented with results (Elith *et al*., 2002; Burgman *et al*., 2005; Rocchini *et al*., 2011), and their absence has led some to question the utility of SDMs for conservation planning (Heikkinen *et al*., 2006; Dormann, 2007a). In this study we assess the ability of a spatial regression SDM method to provide a useful characterization of projection uncertainty.

The uncertainty of SDM projections is difficult to quantify given the range of contributing sources (Elith & Leathwick, 2009). Studies have shown that the choice of modelling technique introduces the greatest amount of variability in projections (Araújo *et al*., 2005; Pearson *et al*., 2006; Buisson *et al*., 2010). This has led to the use of ‘ensemble’ methods, in which numerous models are fit using a range of methods and input data (Araújo *et al*., 2005). Outcomes are averaged and those consistent between fitted models are deemed more reliable than those for which the models do not agree. A lack of consensus within an ensemble qualitatively suggests uncertainty, but the reasons that methods disagree are poorly understood (Burgman *et al*., 2005).
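The consensus idea described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any cited study: the function name `ensemble_summary`, the 0.5 presence threshold and the toy predictions are all assumptions made for the example.

```python
import numpy as np

def ensemble_summary(probabilities, threshold=0.5):
    """Summarise an ensemble of SDM outputs.

    probabilities: array of shape (n_models, n_sites) holding each model's
    predicted probability of occurrence at each site (hypothetical input).
    """
    probabilities = np.asarray(probabilities, dtype=float)
    # Ensemble mean: the averaged outcome across fitted models.
    mean_probability = probabilities.mean(axis=0)
    # Fraction of models that predict presence at each site.
    votes = (probabilities >= threshold).mean(axis=0)
    # Agreement: 1.0 when all models concur, 0.5 when they are evenly split.
    agreement = np.maximum(votes, 1.0 - votes)
    return mean_probability, agreement

# Toy ensemble: three models, three sites (made-up numbers).
preds = np.array([[0.9, 0.2, 0.6],
                  [0.8, 0.1, 0.4],
                  [0.7, 0.3, 0.7]])
mean_probability, agreement = ensemble_summary(preds)
```

Low agreement at a site flags disagreement between models, but, as noted above, it does not by itself explain why the models disagree.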

Issues related to spatial autocorrelation (SAC) may partially explain inconsistency between methods in SDM projections. SAC arises because observations close in geographic space are generally more similar than those further apart. When a model is unable to fully explain the spatial pattern of a species' distribution, residual errors will exhibit this property, violating a key assumption of the statistical methods underlying most SDM approaches. SAC of residual error has been shown to be very common in SDM applications (Dormann, 2007b) and can easily be introduced if important covariates are missing or if a species exhibits spatial aggregation due to biotic factors. Most SDM methods are incapable of accounting for this type of error, since they consider only sampling variability and its resultant effect on the precision of parameter estimates. Although SAC has been shown not to bias parameter estimates, it has been shown to decrease their precision and lead to biased variance estimates, inflating tests of significance and thus biasing model selection procedures (Lennon, 2000; Dormann *et al*., 2007; Beale *et al*., 2010). This model misspecification may partially explain the disagreement between SDM methods and has been hypothesized to reduce their transferability through space and time (Randin *et al*., 2006). Numerous methods have been proposed to correct for the adverse effects of spatial autocorrelation on SDMs (Dormann, 2007). Generally, the focus of this research has been on methods to improve parameter estimates and tests of significance (Dormann, 2007a), and less on assessing the transferability of these models and accurately estimating projection uncertainty.
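One common diagnostic for SAC in model residuals is Moran's I. The sketch below is a minimal illustration under stated assumptions: binary neighbour weights within a fixed distance (`bandwidth`), and a hypothetical function name `morans_i`. Values near zero suggest little spatial structure, positive values indicate that nearby residuals are similar, and negative values indicate that they are dissimilar.

```python
import numpy as np

def morans_i(values, coords, bandwidth):
    """Moran's I with binary distance-band weights (illustrative choice)."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = values.size
    # Pairwise Euclidean distances between sites.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Neighbours are distinct sites no further apart than `bandwidth`.
    w = ((dist > 0) & (dist <= bandwidth)).astype(float)
    z = values - values.mean()
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

# Toy 4 x 4 grid of sites with two artificial residual patterns.
xs, ys = np.meshgrid(np.arange(4), np.arange(4))
coords = np.column_stack([xs.ravel(), ys.ravel()])
gradient = xs.ravel().astype(float)                    # smooth trend
checker = np.where((xs + ys).ravel() % 2 == 0, 1.0, -1.0)  # alternating signs

i_gradient = morans_i(gradient, coords, bandwidth=1.0)
i_checker = morans_i(checker, coords, bandwidth=1.0)
```

Here the smooth trend gives a positive statistic and the alternating pattern a strongly negative one. In practice the weights matrix and the significance test would need to be chosen for the data at hand.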

Several notable attempts have been made to quantify SDM prediction uncertainty. Buckland & Elston (1993) demonstrate a non-parametric bootstrapping approach in which numerous models are fit to bootstrap resamples of the original data, resulting in maps indicating the proportion of iterations in which the species was predicted to be present. Hartley *et al*. (2006) present a Bayesian model averaging approach to estimating uncertainty. They fit a set of plausible models containing different covariates and calculate uncertainty by combining between-model and within-model variability. While these approaches provide a quantitative representation of uncertainty, they do not consider the bias induced by SAC on model selection and are unable to account for uncertainty due to important covariates not considered. Other authors have presented maps of uncertainty using Bayesian spatial regression approaches (Clements *et al*., 2006; Latimer *et al*., 2006; Finley *et al*., 2009a), but we are unaware of previous attempts to validate estimates of projection uncertainty using temporally independent data.
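A generic non-parametric bootstrap of the kind described above can be sketched as follows. This is not a reproduction of the cited procedure: the logistic regression fitted by Newton's method, the 0.5 threshold, the simulated data and all function names are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-3):
    """Fit a logistic regression by Newton's method (tiny ridge for stability)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X1 @ beta, -30, 30)
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = X1.T @ (y - p) - ridge * beta
        hess = (X1 * (p * (1 - p))[:, None]).T @ X1 + ridge * np.eye(X1.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_presence(beta, X, threshold=0.5):
    """Predict presence where the fitted probability reaches the threshold."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return (X1 @ beta) >= np.log(threshold / (1 - threshold))

def bootstrap_presence_frequency(X, y, X_new, n_boot=200, seed=0):
    """Proportion of bootstrap fits that predict presence at each new site."""
    rng = np.random.default_rng(seed)
    n = len(y)
    counts = np.zeros(len(X_new))
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample sites with replacement
        beta = fit_logistic(X[idx], y[idx])
        counts += predict_presence(beta, X_new)
    return counts / n_boot

# Simulated presence/absence data along one covariate (made-up example).
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(200, 1))
y = (rng.uniform(size=200) < 1 / (1 + np.exp(-2 * X[:, 0]))).astype(float)
X_new = np.array([[-2.0], [0.0], [2.0]])
freq = bootstrap_presence_frequency(X, y, X_new)
```

Each entry of `freq` is the proportion of iterations in which presence was predicted at that site. Because the resampling treats observations as independent, a sketch like this reflects sampling variability only and does not address residual SAC.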

Generalized linear mixed models (GLMMs) extend generalized linear models (GLMs) to include random effects capable of accounting for additional sources of uncertainty. To account for SAC, this random effect can be specified as a spatially structured random intercept, or spatial process term, interpreted as the effects of unobserved processes with spatial structure (Diggle *et al*., 1998). The spatially structured random intercept has intuitive appeal in that it is able to represent the greater confidence we have of finding a species when closer to a known presence location. The variance–covariance parameters of the random intercept control the magnitude, range and smoothness of the dependence in space, and are estimated during the model-fitting process. This avoids subjective modelling choices regarding the zone of spatial influence and allows its effect to be integrated into both parameter estimates and predictions. Spatial process GLMMs can be fit through the use of Bayesian hierarchical methods and Markov chain Monte Carlo (MCMC) techniques (Banerjee *et al*., 2004). Although computationally intensive, this methodology provides full access to the distributions of the model's parameters given the data, i.e., posterior distributions, and the posterior predictive distributions of the response variable at unobserved locations and/or times. Latimer *et al*. (2009) and Finley *et al*. (2009a) have explored some of the utility of spatial process GLMMs (hereafter referred to as GLMMs) to model species distributions, but their projections have yet to be validated against temporally independent data. If validation shows that GLMMs are able to account for the uncertainties in modelling species distributions through time, realistic mapping of uncertainty and statistical inference on predicted range changes should be possible.
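For concreteness, one common way to write such a model is given below. This is a sketch under stated assumptions rather than the specification used in this study: it assumes a binary presence/absence response with a logit link and a Matérn covariance function, chosen here because its three parameters correspond to magnitude ($\sigma^{2}$), range ($\phi$) and smoothness ($\nu$). Parameterizations of the Matérn function vary between sources.

```latex
\begin{aligned}
y(\mathbf{s}_i) \mid p(\mathbf{s}_i) &\sim \mathrm{Bernoulli}\bigl(p(\mathbf{s}_i)\bigr), \\
\operatorname{logit}\, p(\mathbf{s}_i) &= \mathbf{x}(\mathbf{s}_i)^{\top}\boldsymbol{\beta} + w(\mathbf{s}_i), \\
\bigl(w(\mathbf{s}_1), \dots, w(\mathbf{s}_n)\bigr)^{\top} &\sim \mathrm{MVN}\bigl(\mathbf{0},\, \boldsymbol{\Sigma}\bigr), \\
\Sigma_{ij} &= \sigma^{2}\, \frac{2^{1-\nu}}{\Gamma(\nu)}
  \left(\frac{d_{ij}}{\phi}\right)^{\nu} K_{\nu}\!\left(\frac{d_{ij}}{\phi}\right)
  \quad (i \neq j), \qquad \Sigma_{ii} = \sigma^{2}.
\end{aligned}
```

Here $y(\mathbf{s}_i)$ is the observation at location $\mathbf{s}_i$, $\mathbf{x}(\mathbf{s}_i)$ the covariates, $\boldsymbol{\beta}$ the regression coefficients, $w(\mathbf{s}_i)$ the spatially structured random intercept, $d_{ij}$ the distance between locations $i$ and $j$, and $K_{\nu}$ the modified Bessel function of the second kind. With $\nu = 1/2$ the covariance reduces to the exponential form $\sigma^{2}\exp(-d_{ij}/\phi)$. Setting $w(\mathbf{s}_i) = 0$ everywhere recovers the non-spatial GLM.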

In this study we compare the ability of GLMMs with a spatially structured random intercept and non-spatial GLMs to project species distributions. We fit a suite of models for historical observations of 99 woody plant species from California, USA, and use contemporary data to assess the accuracy of projections of these models and their ability to characterize projection uncertainty through time.