## Introduction

Species distribution models (SDMs; Elith & Leathwick 2009) are of fundamental importance to many aspects of biological and ecological sciences as well as to environmental management. SDMs quantify the relationship between the environment and a species' distribution. The environment is quantified using spatial climate variables, such as maximum or minimum temperature and temperature in the warmest month, amongst many others (Soria-Auza *et al*. 2010). These variables are often obtained by querying GIS databases. Example uses of an SDM are to predict a species' distribution across a study region (Pearson & Dawson 2003), or to project potential change in distribution under climate change scenarios (Forester, DeChaine & Bunn 2013; Wenger *et al*. 2013).

Most spatial climate data sets in use today have been developed using one of several interpolation techniques, which represent a mixture of general numerical methods and specific models. These include the following: inverse-distance weighting (Matheron 1971; Isaaks & Srivastava 1989); various forms of kriging (Phillips, Dolph & Marks 1992; Dodson & Marks 1997); tri-variate splines (Wahba & Wendelberger 1980; Cressie 2003; Hijmans *et al*. 2005; Xu & Hutchinson 2012); local regression (Daly 2006); and regional regression models (Goodale, Aber & Ollinger 1998; Johansson & Chen 2005; Ashcroft & Gollan 2012). These spatial climate data sets are estimates (or predictions) of the true spatial climate and are therefore subject to uncertainty, which itself can also have spatial structure with some regions consistently overestimated and others consistently underestimated (Fernández, Hamilton & Kueppers 2013). In this article, we use PRISM (Parameter–elevation Relationships on Independent Slopes Model) as an illustrative example. PRISM is a weighted, local regression technique that accounts for physiographic factors affecting spatial climate variations, and has been used extensively in the United States, Europe and Asia (Daly, Neilson & Phillips 1994; Daly *et al*. 2002; Daly, Helmer & Quinones 2003; Daly *et al*. 2008; Bishop & Beier 2013).

Even if the uncertainty arising from spatial climate variables can be estimated, there remain questions about how this information can be used in SDMs. Can uncertainty in climate variables be incorporated? If so, how? What happens if the uncertainty is ignored? What kind of change in predictions and/or inference should be expected if uncertainty is incorporated? How might extrapolation (for example, to a changed climate) behave under an uncertain model? This paper sets out to answer these questions.

Accounting for uncertainty in explanatory variables (through what is commonly referred to as *measurement error* models or *errors-in-variables* models) is a well-known and important topic in many applied fields, such as engineering and medical studies (Fuller 1987; Carroll *et al*. 2006). Uncertainty in explanatory variables has two main implications: bias in estimates of regression coefficients, and a loss of power (to determine whether explanatory variables are important). Carroll *et al*. (2006) refer to the combination of the two as the ‘double whammy’. Generally, more uncertainty in the explanatory variables induces more bias in the estimates of the model's parameters, which can have adverse consequences for model predictions too. Errors-in-variables models aim to avoid the ‘double whammy’ using one of a variety of statistical methods (Carroll *et al*. 2006). For these methods to be applicable, some *known* information on the uncertainty in the explanatory variables is required (e.g. the variance), which is usually obtained from the measuring device/procedure/model, from a validation data set, or from repeated measures. It is also critical that we specify the type of underlying error in the explanatory variables. In section ‘Classical vs. Berkson Errors’, we discuss two common types (classical and Berkson errors) in greater detail and highlight their implications for SDMs.
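The bias component of the ‘double whammy’ can be illustrated with a small simulation. The sketch below is not taken from this article: it assumes a simple linear regression with additive, independent, normally distributed (classical) error in a single explanatory variable, and the function name and all parameter values are hypothetical. Under those assumptions, the naive slope is expected to shrink towards zero by the factor σ²ₓ / (σ²ₓ + σ²ᵤ), where σ²ₓ is the variance of the true variable and σ²ᵤ is the error variance.

```python
import numpy as np


def simulate_attenuation(n=20000, beta=2.0, sigma_x=1.0, sigma_u=1.0, seed=0):
    """Compare the slope fitted on the true covariate with the slope
    fitted on an error-contaminated version of it (classical error).
    All names and values here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_x, n)              # true (unobserved) covariate
    w = x + rng.normal(0.0, sigma_u, n)          # observed covariate: truth plus error
    y = 1.0 + beta * x + rng.normal(0.0, 0.5, n)  # response depends on the truth

    slope_true = np.polyfit(x, y, 1)[0]   # slope if x could be observed
    slope_naive = np.polyfit(w, y, 1)[0]  # slope when the error in w is ignored

    # Theoretical large-sample value of the naive slope under these assumptions.
    reliability = sigma_x**2 / (sigma_x**2 + sigma_u**2)
    return slope_true, slope_naive, reliability * beta


slope_true, slope_naive, expected_naive = simulate_attenuation()
print(f"slope using true covariate:     {slope_true:.3f}")
print(f"slope using observed covariate: {slope_naive:.3f}")
print(f"theoretical attenuated slope:   {expected_naive:.3f}")
```

Increasing `sigma_u` in this sketch shrinks the naive slope further, which mirrors the statement that more uncertainty in the explanatory variables induces more bias.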

In the SDM context, several attempts have been made to either examine or account for uncertainty in spatial climate variables – for example: Elston *et al*. (1997) proposed an adjustment in regression coefficients; Foster, Shimadzu & Darnell (2012) used errors-in-variables models to account for explanatory variables that are overly smooth; Denham, Falk & Mengersen (2011) considered a conditional independence model in a hierarchical Bayesian framework using a Gibbs sampler where uncertainty in the explanatory variables was accounted for using a validation data set; McInerny & Purves (2011) investigated uncertainty in explanatory variables attributed to fine-scale environmental variation, and proposed a general correction for regression dilution (or attenuation) also based on Bayesian methods; Fernández, Hamilton & Kueppers (2013) examined the influence of interannual variability, topographic heterogeneity and the distance to nearest weather station; and Hefley *et al*. (2014) investigated the presence of location uncertainty in presence-only data.

We use two statistical errors-in-variables methods, both of which are well developed: (i) hierarchical modelling and (ii) simulation–extrapolation (SIMEX). Our methodology differs from, and complements, the existing approaches referenced above in the assumptions made about the underlying prediction process. We present a case study in which estimates of uncertainty in temperature variables are available via the PRISM software (Daly *et al*. 2008), and we relate them to the distribution of the Carolina wren *Thryothorus ludovicianus* in the United States. Additionally, we present simulation studies to investigate bias, efficiency and statistical power, and to examine how well SDMs predict and project to new scenarios when prediction error is ignored and when it is accounted for.
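For readers unfamiliar with SIMEX, the general idea can be sketched in a few lines. The code below is a generic textbook-style illustration, not the implementation used in this article: it assumes a linear regression with one explanatory variable, classical error with known standard deviation `sigma_u`, and a quadratic extrapolant, and the function name, grid of noise levels and simulated data are all hypothetical. Extra noise is added to the observed variable at increasing levels, the naive slope is re-estimated at each level, and the resulting trend is extrapolated back to the level corresponding to no measurement error (λ = −1).

```python
import numpy as np


def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), n_rep=50, seed=0):
    """Illustrative SIMEX estimate of a regression slope.
    Assumes classical error with known standard deviation sigma_u
    and uses a quadratic extrapolation (both are assumptions)."""
    rng = np.random.default_rng(seed)
    lam_grid = [0.0]
    slopes = [np.polyfit(w, y, 1)[0]]  # naive slope at lambda = 0

    # Simulation step: inflate the error variance to (1 + lambda) * sigma_u^2.
    for lam in lambdas:
        reps = [
            np.polyfit(w + rng.normal(0.0, np.sqrt(lam) * sigma_u, w.size), y, 1)[0]
            for _ in range(n_rep)
        ]
        lam_grid.append(lam)
        slopes.append(np.mean(reps))

    # Extrapolation step: fit a quadratic in lambda and evaluate at lambda = -1.
    coef = np.polyfit(lam_grid, slopes, 2)
    return np.polyval(coef, -1.0), slopes[0]


# Hypothetical data with a known true slope, for illustration only.
rng = np.random.default_rng(1)
n, beta, sigma_u = 5000, 2.0, 0.5
x = rng.normal(0.0, 1.0, n)                   # true covariate
w = x + rng.normal(0.0, sigma_u, n)           # observed with classical error
y = 1.0 + beta * x + rng.normal(0.0, 0.5, n)

simex_est, naive_est = simex_slope(w, y, sigma_u)
print(f"naive slope: {naive_est:.3f}, SIMEX slope: {simex_est:.3f}, true slope: {beta}")
```

In this setting the quadratic extrapolant is only an approximation to the true trend in λ, so the SIMEX estimate would be expected to reduce the bias of the naive slope rather than remove it entirely.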