Consequences of spatial autocorrelation for niche-based models

Authors


Pedro Segurado, Unidade de Macroecologia e Conservação, Universidade Évora, Estrada dos Leões, Antiga Fábrica das Massas Leões, 7000–730 Évora, Portugal (e-mail psegurado@uevora.pt).

Summary

  • 1. Spatial autocorrelation is an important source of bias in most spatial analyses. We explored the bias introduced by spatial autocorrelation on the explanatory and predictive power of species’ distribution models, and make recommendations for dealing with the problem.
  • 2. Analyses were based on the distribution of two species of freshwater turtle and two virtual species with simulated spatial structures within two equally sized areas located on the Iberian Peninsula. Sequential permutations of environmental variables were used to generate predictor variables that retained the spatial structure of the original variables. Univariate models of species’ distributions using generalized linear models (GLM), generalized additive models (GAM) and classification tree analysis (CTA) were fitted for each variable permutation. Variation of accuracy measures with spatial autocorrelation of the original predictor variables, as measured by Moran's I, was analysed and compared between models. The effects of systematic subsampling of the data set and the inclusion of a contagion term to deal with spatial autocorrelation in models were assessed with projections made with GLM, as it was with this method that estimates of significance based on randomizations were obtained.
  • 3. Spatial autocorrelation was shown to represent a serious problem for niche-based species’ distribution models. Significance values were found to be inflated up to 90-fold.
  • 4. In general, GAM and CTA performed better than GLM, although all three methods were vulnerable to the effects of spatial autocorrelation.
  • 5. The procedures utilized to reduce the effects of spatial autocorrelation had varying degrees of success. Subsampling was partially effective in avoiding the inflation effect, whereas the inclusion of a contagion term fully eliminated or even overcompensated for this effect. Direct estimation of probability using variable simulations was effective, yet seemed to show some residual spatial autocorrelation effects.
  • 6. Synthesis and applications. Given the expected inflation in the estimates of significance when analysing spatially autocorrelated variables, these need to be adjusted. The reliability and value of niche-based distribution models for management and other applied ecology purposes can be improved if certain techniques and procedures, such as the null model approach recommended in this study, are implemented during the model-building process.

Introduction

Niche-based models are familiar tools used to explain and predict species’ spatial distributions (Walker 1990; Pereira & Itami 1991; Austin et al. 1996; Manel et al. 1999). Recently, niche models have received increased attention, in part because of the need to predict species’ range shifts under future climate-change scenarios (Guisan & Theurillat 2000; Midgley et al. 2002; Peterson et al. 2002; Thomas et al. 2004; Thuiller et al. 2005; Araújo, Thuiller & Pearson 2006). However, there are a number of unresolved methodological issues requiring further enquiry. One example is the problem of non-independence between data used for calibration of models and that used for validation (Araújo et al. 2005a). Non-independence is often the result of using spatially autocorrelated data to calibrate and validate the models, and one of the consequences is that the perceived ability of models to make realistic predictions in space (Randin et al. 2006) and time (Araújo et al. 2005a) may be inflated. This problem may be greater than previously anticipated, as illustrated by studies showing high levels of intermodel variability in projections of species’ range shifts under climate change scenarios (Thuiller 2004; Thuiller et al. 2004; Araújo et al. 2005b; Araújo, Thuiller & Pearson 2006). We have addressed the biases in model predictions that arise from using different procedures of model adjustment and validation under varying levels of spatial dependencies among predictor and response variables.

Sample size is a crucial parameter in the outcome of classical hypothesis testing as it determines the necessary degrees of freedom for pattern detection. In spatial analyses the simple count of sample units is not always an adequate estimator of effective sample size. For example, if the values of a variable depend on the distance between sample points, a set of closely spaced observations effectively provides less information than the same number of observations more widely separated in space. Such spatial dependency between values is termed spatial autocorrelation (SA; Cliff & Ord 1973) and its causes and consequences have been the focus of much research (Legendre 1993; Koenig & Knops 1998; Lennon 2000; Dale & Fortin 2002; Fortin & Payette 2002; Legendre et al. 2002). SA leads to an overestimation of the effective sample size (i.e. pseudoreplication), inflating the statistical significance of measured spatial relationships and consequently increasing the likelihood of type I errors (false positives). There is a serious possibility that previous analyses that used correlative approaches might be flawed because of ‘red herrings’ generated by SA (Lennon 2000), with both the estimated predictive power and the choice of variables being seriously biased. Because of the potential importance of such biases, many methods have been developed to help account for SA within models, including a priori procedures at the level of sampling design (Harrison 1997; Legendre et al. 2002; Legendre et al. 2004), modifications at the level of model adjustment (Keitt et al. 2002; Lichstein et al. 2002) and a posteriori procedures, such as the use of correction factors, to improve statistical accuracy of models (Dutilleul 1993; Legendre et al. 2002).

Niche-based models use several alternative techniques to summarize relationships between species occurrences and environmental variation (Guisan & Zimmermann 2000; Segurado & Araújo 2004), usually in the context of spatial and temporal predictions. Other designations for this family of models can be found in the literature, such as habitat models (Guisan & Zimmermann 2000), species distribution models (Olden, Jackson & Peres-Neto 2002), bioclimatic envelope models (Pearson & Dawson 2003) and presence/absence models (Fielding & Bell 1997). Although authors acknowledge the importance of SA, they often disregard or minimize the extent to which the presence of spatially autocorrelated data might affect the explanatory power and predictive accuracy of models (i.e. the degree to which models calibrated on one data set represent observations in an independent set; for discussion see Araújo et al. 2005a; Randin et al. 2006). This could be a serious shortcoming in models as species’ occurrences tend to be aggregated at most spatial scales, and the more aggregated species’ occurrences are, the more likely it is that environmental variables will show some explanatory power simply because environmental conditions tend to be more similar at neighbouring sites. Indeed, the apparent strength of the correlation between variables has been shown to become more pronounced as SA grows stronger, whereas unbiased correlations are produced when at least one variable exhibits no SA (Lennon 2000).

Model generalization procedures (e.g. variable selection in stepwise logistic regression, pruning of classification trees and stopping rules in artificial neural networks) are commonly used to avoid overfitting to calibration data and are designed to increase the predictive power of models (Franklin 1998; Pearce & Ferrier 2000; Thuiller 2003). In the case of regression-based techniques, these procedures imply that an assessment of the explanatory power of variables is made. The problem is that the inflation of explanatory power for spatially autocorrelated variables makes them, a priori, disproportionately likely to be selected in the final models. This comes at the expense of selecting potentially more important variables with lower SA. Therefore, variable selection procedures can be an additional source of bias in model fitting.

Patterns of species’ distributions may be spatially autocorrelated because of contagious population dynamics and historical factors, but they may also be the result of spatial structure among environmental predictors (Storch et al. 2003). In fact, species and the environment may share spatial structure because of the effect of spatially structured environmental predictors and non-environmental contingencies that may or may not be related to one another (Borcard, Legendre & Drapeau 1992). If part of the spatial structure in the species’ data is shared by the environmental data, knowing the relative weight of each item that contributes to the observed spatial structure is an important challenge when testing causal hypotheses against data (Borcard, Legendre & Drapeau 1992; Storch et al. 2003). A common procedure to cancel the effect of spatial structure of species’ occurrences is to incorporate a term for SA into the analysis (Smith 1994; Augustin, Mugglestone & Buckland 1996; Araújo & Williams 2000; Keitt et al. 2002; Segurado & Araújo 2004), usually a measure of contagion that encompasses the effect of spatial neighbourhood in the statistical test.

An alternative procedure to avoid pseudoreplication is to subsample the original species’ distribution data, usually by adopting a systematic scheme that constrains observations to be spaced far enough from each other (Gates et al. 1994; Brito, Crespo & Paulo 1999; Guisan & Theurillat 2000). This method has the disadvantage of not using all the available information and thereby artificially limiting sample size, a procedure that may have serious consequences for the predictive performance of models (Araújo et al. 2005b).

The effect of SA on correlation and linear regression significance values has been tested elsewhere using artificially generated variables with known spatial structures (Lennon 2000). In the context of niche-based modelling the extent to which SA in the response and predictor variables influences model performance is poorly known. In particular, the effect of SA in the validation data set used to estimate predictive power of models has never been assessed. In this study the effects of SA in species’ distribution models were quantified using a null model approach with test variables generated from randomizations of spatially structured environmental data. First, the overall effect of SA on measures of model performance was assessed. Secondly, we explored whether different modelling techniques, differing in the response functions used, were equally sensitive to the effect of SA. We also quantified the effect of spatial autocorrelation in the predictive ability of covariates entering models of species’ distributions using both resubstitution and data set partitioning (Olden, Jackson & Peres-Neto 2002; Araújo et al. 2005a). Finally, the effectiveness of different approaches to reduce undesirable effects of SA was evaluated.

Materials and methods

data

We used the distributions of two species of freshwater turtles, the Mediterranean pond turtle Mauremys leprosa (Schweiger, 1812) and the European pond turtle Emys orbicularis (Linnaeus, 1758), in two equally sized rectangular areas on the Iberian Peninsula (Fig. 1). In each rectangular area, data on species’ distributions were located in 66 × 21 universal transverse mercator (UTM) 10 × 10-km grid cells. The main criteria for delimitation of the two rectangular areas were to maximize (i) the geographical extent of the rectangles and (ii) the geographical distance between them, in order to ensure the greatest feasible amount of information while retaining high levels of spatial independence between rectangles. Distribution data were compiled from four different sources: the updated atlas of the Portuguese herpetofauna (Godinho et al. 1999), the atlas and Red Data Book of the amphibians and reptiles of Spain (Keller & Andreu 2002; da Silva 2002), the UNIBA database (Alentejo's Biodiversity Database Unit; http://www.cea.uevora.pt/umc, 2006) and P. Segurado (unpublished data). The spatial structure of the two species’ distributions in the study region is distinct: E. orbicularis is found in fewer grid cells and is more widely scattered than M. leprosa, which has a more clumped distribution. There are also regional differences in occupancy patterns: in the western rectangle occurrences of M. leprosa are more densely distributed and the occurrences of E. orbicularis are slightly more scattered than in the eastern rectangle. Simulated distributions of two species with distinctive spatial structures were also generated from the M. leprosa database, creating a random distribution and a clumped distribution. The random distribution was obtained by randomly assigning the position of species’ presences, while the clumped distribution was obtained by joining all species’ presences into a single contiguous block. The species’ distributions, as well as the simulated distributions, were used as response variables in the analyses.

Figure 1.

Location of the rectangular areas used in the analysis (10-km linear resolution).

Environmental data included climatic and topographic information (Table 1) resampled at the same grid resolution as the species’ occurrence data. There is a danger that species might respond indirectly to topography, which would limit the models’ predictive powers. However, topographic variables have the potential to summarize important surrogate predictor variables, such as habitat availability for freshwater turtles, that are not captured by the available variables. For example, both species show a preference for still and slow-moving water habitats and therefore their distributions might respond directly to slope. Climate data included 11 variables compiled from point data with 10-minute resolution (New, Hulme & Jones 2000). A randomly generated predictor variable showing negligible SA was also included in the analyses. The two rectangular areas did not differ considerably in their environmental range, although the western rectangle included a slightly wider gradient range for some variables.

Table 1.  Variables included in the analysis, Moran's I-values for the original variables and mean Moran's I-values for the 1000 toroidal permutations (SD values inside parentheses)

Variable | Eastern area: original Moran's I | Eastern area: mean Moran's I (SD) | Western area: original Moran's I | Western area: mean Moran's I (SD)
Mauremys leprosa | 0·416 | | 0·543 |
Emys orbicularis | 0·386 | | 0·264 |
Species with a clumped distribution | 0·959 | | 0·975 |
Species with a random distribution | −0·016 | | 0·009 |
Predictor variables | | | |
Altitude, mean (Alt) | 0·817 | 0·764 (0·012) | 0·882 | 0·819 (0·019)
Altitude, minimum (Altmin) | 0·843 | 0·796 (0·013) | 0·847 | 0·775 (0·024)
Altitude, maximum (Altmax) | 0·784 | 0·739 (0·011) | 0·889 | 0·837 (0·015)
Slope, mean (Slope) | 0·733 | 0·729 (0·011) | 0·832 | 0·787 (0·012)
Hill shade, mean (Hill) | 0·292 | 0·284 (0·007) | 0·181 | 0·171 (0·007)
Hill shade, minimum (Hillmin) | 0·657 | 0·658 (0·008) | 0·752 | 0·716 (0·013)
Hill shade, maximum (Hillmax) | 0·626 | 0·623 (0·006) | 0·716 | 0·681 (0·012)
Mean annual temperature (Tann) | 0·973 | 0·943 (0·016) | 0·981 | 0·937 (0·013)
Mean temperature of the coldest month (Mtc) | 0·945 | 0·929 (0·014) | 0·971 | 0·909 (0·015)
Mean temperature of the warmest month (Mtw) | 0·987 | 0·952 (0·016) | 0·981 | 0·943 (0·014)
Mean annual growing degree days (Gdd) | 0·975 | 0·944 (0·017) | 0·983 | 0·938 (0·012)
Mean annual global net radiation (Rann) | 0·957 | 0·906 (0·022) | 0·958 | 0·861 (0·016)
Mean annual evapotranspiration/potential evapotranspiration (A2P) | 0·950 | 0·915 (0·017) | 0·975 | 0·914 (0·011)
Mean annual precipitation sum (Pann) | 0·901 | 0·877 (0·023) | 0·968 | 0·908 (0·023)
Mean winter precipitation sum (Pwin) | 0·895 | 0·890 (0·021) | 0·966 | 0·898 (0·025)
Mean summer precipitation sum (Psum) | 0·924 | 0·879 (0·020) | 0·971 | 0·932 (0·016)
Mean spring precipitation sum (Pspr) | 0·916 | 0·872 (0·018) | 0·966 | 0·914 (0·020)
Mean autumn precipitation sum (Paut) | 0·888 | 0·879 (0·021) | 0·968 | 0·900 (0·024)
Random environment | 0·001 | 0·000 (0·003) | −0·018 | −0·014 (0·003)

pattern generation

The effect of SA was evaluated by generating simulated patterns with known and fixed spatial structures. There are two main categories of such pattern generation: (i) fully synthetic patterns, generated purely from mathematical principles such as the method based on the inverse discrete Fourier transform (Lennon 2000); and (ii) patterns generated from real data using restricted or sequential permutations of real patterns (Fortin & Jacquez 2000) or using more elaborate approaches such as the random patterns implemented by Roxburgh & Chesson (1998) and the patch model proposed by Watkins & Wilson (1992).

In this study we employed sequential permutations of environmental variables based on toroidal shifts (Palmer & Van der Maarel 1995; Fortin & Jacquez 2000; Dale & Fortin 2002; Fortin & Payette 2002; Storch et al. 2003) to generate patterns from the original environmental variables. With this randomization technique, coordinates of the original variable are moved by a common random factor in every geographical direction; cells that are shifted beyond one side of the range of real coordinates are moved to the opposite side of the range. This randomization ensures that the main spatial structure is maintained. This technique is more straightforward, easier to implement and computationally more efficient than alternative methods. Its main drawback is that it can create unrealistic environmental patterns with abrupt orthogonal lines where the shifted edges meet. We assumed that this feature would not have an effect on the analysis because the existence of linear edges in species’ distributions is unlikely. As toroidal shifts can be too liberal (Fortin & Jacquez 2000), an image reflection and a 180-degree rotation were initially performed for each variable. This procedure ensured that even toroids involving only small shifts would differ substantially from the original pattern, thus making the pattern simulation more conservative.
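The toroidal shift described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the modified Splancs function used in the study: the grid is held as a list of rows, the axis of the initial reflection is assumed to be left-right (the text does not say), and the function names are hypothetical.

```python
import random


def reflect_and_rotate(grid):
    """Mirror the grid left-right, then rotate it by 180 degrees.

    Assumption: the reflection axis is left-right; the source does not
    state which axis was used.
    """
    mirrored = [row[::-1] for row in grid]
    # A 180-degree rotation reverses the row order and each row.
    return [row[::-1] for row in mirrored[::-1]]


def toroidal_shift(grid, d_row, d_col):
    """Move every cell by (d_row, d_col), wrapping around the edges."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [
        [grid[(r - d_row) % n_rows][(c - d_col) % n_cols] for c in range(n_cols)]
        for r in range(n_rows)
    ]


def random_null_surface(grid, rng):
    """Reflect and rotate once, then apply one random toroidal shift."""
    base = reflect_and_rotate(grid)
    d_row = rng.randrange(len(grid))
    d_col = rng.randrange(len(grid[0]))
    return toroidal_shift(base, d_row, d_col)


if __name__ == "__main__":
    surface = [[1, 2, 3], [4, 5, 6]]
    print(toroidal_shift(surface, 0, 1))
    print(random_null_surface(surface, random.Random(1)))
```

Because every cell is moved by the same offset, the set of values and their relative positions are preserved except along the wrapped edges, which is where the abrupt lines mentioned above arise.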

Moran's I statistic was used to estimate general patterns of spatial dependency of variables. In order to evaluate the degree to which the original spatial structure was maintained after randomizations, 1000 toroidal shifts were run and for each permutation the Moran's I was calculated. The distribution of the resulting values was compared with the Moran's I-value of the original variables.
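For readers unfamiliar with the statistic, a minimal sketch of Moran's I on a regular grid follows. The neighbourhood definition is an assumption: binary weights over the eight adjacent cells (queen contiguity) are used here, whereas the text does not state which weights were used in the study. The function name is hypothetical.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid, given as a list of rows.

    Assumption: binary weights, w_ij = 1 for the eight adjacent cells
    (queen contiguity) and 0 otherwise.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)
    if sum_sq == 0:
        raise ValueError("Moran's I is undefined for a constant surface")

    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    cross_products = 0.0
    weight_sum = 0
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    cross_products += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    weight_sum += 1
    return (n / weight_sum) * cross_products / sum_sq


if __name__ == "__main__":
    blocks = [[0, 0, 1, 1] for _ in range(4)]          # two contiguous halves
    checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
    print(morans_i(blocks), morans_i(checker))
```

Under these weights a surface made of two contiguous blocks gives a positive value and a checkerboard gives a negative one, which matches the usual reading of the index as a measure of similarity between neighbouring cells.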

assessing the effect of spatial autocorrelation

Three modelling techniques, differing in their ability to model complex response shapes, were used to relate species’ distributions to each of the 1000 toroidal shifts of the environmental variables: (i) generalized linear models, (ii) generalized additive models and (iii) classification tree analysis. Generalized linear models (GLM; McCullagh & Nelder 1983) are generalizations of the classical linear regression allowing error distributions other than the normal distribution; here a binomial error distribution was assumed (logistic regression). Generalized additive models (GAM; Hastie & Tibshirani 1990) are semi-parametric forms of GLM that use smooth functions instead of the usual regression coefficients. GAM were fitted using cubic splines as the smooth function and assuming a binomial error distribution. Classification tree analysis (CTA; Clark & Pregibon 1992) is a non-parametric technique that is based on recursive partitions of the dimensional space defined by the predictor variables into groups that are as homogeneous as possible for the response variable. We used a recursive algorithm that successively splits the data into binary branches by choosing the splits that cause the maximum reduction of the residual deviance.

For each permutation, the model classification accuracy was measured by calculating the receiver operating characteristic (ROC) curve and computing the area under that curve (AUC; Fielding & Bell 1997). The AUC assesses whether model predictions differ from those expected by chance, varying from 0·5 (random classification) to 1 (perfect classification). GLM performance was also measured using the likelihood ratio test statistic (LRS; Hosmer & Lemeshow 1989), which corresponds to the reduction of model residual deviance in relation to null model deviance.
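As an illustration of the accuracy measure, the AUC can be computed directly as the proportion of presence/absence pairs in which the presence cell receives the higher predicted value, with ties counted as one half. The sketch below uses this pairwise form rather than tracing the ROC curve; the two are equivalent, but the function name and the pairwise implementation are choices made for this example, not the routine used in the study.

```python
def auc(presence, scores):
    """Area under the ROC curve from observed 0/1 values and model scores.

    Computed as the fraction of (presence, absence) pairs ranked correctly,
    counting tied scores as half a correct ranking.
    """
    pos = [s for y, s in zip(presence, scores) if y == 1]
    neg = [s for y, s in zip(presence, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one presence and one absence")

    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))


if __name__ == "__main__":
    observed = [0, 0, 1, 1]
    print(auc(observed, [0.1, 0.2, 0.8, 0.9]))   # perfect ranking
    print(auc(observed, [0.5, 0.5, 0.5, 0.5]))   # uninformative scores
```

The pairwise loop is quadratic in the number of cells, which is acceptable for grids of the size used here (66 × 21 cells per rectangle).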

Models were calibrated on both rectangles and model accuracy was measured using the whole calibration set (i.e. resubstitution). Accuracy (AUC) of GLM was also measured by application to the second rectangular area, which was interpreted as providing an independent validation. Explanatory power was expressed by measures of model accuracy using the calibration set, while predictive power was expressed by measures of model accuracy using the validation set.

The overall effect of SA was assessed by exploring the relationship between the 95th percentile and the standard deviation of measures of accuracy for each null pattern set with Moran's I of the original predictor variables. The variation of the 95th percentile was analysed because it represents a common threshold in most statistical hypothesis testing. This parameter is expected to be inflated by SA.
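The summary step above amounts to reducing each set of 1000 null accuracies to two numbers and then relating the 95th percentiles to Moran's I across variables. A minimal sketch follows; the function names are hypothetical, the percentile interpolation rule is an assumption (the text does not specify one), and the ordinary least-squares slope stands in for the linear fits shown in the figures.

```python
import statistics


def summarize_null(accuracies):
    """Return (95th percentile, standard deviation) of null accuracies.

    Assumption: the 'inclusive' interpolation rule of statistics.quantiles.
    """
    cut_points = statistics.quantiles(accuracies, n=100, method="inclusive")
    return cut_points[94], statistics.stdev(accuracies)


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


if __name__ == "__main__":
    # Made-up null AUC values for one variable, for illustration only.
    null_aucs = [0.50 + 0.001 * i for i in range(100)]
    p95, sd = summarize_null(null_aucs)
    print(round(p95, 4), round(sd, 4))
```

A positive slope of the 95th percentile against Moran's I is the pattern that would indicate inflation of the accuracy threshold by SA.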

In GLM, the LRS is assumed to be chi-square distributed. It is therefore possible, for each run of 1000 permutations, to count the number of LRS tests on the calibration set that exceed the critical value for a chosen type I significance level. The ratio between this count and the expected number of significant tests according to the significance level (e.g. 50 out of 1000 for P = 0·05) expresses the type I inflation ratio (Lennon 2000). Here, a significance level of 0·01 was considered in order to compare with other results found in the literature.
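The inflation ratio can be illustrated with a short sketch. It assumes that each univariate GLM adds a single parameter to the null model, so that the LRS is compared with a chi-square distribution with one degree of freedom; the function name and the hard-coded critical values are choices made for this example.

```python
# Chi-square critical values for 1 degree of freedom.
# Assumption: one slope parameter per univariate GLM, hence df = 1.
CHI2_CRITICAL_DF1 = {0.05: 3.841, 0.01: 6.635}


def inflation_ratio(lrs_values, alpha=0.01):
    """Observed number of significant LRS tests divided by the expected number."""
    critical = CHI2_CRITICAL_DF1[alpha]
    observed = sum(1 for v in lrs_values if v > critical)
    expected = alpha * len(lrs_values)
    return observed / expected


if __name__ == "__main__":
    # Made-up LRS values: 100 of 1000 permutations exceed the 0.01 threshold.
    lrs = [7.0] * 100 + [1.0] * 900
    print(inflation_ratio(lrs, alpha=0.01))
```

Note that the ratio has a ceiling of 1/alpha: at a significance level of 0·01 it cannot exceed 100, which is reached only if every permutation yields a significant test.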

All data analyses were performed with S-PLUS 2000 (Statistical Sciences 1999) using the default functions for model adjustments. Random toroidal shifts were performed using a modified function from the Splancs library of S-PLUS (Rowlingson & Diggle 1993).

dealing with spatial autocorrelation

Two methods that address the effects of SA were compared using toroidal shifts and the procedures described above. The first method consisted of subsampling the original data set by eliminating cells in a systematic manner. All cells with even coordinates were eliminated from the original data sets. The second method included the incorporation of an autocovariate term accounting for the SA of observations. Autologistic models use contagion as an autocovariate term in the logistic regression equation. The measure of contagion was based on a two-order neighbourhood, as the weighted average of the number of occupied grid cells among a set of $k_a$ neighbours of a central grid cell $y_a$, so that:

$$\mathrm{autocov}_a = \frac{\sum_{b=1}^{k_a} w_{ab}\, y_b}{\sum_{b=1}^{k_a} w_{ab}} \qquad \text{(eqn 1)}$$

where the weight given to the grid cell $y_b$ is $w_{ab} = 1/d_{ab}$, and $d_{ab}$ is the distance between grid cells $y_a$ and $y_b$. Two orders of neighbours were used, assigning a distance of $d = 1$ to the first-order and $d = 2$ to the second-order neighbours. Neighbours in the first order were the eight adjacent cells touching the central cell along the edges and at the corners within a rectangular grid. The second-order neighbours were the next group of cells concentric to the first order, with 16 grid cells.
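A minimal sketch of the contagion term follows, using the weighted-average form and the two rings of neighbours described above (eight cells at distance 1, sixteen cells at distance 2). The treatment of cells near the edge of the rectangle is an assumption: neighbours falling outside the grid are simply left out of both sums. The function name is hypothetical.

```python
def autocovariate(occupancy, row, col):
    """Distance-weighted mean occupancy of the neighbours of one cell.

    occupancy: list of rows of 0/1 values.
    Neighbours: the 8 first-order cells (d = 1) and the 16 second-order
    cells (d = 2), each weighted by 1/d.
    Assumption: neighbours outside the grid are ignored.
    """
    n_rows, n_cols = len(occupancy), len(occupancy[0])
    weighted_sum = 0.0
    weight_total = 0.0
    for dr in range(-2, 3):
        for dc in range(-2, 3):
            if dr == 0 and dc == 0:
                continue  # the central cell is not its own neighbour
            rr, cc = row + dr, col + dc
            if not (0 <= rr < n_rows and 0 <= cc < n_cols):
                continue
            order = max(abs(dr), abs(dc))  # 1 = first ring, 2 = second ring
            weight = 1.0 / order
            weighted_sum += weight * occupancy[rr][cc]
            weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


if __name__ == "__main__":
    grid = [[0] * 5 for _ in range(5)]
    grid[1][2] = 1  # one occupied first-order neighbour of the centre cell
    print(autocovariate(grid, 2, 2))
```

With a full neighbourhood the total weight is 8 × 1 + 16 × 0·5 = 16, so a single occupied first-order neighbour contributes 1/16 and a single occupied second-order neighbour contributes 1/32.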

These two methods were assessed for their ability to produce unbiased estimates of the explanatory power as measured by LRS. The contribution of contagion terms was removed from the LRS estimates, in order to assess exclusively the contribution of each variable to the explanatory power. This assessment was based on the distribution of M. leprosa only. The predictive power of models was not assessed using the autologistic approach because the spatial autocovariate term is a function of the calibration data and cannot be predicted for independent validation data.

Finally, corrected univariate variable significances were estimated for distribution models of M. leprosa and E. orbicularis. A Monte Carlo simulation approach was used to compare the AUC computed using the original predictor variables with the test statistics generated from 1000 toroidal shifts. Significance (probability of rejecting a true null hypothesis) was defined as the fraction of the AUC statistics of the 1000 simulated variables that fell above the AUC statistics of the original variable.
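The Monte Carlo significance defined above reduces to a simple proportion. The sketch below follows the definition in the text (fraction of simulated AUC values strictly above the observed one); the function name is hypothetical and the numbers in the usage example are invented for illustration.

```python
def monte_carlo_p(observed_auc, null_aucs):
    """Fraction of null-model AUC values that exceed the observed AUC."""
    if not null_aucs:
        raise ValueError("at least one simulated value is required")
    above = sum(1 for a in null_aucs if a > observed_auc)
    return above / len(null_aucs)


if __name__ == "__main__":
    # Invented values: one of four simulated surfaces beats the original.
    print(monte_carlo_p(0.80, [0.50, 0.60, 0.70, 0.90]))
```

With 1000 toroidal shifts the smallest non-zero value this estimate can take is 0·001, so the resolution of the significance values is set by the number of permutations.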

Results

spatial autocorrelation of variables

Spatial dependency is stronger for M. leprosa than for E. orbicularis, as shown by the overall SA of variables measured by the Moran's I index (Table 1). The SA of M. leprosa occurrence is more pronounced in the western than in the eastern rectangle, whereas that of E. orbicularis is less pronounced in the western rectangle. Among environmental variables, the topographical descriptors are generally less spatially autocorrelated than climatic variables. Hill shade variables have the lowest Moran's I-values. For most environmental variables, with the exception of mean hill shade and mean temperature of the warmest month, the western rectangle shows slightly higher SA values than the eastern rectangle.

The mean Moran's I-values for each set of 1000 toroidal shifts of the environmental variables tend to be slightly lower than the original Moran's I, probably because of the effect of the shifted edges, which can slightly disturb the spatial structure. However, toroidal displacement did not significantly change the relative order of SA between variables. Moreover, the variability of values among the 1000 simulations, as measured by the standard deviation, was reasonably low, representing a small fraction of the mean values (SD/mean ranging from 0·010 to 0·042), which means that spatial structure was maintained among permutations (Table 1).

effect on the explanatory power of variables

Amongst sets of 1000 toroidal permutations the 95th percentile of the model accuracy measures increased as the SA of the predictor variables increased (Fig. 2). This effect was more pronounced for species’ distributions that were more spatially autocorrelated and was eliminated in a simulated random species distribution. The 95th percentile of the AUC distribution of the 1000 toroidal shifts per variable increased in an approximately linear fashion with SA for each modelling approach (Fig. 2). In the western rectangle the variation of AUC values with SA of predictor variables was more pronounced than in the eastern rectangle for M. leprosa and less pronounced for E. orbicularis (Fig. 2).

Figure 2.

Variation of the 95th percentile of AUC with Moran's I of environmental variables using three modelling techniques (GLM, GAM and CTA), four distributions (M. leprosa, E. orbicularis, a simulated distribution with totally clumped occurrences and a random distribution) and 1000 simulated surfaces (lines represent linear fits).

There were slight differences in the effect of SA for GAM and CTA compared with GLM, as shown by the variation of the 95th percentile of AUC values with Moran's I of the environmental variables (Fig. 2). The variation of AUC 95th percentiles was less marked for GAM and CTA than for GLM; this was particularly evident for species with more autocorrelated distributions. This weaker variation arose because GAM and CTA performed better than GLM for species’ distributions and predictor variables with reduced SA, while showing comparable performances for those with higher SA (Fig. 2). CTA tended to perform better than GAM, especially for the less autocorrelated distributions. These results were consistent between rectangles.

The 95th percentile of the LRS distribution of GLM models using variable permutations was more sensitive than AUC to the increase of Moran's I. This parameter tended to increase in an exponential fashion with SA (Fig. 3).

Figure 3.

Relationship between spatial autocorrelation and the 95th percentiles of LRS statistics for GLM, regressing each simulated surface with presence/absence data of M. leprosa, E. orbicularis, a simulated distribution with totally clumped occurrences and a random distribution.

effect on the predictive power of variables

The effect of the increase in Moran's I using an independent validation data set to evaluate the predictive power of GLM models was as strong as using the calibration set. The 95th percentile of deviance also tended to increase with SA in a linear form (Fig. 4). This trend was more marked for models of M. leprosa calibrated in the eastern rectangle and for models of E. orbicularis calibrated in the western rectangle.

Figure 4.

Model validation: relationship between Moran's I for the validation data set and the 95th percentiles of the models' AUC, for each set of 1000 permutations of the original variables. Lines represent linear fits. (a) M. leprosa, eastern rectangle model validated with the western rectangle data set; (b) M. leprosa, western rectangle model validated with the eastern rectangle data set; (c) E. orbicularis, eastern rectangle model validated with the western rectangle data set; (d) E. orbicularis, western rectangle model validated with the eastern rectangle data set.

inflation ratios

Inflation ratios increased with Moran's I-values of the environmental variables (Fig. 5), especially at lower Moran's I-values. Thus there was a substantial inflation of the apparent predictive power of analyses using variables with even a modest amount of SA. Predictor variable significance could be inflated by a factor of up to 90 for the clumped distribution. At higher Moran's I-values, inflation ratios tended to stabilize (Fig. 5). Inflation ratios also increased where species’ distributions displayed increased SA; indeed, in the absence of SA in the distribution (Fig. 5, random distributions) even highly autocorrelated environments did not cause inflated predictive power estimates. Overall, the higher the level of SA in both species’ distributions and environmental variables used, the higher the inflation ratios. Consequently, it was unsurprising to find that inflation ratios for M. leprosa were more pronounced in the western rectangle (mean inflation ratio of 75·3 vs. 60·2 for the eastern rectangle) while those for E. orbicularis were more pronounced in the eastern rectangle (mean inflation ratio of 40·4 vs. 38·3 for the western rectangle). The observation that random distributions did not suffer variations in the inflation ratio supported the interpretation that it was SA rather than alternative unmeasured factors that caused inflation of the significance estimates in the models.

Figure 5.

Variation of the inflation ratio (number of times a significant test was found in relation to the expected number of significant tests, according to the significance level adopted, in this case P < 0·01) with the Moran's I-values of the original variables. Dashed lines are spline fits. (a) Eastern rectangle; (b) western rectangle.

dealing with spatial autocorrelation

Comparison of two methods for dealing with SA within models showed that including an autocovariate term in the regression was more effective than systematically subsampling the area (Fig. 6). Indeed, the autologistic procedure may slightly overcompensate. For example, in the eastern rectangle there was a slight decrease in the location measure of the LRS test significances for an increase of Moran's I (Fig. 6a). The subsampling procedure was only partially effective in avoiding the inflation effect of model performance because of SA. The increase of LRS test significance with an increase of Moran's I was less pronounced using this procedure (Fig. 6; note that the axes have different ranges).

Figure 6.

Relationship between spatial autocorrelation and the 95th percentiles of the LRS tests generated by regressing each simulated surface with presence/absence data of M. leprosa. Comparison between results using the original M. leprosa data on presence/absence and two different procedures usually adopted to avoid effects of spatial autocorrelation: subsampling of the original data set and forcing the inclusion of a contagion term in the model. Lines represent exponential fits. (a) Eastern rectangle; (b) western rectangle.

When surface permutations were used to produce Monte Carlo significance values of AUC accounting for SA, no obvious relationship with SA was observed, which suggested that unbiased estimates were produced. However, the most significant variables had consistently higher Moran's I-values, especially for M. leprosa (Table 2). Variables tended to have greater explanatory power for M. leprosa than for E. orbicularis and in the western rectangle there were more variables with significant (P < 0·05) effects (e.g. for GLM 16 significant variables for M. leprosa and 13 for E. orbicularis) than there were in the eastern rectangle (11 significant variables for M. leprosa and two for E. orbicularis). The majority of significant variables for M. leprosa distribution were common to the three modelling approaches. In the eastern rectangle nine significant variables were common and two uncommon, while in the western rectangle 15 significant variables were common and three uncommon. For M. leprosa seven significant variables were simultaneously common to both rectangles and the three modelling approaches. For E. orbicularis in the eastern rectangle, only one significant variable was common to the three modelling approaches and three variables were uncommon, while in the western rectangle five significant variables were common and eight uncommon. No significant variables were simultaneously common to both rectangles and the three modelling approaches for E. orbicularis.

Table 2. Significance of each predictor variable using a Monte Carlo simulation approach (*P < 0·05; **P < 0·01). Variables are displayed in increasing order of autocorrelation as measured by Moran's I
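(a) Eastern rectangle

| Variable | M. leprosa GLM | M. leprosa GAM | M. leprosa TREE | E. orbicularis GLM | E. orbicularis GAM | E. orbicularis TREE |
| --- | --- | --- | --- | --- | --- | --- |
| Random | 0·753 | 0·596 | 0·936 | 0·582 | 0·986 | 0·977 |
| Hill | 0·013* | 0·201 | 0·283 | 0·004** | 0·011* | 0·044* |
| Hillmax | 0·717 | 0·928 | 0·937 | 0·283 | 0·418 | 0·028* |
| Hillmin | 0·959 | 0·869 | 0·922 | 0·997 | 0·369 | 0·003** |
| Slope | 0·988 | 0·852 | 0·795 | 0·613 | 0·412 | 0·552 |
| Altmax | 0·153 | 0·313 | 0·319 | 0·454 | 0·773 | 0·847 |
| Alt | 0·017* | 0·059 | 0·077 | 0·160 | 0·419 | 0·548 |
| Altmin | 0·018* | < 0·001 | 0·012* | 0·164 | 0·137 | 0·389 |
| Paut | 0·957 | 0·357 | 0·351 | 0·982 | 0·157 | 0·072 |
| Pwin | 0·940 | 0·412 | 0·557 | 0·988 | 0·122 | 0·273 |
| Pann | 0·185 | 0·418 | 0·366 | 0·724 | 0·698 | 0·791 |
| Pspr | < 0·001 | < 0·001 | < 0·001 | 0·134 | 0·320 | 0·440 |
| Psum | < 0·001 | < 0·001 | < 0·001 | 0·077 | 0·289 | 0·358 |
| Mtc | < 0·001 | < 0·001 | < 0·001 | 0·107 | 0·368 | 0·638 |
| A2P | 0·005** | 0·018* | 0·048* | 0·235 | 0·678 | 0·436 |
| Rann | < 0·001 | < 0·001 | 0·008* | 0·140 | 0·790 | 0·460 |
| Tann | < 0·001 | 0·003** | < 0·001 | 0·091 | 0·173 | 0·340 |
| Gdd | < 0·001 | < 0·001 | < 0·001 | 0·090 | 0·149 | 0·289 |
| Mtw | < 0·001 | 0·001** | 0·013* | 0·042* | 0·191 | 0·421 |

(b) Western rectangle

| Variable | M. leprosa GLM | M. leprosa GAM | M. leprosa TREE | E. orbicularis GLM | E. orbicularis GAM | E. orbicularis TREE |
| --- | --- | --- | --- | --- | --- | --- |
| Random | 0·426 | 0·383 | 0·204 | 0·882 | 0·982 | 0·982 |
| Hill | 0·013* | 0·022* | 0·049* | 0·410 | 0·091 | 0·551 |
| Hillmax | 0·036* | 0·033* | 0·023* | 0·044* | 0·035* | 0·242 |
| Hillmin | 0·013* | 0·008** | 0·008** | 0·004** | 0·021* | 0·019* |
| Slope | 0·015* | 0·020* | 0·041* | 0·034* | 0·070 | 0·033* |
| Altmin | 0·109 | 0·050 | 0·048* | 0·551 | 0·840 | 0·857 |
| Alt | 0·068 | 0·041* | 0·062 | 0·285 | 0·333 | 0·289 |
| Altmax | 0·028* | 0·017* | 0·009** | 0·146 | 0·159 | 0·132 |
| Rann | < 0·001 | < 0·001 | < 0·001 | < 0·001 | < 0·001 | < 0·001 |
| Pwin | 0·001** | < 0·001 | < 0·001 | 0·011* | 0·006** | 0·018* |
| Pspr | 0·002** | 0·002** | 0·010* | 0·008** | 0·024* | 0·010* |
| Paut | 0·009** | < 0·001 | < 0·001 | 0·040* | 0·027* | 0·037* |
| Pann | 0·007** | 0·007** | < 0·001 | 0·012* | 0·053 | 0·113 |
| Psum | 0·005** | 0·006** | 0·002** | 0·020* | 0·231 | 0·289 |
| Mtc | 0·045* | 0·110 | 0·078 | 0·252 | 0·459 | 0·568 |
| A2P | 0·005** | 0·001** | < 0·001 | 0·006** | 0·071 | 0·038* |
| Tann | 0·006** | 0·004** | < 0·001 | 0·012* | 0·033* | 0·078 |
| Mtw | < 0·001 | < 0·001 | < 0·001 | < 0·001 | 0·060 | 0·001** |
| Gdd | 0·002** | 0·001** | 0·012* | 0·006** | 0·029* | 0·169 |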

Discussion

In this study we used a simple and straightforward method of pattern simulation to explore the effect of SA on niche-based models. The results reinforce the idea that conclusions from niche-based models could be compromised because of the autocorrelated nature of both predictor variables and species’ occurrences (Lennon 2000; Hampe 2004).

When plotted against a measure of SA (Moran's I), model performance showed clear trends. For example, we found that the log-likelihood ratio statistic needed to reach significance in GLM increased exponentially as the SA of the predictor variables became stronger. This means that even slight changes in the degree of SA have a strong effect on the probability of a predictor variable being chosen as significant. The same trend was observed with measures of the models’ predictive accuracy (here AUC), although this measure was less sensitive to SA in both the response and predictor variables.
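To make the test statistic concrete, the following sketch computes a log-likelihood ratio statistic and its nominal P-value for a univariate model compared with an intercept-only model. It assumes a single degree of freedom and the usual chi-square approximation, which is exactly the reference distribution that SA renders too lenient. The function names and log-likelihood values are invented for illustration.

```python
import math


def likelihood_ratio_statistic(loglik_null, loglik_model):
    """LRS = 2 * (log-likelihood of fitted model - log-likelihood of null model)."""
    return 2.0 * (loglik_model - loglik_null)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical log-likelihoods (assumed values, not from the study).
lrs = likelihood_ratio_statistic(loglik_null=-120.0, loglik_model=-115.0)
p_value = chi2_sf_1df(lrs)
print(lrs, p_value)
```

Under SA the threshold that the statistic must exceed is better taken from the distribution of statistics obtained with null spatial patterns than from the chi-square table.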

Results also suggest that sensitivities of model accuracy (AUC) to SA using either validation or calibration data sets are nearly identical. Model performances are commonly measured using either resubstitution or by splitting data into calibration and validation sets (Olden, Jackson & Peres-Neto 2002; Araújo et al. 2005a). The latter approach does not fully avoid an overestimation of model accuracy because two sources of bias may arise: (i) SA between the calibration and validation data sets and (ii) SA within data sets. In fact, even when using more complex approaches such as cross-validation, bootstrapping and jack-knifing (Guisan & Zimmermann 2000), most procedures involve splitting the original data set into calibration and validation subsets using random assignments. These two subsets inevitably have some degree of non-independence because of SA (Araújo et al. 2005a). In this study we used two data sets that were spatially separated in order to avoid an overestimation of accuracy because of a lack of independence between data sets. This allowed an assessment of the role of SA within each data set when predicting distributions in other regions, i.e. its effect on model transferability (Randin et al. in press). Despite differences in the environments of the two rectangles, results suggest that the models’ transferability depends greatly on the SA of the environmental variables.

As demonstrated before with correlation and linear regression analysis (Lennon 2000), SA of covariates inflates their statistical significance in predictive models. In the present study, the maximum inflation ratio was 90 for the clumped distribution, i.e. variables were found to be significant 90 times more often than expected by chance. Inflation ratios reached 89 for M. leprosa and 54 for E. orbicularis. These inflation values are similar to those estimated by Lennon (2000) using synthetically generated spatial patterns.

The two a priori procedures used to minimize the effect of SA partially cancelled (systematic subsampling) or fully cancelled (inclusion of the contagion term) the effect of SA. However, it should be noted that testing for relationships between environmental variables and species’ occurrences after forcing the inclusion of a contagion term might represent a problem. This procedure cancels the SA of species’ distributions without differentiating environmental from demographic/historical contributions. By cancelling the environmental contribution to the SA, models tend to underestimate the importance of environmental variables that co-vary with species’ occurrences (Araújo & Williams 2000). Variables that enter in models with an autocovariate term are most likely to explain the non-autocorrelated aspects of patterns, which may represent only a minor fraction of total variance. Moreover, such a model would most probably lack predictive power outside the calibration set or under future climate-change scenarios. Indeed, models that incorporate a term for SA cannot be extrapolated to regions where no occurrence data are available. Hence, the inclusion of an autocovariate term is only possible when projections are made within the geographical range of the calibration data set.

When sample size is not a limiting factor, subsampling of the original data matrix is a possibility. Even though it does not completely eliminate the inflation effect, it reduces it substantially. This can be done by simply selecting samples in a systematic manner, as done here, or by using geostatistical tools such as variogram and correlogram plots (Maurer 1994; Diniz-Filho, Bini & Hawkins 2003) to analyse the overall pattern of spatial dependency and to help establish a minimum distance between samples that will reduce SA to a given level (Catry et al. 2003). None the less, as populations and environmental variables tend to be autocorrelated at all scales, it seems likely that the spacing out of samples will never fully eliminate SA effects.
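A correlogram of the kind mentioned above can be sketched by computing Moran's I at increasing lags on a gridded variable and taking the first lag at which it falls below a chosen threshold as a candidate sample spacing. The rook-style neighbourhood (cells exactly `lag` steps apart along a row or column), the threshold of 0·2 and the helper names are assumptions made for illustration; they are not the settings used in this study.

```python
def morans_i(grid, lag=1):
    """Moran's I using cell pairs exactly `lag` steps apart along rows or columns."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("Moran's I is undefined for a constant surface")
    num, weight_sum = 0.0, 0
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in ((lag, 0), (-lag, 0), (0, lag), (0, -lag)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    num += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    weight_sum += 1
    if weight_sum == 0:
        raise ValueError("no cell pairs at this lag")
    return (n / weight_sum) * (num / denom)


def min_spacing(grid, threshold, max_lag):
    """First lag at which Moran's I drops below `threshold`, or None if it never does."""
    for lag in range(1, max_lag + 1):
        if morans_i(grid, lag) < threshold:
            return lag
    return None


# Hypothetical 6 x 6 surface with a west-east gradient (value = column index).
gradient = [[c for c in range(6)] for _ in range(6)]
correlogram = [morans_i(gradient, lag) for lag in range(1, 6)]
spacing = min_spacing(gradient, threshold=0.2, max_lag=5)
```

On real data the spacing suggested by such a plot trades off against sample size, so the threshold is a judgement call rather than a fixed rule.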

Another type of procedure is to account a posteriori for SA using a Monte Carlo approach, such as that used here, to estimate variable significance based on null spatial patterns. However, even when P-values are estimated using this approach, variables that exhibit higher SA tend to be more significant. Nevertheless, these are likely to be the better candidates as actual causative factors of species’ distributions; they are more likely to contribute to the autocorrelated nature of species’ patterns of occurrence.
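The Monte Carlo logic can be sketched as follows: the AUC obtained with the real predictor is ranked against the AUCs obtained with null surfaces that keep the predictor's spatial structure. In this simplified illustration the raw predictor is used directly as the score (assuming higher values indicate presence), and circular shifts of a one-dimensional transect stand in for the null surfaces; neither choice reproduces the permutation procedure or the fitted models of the study, and all names and data are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a presence outscores an absence (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def monte_carlo_p(observed, null_values):
    """One-sided Monte Carlo P-value; the observed value counts as one of the draws."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Hypothetical transect of 10 sites: predictor declines along it, species occupies the first half.
predictor = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
presence = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

observed_auc = auc(predictor, presence)
# Stand-in null surfaces: circular shifts keep the ordering of neighbouring values.
null_aucs = [auc(predictor[k:] + predictor[:k], presence)
             for k in range(1, len(predictor))]
p_value = monte_carlo_p(observed_auc, null_aucs)
print(observed_auc, p_value)
```

The smallest attainable P-value is 1/(number of null surfaces + 1), so the number of simulations sets the resolution of the test.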

The use of semi-parametric modelling techniques, such as GAM, and non-parametric techniques, such as classification trees, slightly reduced the effect of SA in our analyses. These techniques place fewer constraints on the shape of species’ responses to their environments and consequently fit the data more closely than parametric techniques such as GLM. Our results suggest that this is particularly evident for response or environmental variables that display only moderate levels of SA. Thus these techniques are slightly less sensitive to variations in the variables’ patterns of SA than traditional parametric approaches such as GLM.

The results described above, and thus the recommendations given for dealing with SA, apply only to univariate modelling. When there is more than one candidate variable to explain a species’ distribution, the assessment of the effect of SA on models requires a more complex and thorough approach. Indeed, the impact of SA on multivariate model building is an urgent issue for future investigation. The first step of multivariate modelling is variable selection. Methods based on stepwise variable selection are still widely used by ecologists although it has been demonstrated that automated model building procedures can result in the selection of a subset of predictor variables with no direct effect on the response variable (Derksen & Keselman 1992). Nevertheless, even if SA inflates variable significance this does not mean that the final model configuration will exclusively include the most autocorrelated variables. If an autocorrelated variable is included in the model, it may explain a substantial fraction of the SA in the species’ pattern of occurrence. If that is the case the remaining variables may then be related to the less autocorrelated aspects of the distribution, probably driven by factors that act at finer spatial scales (Diniz-Filho, Bini & Hawkins 2003).

Some recommendations for the variable selection process can, however, be drawn from our results. Variable selection based on simulated patterns is not cost effective. Instead, we recommend running univariate tests based on null patterns, such as the procedure described in this study, before model adjustment. Strongly autocorrelated variables that lose explanatory power when their significance is adjusted for SA should be handled with special care because they could inflate the models’ significance. An alternative would be simply to exclude such variables from the analysis. Employing methods based on the recent information–theoretic paradigm (Burnham & Anderson 2002; but see Stephens et al. 2005) during model building, such as variable selection procedures using the Akaike information criterion (Akaike 1974), is always preferable because it does not fully rely on significance thresholds. For modelling species’ distributions that show a high degree of SA it might be appropriate to use semi-parametric or non-parametric techniques and to avoid statistics such as LRS, which are bound to be more sensitive to SA than other accuracy measures (e.g. AUC).
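As an illustration of selection that does not depend on a significance threshold, the sketch below ranks candidate models by the Akaike information criterion, AIC = 2k − 2 ln L, and converts the differences into Akaike weights. The variable names are borrowed from Table 2, but the log-likelihoods and parameter counts are invented for the example and do not come from the fitted models.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model, from differences to the lowest AIC."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical candidates: name -> (maximised log-likelihood, number of parameters).
candidates = {
    "Tann": (-110.0, 2),
    "Pann": (-112.5, 2),
    "Tann + Pann": (-109.4, 3),
}
names = list(candidates)
aics = [aic(ll, k) for ll, k in candidates.values()]
weights = akaike_weights(aics)
best_model = names[aics.index(min(aics))]
for name, a, w in zip(names, aics, weights):
    print(name, round(a, 1), round(w, 3))
```

Because the log-likelihoods themselves are affected by SA, AIC-based ranking reduces reliance on thresholds but does not by itself remove the autocorrelation problem.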

The value of niche-based distribution models for planning and management purposes greatly depends on their ability to overcome the different sources of bias that are inherent in biological data. SA is just one among other sources of bias, yet probably the most challenging one. It is our belief that simple procedures such as the ones discussed in this study would help to enhance the ecological reliability of models and therefore to increase their applied value during the decision-making process.

Acknowledgements

We thank Stephen Roxburgh, Ana Bio and Wilfried Thuiller for their comments on the manuscript. Pedro Segurado was funded by FCT (SFRH/BD/8493/2002). This research is partly funded by the EC Integrated FP6 ALARM (GOCE-CT-2003-506675) project.
