A new method for dealing with residual spatial autocorrelation in species distribution models


B. Crase, School of Botany, The Univ. of Melbourne, Parkville, VIC 3010, Australia. E-mail: beth.crase@nt.gov.au


Species distribution modelling (SDM) is a widely used tool with many applications in ecology and conservation biology. Spatial autocorrelation (SAC), a pattern in which observations are related to one another by their geographic distance, is common in georeferenced ecological data. SAC in the residuals of species distribution models violates the ‘independent errors’ assumption required to justify the use of statistical models in modelling species’ distributions. The autologistic modelling approach accounts for SAC by including an additional term (the autocovariate) representing the similarity between the value of the response variable at a location and its values at neighbouring locations. However, autologistic models have been found to introduce bias in the estimation of parameters describing the influence of explanatory variables on habitat occupancy. To address this problem we developed an extension to the autologistic approach in which the autocovariate is calculated from the model residuals, so that it represents the SAC in the residuals rather than in the response (the residuals autocovariate, or RAC, approach). Performance of the new approach was tested on simulated data with a known spatial structure and on strongly autocorrelated mangrove species’ distribution data collected in northern Australia. The RAC approach was implemented in generalized linear models (GLMs) and boosted regression tree (BRT) models. We found that BRT models with only environmental explanatory variables can account for some SAC, but applying the standard autologistic or RAC approaches further reduced SAC in model residuals and substantially improved model predictive performance. The RAC approach showed stronger inferential performance than the standard autologistic approach: parameter estimates were more accurate and statistically significant variables were correctly identified. The new RAC approach presented here has the potential to account for spatial autocorrelation while maintaining strong predictive and inferential performance, and can be implemented across a range of modelling approaches.
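
As a rough illustration of the autocovariate step in the RAC approach, the Python sketch below computes, for each cell of a regular grid, the mean residual of its neighbouring cells. The abstract does not specify the neighbourhood or weighting, so the function name `residual_autocovariate`, the unweighted mean over the eight adjacent cells (`radius=1`), the value 0.0 for cells without neighbours, and the toy data are all assumptions made for illustration, not the authors' implementation. The residuals are assumed to come from a model fitted with environmental explanatory variables only; the resulting autocovariate would then be supplied as an additional explanatory variable when the model is refitted.

```python
def residual_autocovariate(residuals, radius=1):
    """Mean residual of the neighbouring cells, excluding the focal cell.

    residuals: 2D list (rows of equal length) of observed minus fitted values
    from an environment-only model; None marks cells with no data.
    radius: half-width of the square neighbourhood (1 = eight adjacent cells).
    """
    n_rows = len(residuals)
    n_cols = len(residuals[0]) if n_rows else 0
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            total, count = 0.0, 0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    if di == 0 and dj == 0:
                        continue  # the focal cell does not contribute
                    r, c = i + di, j + dj
                    inside = 0 <= r < n_rows and 0 <= c < n_cols
                    if inside and residuals[r][c] is not None:
                        total += residuals[r][c]
                        count += 1
            # Assumption: a cell with no valid neighbours gets 0.0
            out[i][j] = total / count if count else 0.0
    return out


# Toy presence/absence grid and hypothetical fitted probabilities
observed = [[1, 1, 0],
            [1, 0, 0],
            [0, 0, 0]]
fitted = [[0.5, 0.5, 0.5],
          [0.5, 0.5, 0.5],
          [0.5, 0.5, 0.5]]

# Raw residuals: observed minus fitted
residuals = [[o - f for o, f in zip(o_row, f_row)]
             for o_row, f_row in zip(observed, fitted)]

rac = residual_autocovariate(residuals)
print(rac[1][1])  # centre cell: mean of its eight neighbours' residuals
```

In this toy grid the presences cluster in the upper-left corner, so the autocovariate is positive there and negative towards the lower right, which is the kind of residual spatial pattern the extra term is meant to capture.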