Spatial autocorrelation and the selection of simultaneous autoregressive models

Authors

  • W. Daniel Kissling

    Corresponding author
    1. Community & Macroecology Group, Institute of Zoology, Department of Ecology, Johannes Gutenberg University of Mainz, D-55099 Mainz, Germany,
    2. Virtual Institute Macroecology, Theodor-Lieser-Str. 4, 06120 Halle, Germany
      *Correspondence: W. Daniel Kissling, Community & Macroecology Group, Institute of Zoology, Department of Ecology, Johannes Gutenberg University of Mainz, D-55099 Mainz, Germany.
      E-mail: kissling@uni-mainz.de
  • Gudrun Carl

    1. UFZ - Helmholtz Centre for Environmental Research, Department of Community Ecology, Theodor-Lieser-Str. 4, 06120 Halle, Germany,
    2. Virtual Institute Macroecology, Theodor-Lieser-Str. 4, 06120 Halle, Germany

ABSTRACT

Aim  Spatial autocorrelation is a frequent phenomenon in ecological data and can affect estimates of model coefficients and inference from statistical models. Here, we test the performance of three different simultaneous autoregressive (SAR) model types (spatial error = SARerr, lagged = SARlag and mixed = SARmix) and common ordinary least squares (OLS) regression when accounting for spatial autocorrelation in species distribution data using four artificial data sets with known (but different) spatial autocorrelation structures.
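For orientation, the four model types compared here can be written in the notation commonly used in the spatial econometrics literature. The symbols below are assumptions, not taken from the abstract: Y is the response vector, X the matrix of predictors, W the spatial weights matrix, β and γ coefficient vectors, ρ and λ the autoregression coefficients, u the spatially dependent error term and ε independent errors.

```latex
\begin{align}
\text{OLS:}               \quad & Y = X\beta + \varepsilon \\
\text{SAR}_{\text{err}}:  \quad & Y = X\beta + \lambda W u + \varepsilon \\
\text{SAR}_{\text{lag}}:  \quad & Y = \rho W Y + X\beta + \varepsilon \\
\text{SAR}_{\text{mix}}:  \quad & Y = \rho W Y + X\beta + W X \gamma + \varepsilon
\end{align}
```

Read this way, SARerr places the spatial dependence in the error term only, SARlag places it in the response, and SARmix places it in both the response and the predictors.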

Methods  We evaluate the performance of SAR models by examining spatial patterns in model residuals (with correlograms and residual maps), by comparing model parameter estimates with true values, and by assessing their type I error control with calibration curves. We calculate a total of 3240 SAR models and illustrate how the best models [in terms of minimum residual spatial autocorrelation (minRSA), maximum model fit (R²), or Akaike information criterion (AIC)] can be identified using model selection procedures.
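As a rough illustration of the residual diagnostics mentioned above, the following sketch computes Moran's I for the residuals of an OLS fit together with a Gaussian AIC. It is not the authors' code: the function names, the 10 × 10 grid, the neighbourhood distance of 1, the row-standardised weights and the simulated data are all assumptions made for the example, and it stops at OLS rather than fitting any SAR model.

```python
import numpy as np

def row_standardised_weights(coords, max_dist):
    """Neighbours within max_dist (self excluded), each row scaled to sum to 1."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w = ((d > 0) & (d <= max_dist)).astype(float)
    row_sums = w.sum(axis=1, keepdims=True)
    # Leave rows without neighbours as zeros instead of dividing by zero.
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

def morans_i(values, w):
    """Global Moran's I of `values` under the weights matrix `w`."""
    z = values - values.mean()
    return (len(values) / w.sum()) * (z @ w @ z) / (z @ z)

def ols_residuals_and_aic(X, y):
    """OLS with intercept; returns coefficients, residuals and Gaussian AIC."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    n, k = X1.shape
    sigma2 = resid @ resid / n                       # ML estimate of error variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    aic = 2 * (k + 1) - 2 * loglik                   # +1 counts the variance parameter
    return beta, resid, aic

# Hypothetical data: a 10 x 10 lattice with a spatially structured error.
side = 10
rows, cols = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
coords = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
w = row_standardised_weights(coords, max_dist=1.0)

rng = np.random.default_rng(0)
x = rng.normal(size=side * side)                     # spatially random predictor
spatial_error = coords[:, 0] - coords[:, 0].mean()   # smooth gradient across rows
y = 0.5 * x + spatial_error

beta, resid, aic = ols_residuals_and_aic(x, y)
residual_i = morans_i(resid, w)

# Sanity check: a checkerboard is perfectly negatively autocorrelated.
checker = ((rows + cols) % 2).ravel().astype(float)
checker_i = morans_i(checker, w)

print("Moran's I of OLS residuals:", residual_i)
print("AIC of OLS fit:", aic)
```

Under these assumptions the OLS residuals inherit the gradient in the error term, so their Moran's I is positive. Repeating the calculation over several neighbourhood distances gives the points of a correlogram, and comparing candidate models on residual Moran's I or AIC mirrors the minRSA and AIC criteria named in the text.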

Results  Our study shows that the performance of SAR models depends on model specification (i.e. model type, neighbourhood distance, coding styles of spatial weights matrices) and on the kind of spatial autocorrelation present. SAR model parameter estimates are not always more precise than those from OLS regressions. SARerr models were the most reliable SAR models and performed well in all cases (independent of the kind of spatial autocorrelation induced and whether models were selected by minRSA, R² or AIC), whereas OLS, SARlag and SARmix models showed weak type I error control and/or unpredictable biases in parameter estimates.

Main conclusions  SARerr models are recommended for use when dealing with spatially autocorrelated species distribution data. SARlag and SARmix might not always give better estimates of model coefficients than OLS, and can thus generate bias. Other spatial modelling techniques should be assessed comprehensively to test their predictive performance and accuracy for biogeographical and macroecological research.
