Aim Various statistical techniques have been used to model species' probabilities of occurrence in response to environmental conditions. This paper provides a comprehensive assessment of these methods and investigates whether errors in model predictions are associated with specific kinds of geographical and environmental distributions of species.
Location Portugal, Western Europe.
Methods Probabilities of occurrence for 44 species of amphibians and reptiles in Portugal were modelled using seven modelling techniques: Gower metric, Ecological Niche Factor Analysis, classification trees, neural networks, generalized linear models, generalized additive models and spatial interpolators. Generalized linear and additive models were constructed with and without a term accounting for spatial autocorrelation. Model performance was assessed with two measures: sensitivity and the Kappa index. Species were grouped according to their spatial (area of occupancy and extent of occurrence) and environmental (marginality and tolerance) distributions. Two-way comparison tests were performed to detect significant interactions between models and species groups.
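For readers unfamiliar with the two performance measures named above, the following is a minimal sketch of how sensitivity and the Kappa index (Cohen's kappa) can be computed from observed and predicted presence/absence. The function names, the toy data and the prior conversion of probabilities to presence/absence are illustrative assumptions, not the procedure used in this study.

```python
def confusion_counts(observed, predicted):
    """Count true/false positives and negatives for binary presence/absence."""
    tp = fp = fn = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            tp += 1          # presence correctly predicted
        elif pred:
            fp += 1          # absence predicted as presence
        elif obs:
            fn += 1          # presence predicted as absence
        else:
            tn += 1          # absence correctly predicted
    return tp, fp, fn, tn


def sensitivity(tp, fp, fn, tn):
    """Proportion of observed presences that were predicted as presences."""
    return tp / (tp + fn)


def kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy data: 1 = presence, 0 = absence (not from the study).
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(observed, predicted)
sens = sensitivity(*counts)
kap = kappa(*counts)
```

Sensitivity considers only the observed presences, whereas kappa uses all four cells of the confusion matrix, which is one reason the two measures can rank models differently.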
Results The interaction between model and species group was significant for both sensitivity and the Kappa index, indicating that model performance varied among species with different geographical and environmental distributions. Artificial neural networks generally performed best, followed closely by generalized additive models that included a covariate term for spatial autocorrelation. Non-parametric methods were preferred to parametric approaches, especially when modelling distributions of species with a greater area of occupancy, a larger extent of occurrence, lower marginality and higher tolerance.
Main conclusions This is a first attempt to relate the performance of modelling techniques to species' spatial and environmental distributions. Results indicate a strong relationship between model performance and the kinds of species distributions being modelled. Some methods generally performed better than others, but no method was superior in all circumstances. We suggest that the choice of method should depend on the modelling goals and on the kinds of distributions being modelled.