## Introduction

Methodological advances in species distribution modelling have been rapid (Guisan & Zimmermann 2000; Scott *et al*. 2002). The practical and intellectual benefits of obtaining well-tested models for species’ distributions are numerous. They include forecasting species’ range shifts from climate change (Thomas *et al*. 2004) and invasion by introduced species (Peterson 2003; Drake & Bossenbroek 2004), testing evolutionary hypotheses (Graham *et al*. 2004), identifying reservoirs for disease (Peterson *et al*. 2002), and planning for conservation in a dynamic landscape (Ferrier 2002). Nevertheless, modelling species’ niches is complicated by conceptual and technical difficulties and by data limitations (Guisan & Thuiller 2005). Many of these obstacles result from assumptions about the statistical distribution of data or from restrictive parametric modelling paradigms, and recent advances in machine-learning techniques for statistical pattern recognition might be used to overcome them. We studied the accuracy and reliability of ecological niche models built with support vector machines (SVM) for estimating the support of a statistical distribution (Schölkopf *et al*. 2001; Tax 2001; Tax & Duin 2004). We show that the SVM framework performs comparably to, or better than, other methods with only moderate amounts of data while avoiding common problems and limitations.

The most common obstacles to conventional parametric and non-parametric statistical methods for modelling species’ distributions are: (i) autocorrelated observations resulting from the inherent spatial distribution of ecological systems, spatial autocorrelation in species’ actual distributions, and haphazard rather than designed sampling; and (ii) observations only of species’ occurrences without complementary observations of species’ absences. When modelling techniques are based on parametric statistics, autocorrelated observations inflate the apparent significance of hypothesis tests (*P*-values are too small because the observations are not independent) and have the potential to introduce bias in estimated models. One approach to this problem in a parametric setting is to add terms that model the spatial correlation to a generalized linear model (GLM; e.g. logistic model) (Augustin, Mugglestone & Buckland 1996; He, Zhou & Zhu 2003). Other studies have taken a similar approach with semi-parametric regression techniques, such as generalized additive models (GAM; Leathwick & Austin 2001). However, these methods place further demands on already sparse data and extrapolate poorly.

Strictly speaking, the second obstacle, lack of data confirming species’ absences, renders modelling approaches based on classification/discrimination impossible (Robertson, Caithness & Villet 2001; Hirzel *et al*. 2002). Previous studies have sought to overcome this problem by simulating observations of species’ absences (sometimes called pseudo-absences) from data domains in which there are no observations of species’ occurrences (Engler, Guisan & Rechsteiner 2004). While remarkably robust models have been developed using this approach (Anderson, Lew & Peterson 2003), a method that does not rely on such heuristics would be useful. Further, it is not clear that these procedures can be used where background knowledge of species’ ecologies is not already available to guide modelling heuristics (Anderson, Lew & Peterson 2003), yet such information-poor settings are precisely where species distribution models are most useful, for instance for forecasting species invasions or range shifts from climate change. Finally, classification models fitted to simulated data are generally ecologically uninformative or cumbersome to interpret (Keating & Cherry 2004). The aim of this study was to introduce a technique that overcomes these obstacles.

A promising alternative to conventional classification-based species distribution models is to use methods designed for modelling one type of data only (Robertson, Caithness & Villet 2001; Hirzel *et al*. 2002; Brotons *et al*. 2004; Phillips, Dudík & Schapire 2004). Many such techniques may be found in the literature on statistical pattern recognition, where a frequent goal is to separate statistical outliers from observations drawn from a high-dimensional distribution (Schölkopf *et al*. 2001; Tax 2001; Tax & Duin 2004). Indeed, rather than estimating the full probability distribution, in such situations it may be simpler (and more robust) to model just the support of the distribution, the set of points where the (unknown) probability density is greater than zero (Schölkopf *et al*. 2001). Sometimes support estimation is called one-class classification (Tax 2001). While many different methods for estimating statistical distributions might be optimized for one-class classification (Tax 2001; Tax & Duin 2004), methods based on SVM have been particularly successful in applications where data represent a large set of variables (Tax 2001, table 4·2; Tax & Duin 2004). SVM use a functional relationship known as a kernel to map data onto a new hyperspace in which complicated patterns can be more simply represented (Müller *et al*. 2001). The choice of kernel is typically based on theoretical properties, while any kernel parameters are optimized using computational techniques such as cross-validation. Because SVM are not based on characteristics of statistical distributions there is no theoretical requirement for observed data to be independent, thereby overcoming the problem of autocorrelated observations, although model performance will be affected by how well the observed data represent the range of environmental variables. 
Further, SVM are more stable, require less model tuning, and have fewer parameters than other computational optimization methods such as neural networks (Lusk, Guthery & DeMaso 2002). Finally, computational complexity is minimal and standard algorithms can be used for optimization, so implementation is straightforward in familiar scientific computing environments such as R (http://www.r-project.org/, accessed 16 February 2006) and MATLAB (Mathworks Inc., Natick, MA). In contrast to genetic algorithms (Stockwell & Peters 1999; Drake & Bossenbroek 2004), the solution is deterministic, resulting in both faster computation and repeatable results. The potential gains from using support vector machines for ecological niche modelling are therefore great, including reliable and accurate forecasting, feasible computation and a high level of ecological interpretability (Guo, Kelly & Graham 2005).
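
To make the idea of support estimation from presence-only data concrete, the following is a minimal sketch in Python. It is not the SVM method studied here: it substitutes a much simpler kernel-based one-class classifier (mean radial basis function similarity to the presence records, thresholded so that a chosen fraction of records falls outside the estimated support). The data are simulated, and the names `GAMMA`, `NU`, `kernel_score`, `fit_threshold` and `is_suitable`, as well as the parameter values, are assumptions made for illustration only.

```python
import math
import random

GAMMA = 0.5  # assumed RBF kernel width parameter (would normally be tuned, e.g. by cross-validation)
NU = 0.1     # assumed fraction of presence records allowed to fall outside the support


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel between two points in environmental space."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_score(x, presences, gamma):
    """Mean kernel similarity of x to all presence records."""
    return sum(rbf_kernel(x, p, gamma) for p in presences) / len(presences)


def fit_threshold(presences, gamma, nu):
    """Choose the score threshold that leaves at most a fraction nu of records outside."""
    scores = sorted(kernel_score(p, presences, gamma) for p in presences)
    return scores[int(nu * len(scores))]


def is_suitable(x, presences, gamma, threshold):
    """True if x lies inside the estimated support (predicted suitable)."""
    return kernel_score(x, presences, gamma) >= threshold


# Simulated presence-only records in two standardized environmental
# variables (hypothetical, e.g. temperature and precipitation).
rng = random.Random(42)
presences = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]

threshold = fit_threshold(presences, GAMMA, NU)

# A site near the centre of the sampled conditions and a site far outside them.
print(is_suitable((0.0, 0.0), presences, GAMMA, threshold))
print(is_suitable((10.0, 10.0), presences, GAMMA, threshold))
```

No absence or pseudo-absence records are used at any point: the boundary is defined by the presence records alone. A one-class SVM differs in that it solves an optimization problem in the kernel-induced feature space and retains only a subset of the records (the support vectors) to define the boundary.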