Population density but not stability can be predicted from species distribution models

Authors


Correspondence author. E-mail: toliver@ceh.ac.uk

Summary

1. Species distribution models (SDMs) are increasingly used in applied conservation biology, yet the predictive ability of these models is often tested only on detection/non-detection data. The probability of long-term population persistence, however, depends not only upon patch occupancy but upon more fundamental population parameters such as mean population density and stability over time.

2. Here, we test estimated probability of occurrence scores generated from SDMs built using species occupancy data against independent empirical data on population density and stability for 20 bird and butterfly species across 1941 sites over 15 years. We devised a measure of population stability over time which was independent of mean density and time-series duration, yet positively correlated with risk of local extinction. This may be a useful surrogate measure of population persistence for use in applied conservation.

3. We found that probability of occurrence scores were significantly positively correlated with mean population density for both butterflies and birds. In contrast, probability of occurrence scores were at best weakly positively correlated with population stability. Referring to established ecological theory, we discuss why SDMs may be appropriate for predicting population density but not stability.

4. Synthesis and applications. Species distribution models are often constructed using species occupancy data because, for the majority of species and regions, these are the best data available. The models are then often used for projecting species’ distributions in the future and identifying areas where management could be targeted to improve species’ prospects. However, our results suggest that an overreliance on these SDMs may result in an exclusive focus on landscape management approaches that promote patch occupancy and density, but may overlook features important for long-term population persistence such as population stability. Other landscape metrics that take into account habitat heterogeneity or configuration may be required to predict population stability. To understand species persistence under rapid environmental change, count data from standardised monitoring schemes are an invaluable resource. These data provide additional insights into the factors affecting species’ extinction risks, which cannot easily be inferred from species’ occupancy data.

Introduction

Identifying species ‘niches’, the environmental conditions in which they are found, and using these to model future changes in populations is an approach that is increasingly used to inform conservation policy (Vos et al. 2008). To understand how species are likely to respond to environmental change, correlative statistical models based on empirical data on species distributions have been developed to predict which areas will be climatically suitable for species according to future environmental scenarios (‘species distribution models’ or ‘bioclimate’ models; Guisan & Zimmermann 2000). The aim of all these modelling endeavours is to identify (i) species at high risk from environmental change and (ii) areas that are particularly valuable to species under altered climates, that is, areas of current range that will remain climatically suitable or new areas that will benefit from improvement to enable the species to more fully realise their potential ranges.

The standard procedure for species distribution modelling is that statistical correlations are identified between species detection/non-detection (‘presence-absence data’ where detection is imperfect; MacKenzie et al. 2006) and various land cover and climate explanatory variables, including complex interactions between explanatory variables. A number of methods exist to build these models. For example, regression models, such as generalised additive models (GAM), and machine-learning techniques, such as artificial neural networks (ANN), random forests (RF) and maximum entropy (MAXENT) are all techniques that perform well in controlled comparisons. Next, after building the statistical model, predictions are made of the probability of species occurrence given the specific combination of explanatory variable values at each location. These predictions are validated using the test data set to give a final measure of the goodness-of-fit of the model (e.g. AUC scores, Kappa statistic). Models with sufficient amount and quality of input data generally perform well, although if results are extrapolated outside the spatial or temporal range of the training data then there are caveats regarding species dispersal limitations, local adaptation, biotic interactions etc. (Soberón & Nakamura 2009).

To date, species distribution models (SDMs) have mostly been constructed using categorical detection/non-detection data and have thus only been able to project likely changes in species distributions in terms of occupancy. However, it is population parameters, such as density and stability that are most useful in assessing the conservation status of species. Although a number of studies have recently used estimates of population abundance as input in SDMs (Shoo, Williams & Hero 2005; Randin et al. 2009; Wilson et al. 2010; Huntley et al. 2011; Kulhanek, Leung & Ricciardi 2011; Renwick et al. 2011; Tucker, Rebelo & Manne 2011), it is clear that, in most cases, only occupancy data are available. Whether the outputs of SDMs using occupancy data can adequately predict population parameters, such as density and stability over time, has rarely been tested.

Links have long been drawn between occupancy and abundance across species, with species’ total range size often positively related to local abundance (Hanski 1982; Brown 1984; Gaston et al. 2000). At the interspecific level, projected reductions in the total area of suitable climate space have been found to be correlated with species’ population declines (Green et al. 2008; Gregory et al. 2009). However, intraspecific relationships between climatic suitability and population density are less well studied. Of the few studies that consider whether SDMs can predict population density, results are equivocal, with poor correlations in many cases (Pearce & Ferrier 2001; Nielsen et al. 2005; Elmendorf & Moore 2008; Jiménez-Valverde et al. 2009; but see VanDerWal et al. 2009). With multiple visits to the same sampling sites, species’ detectability and population densities can be inferred with some success (Royle & Nichols 2003; MacKenzie et al. 2006); yet SDMs rarely build in probabilities of detection, probably because repeat visits to the same site are rare with ad-hoc collected distribution data.

Moreover, the ability of SDMs to predict other aspects of population persistence has also been neglected. A few case studies have experimentally tested climatic suitability from SDMs by assessing establishment success or reproductive output (Wright et al. 2006; Elmendorf & Moore 2008; Willis et al. 2009). If SDMs are to be used in applied conservation (e.g. for prioritising conservation actions in different areas), then we need more confidence that projected probability of occurrence surfaces are associated with fundamentally important measures of population persistence, rather than simply species’ presence or absence.

Two correlates of population persistence are mean density and inter-annual variability. Larger populations are known to suffer from lower extinction risk, whilst smaller populations are more vulnerable to both demographic and environmental stochasticity driving them to local extinction (Pimm, Jones & Diamond 1988). In addition, there is theoretical and empirical evidence that populations that are more variable over time also suffer greater extinction risk (Pimm, Jones & Diamond 1988; Lande 1993; Inchausti & Halley 2003). Hence, the IUCN Red List criteria for classifying species’ extinction risk include both small population size and the observation of extreme population fluctuations (Mace et al. 2008). Even for true metapopulations where the occupancy of individual patches is transient, the persistence of the overall metapopulation depends on adequate local densities being maintained and will also be facilitated by increased longevity of populations in individual patches (Hanski 1999). Therefore, understanding the landscape and climatic factors that promote population density and stability is crucial for species conservation.

With this in mind, we test whether a number of SDMs that are widely used in conservation literature can be used as a tool to predict population density and stability. If they can, these modelling frameworks may be used to prioritise areas for conservation based on their ability to host persistent populations of species. If they cannot, the conservation value of the priority areas identified by these models, as they stand, must be questioned.

In this study, we analyse population time series for 20 species of butterflies and birds across 1941 locations in Britain over a 15-year period. For each time series, we calculate mean density and a measure of population stability that is independent of density, yet still associated with extinction risk. These population parameters are then related to probability of occurrence scores generated from four commonly used SDMs built using independent species detection/non-detection data. In an ideal world, where data availability is not an issue, we might have land cover and climate data that were a perfect temporal match with our species distribution data. However, we use the best input data sources currently available and have selected commonly used modelling frameworks that perform well under standard tests. Hence, we take a pragmatic approach by asking: with the best data currently available to build SDMs, are these models fit for purpose in terms of predicting the areas where populations have the highest densities and most stable population dynamics? A poor fit between the SDM probability of occurrence scores and these independent population parameters therefore does not necessarily mean that SDMs will always fail to predict population persistence accurately, only that, given the input data and the modelling frameworks, predictions in this particular case are inadequate.

Materials and methods

Data Collation

The most suitable monitoring and environmental data sets available for Great Britain were collated. We aimed to test the efficacy of SDMs in predicting population density and stability for more than one bird and butterfly species, but testing on all British species was not feasible. We therefore selected 10 bird and 10 butterfly species (Table S1, Supporting information) using the following criteria: (i) species had a reasonable geographical coverage across Britain providing a large number of monitored sites [mean 648·6 ± 136·5 (SE) sites per species], (ii) each species group had an equal proportion of species previously classed as habitat ‘specialists’ vs. ‘generalists’ (butterflies: Asher et al. 2001; birds: Siriwardena et al. 1998 and authors’ judgement), (iii) each species group comprised species with a range of estimated dispersal capabilities and (iv) the species were generally easily detected during field surveys, to reduce the potential for stochasticity in detection to mask population fluctuations. For each species, population time-series data were obtained from the UK Butterfly Monitoring Scheme (UKBMS) and the BTO/JNCC/RSPB Breeding Bird Survey (BBS). Full details of each scheme are available elsewhere (Pollard & Yates 1993; Risely et al. 2010), and we briefly summarise them in the Supporting information. For all monitoring sites, we noted the region in which a site occurred using a 50-km British Ordnance Survey grid, which was later used to account for spatial autocorrelation in the data.

Species detection/non-detection data for butterflies were obtained from the Butterflies for the New Millennium (BNM) Atlas, comprising species records georeferenced to the nearest 1-km position on the GB grid between 1995 and 2004 (Asher et al. 2001). Tetrad (2-km grid cell) resolution detection/non-detection data for birds were obtained from the New Atlas of Breeding Birds in Britain and Ireland (Gibbons, Reid & Chapman 1993) for 1988–1991. Our environmental layers comprised remotely sensed land cover data along with climate data for four key bioclimate variables. For climate data, we used the following variables, known to affect the distribution and physiology of butterfly and bird species: growing day degrees above 5 °C (GDD5), mean temperature of the warmest month, mean temperature of the coldest month and ratio of actual to potential evapotranspiration (Roy et al. 2001; Hill et al. 2002; Thuiller, Araújo & Lavorel 2004; Robinson, Baillie & Crick 2007). These climate data were obtained from CRU ts2.1 (Mitchell & Jones 2005) and CRU 61-90 (New, Hulme & Jones 1999) data sets and interpolated to a 10-km British Ordnance Survey grid. Climate variables for each 1-km or tetrad grid square were taken from the 10-km grid square in which they were located. The climate data comprised the years 1988, the earliest year that species distribution data were collected, to 2000, the latest year that climate data were available. Bioclimate variables were calculated annually over the entire period and the long-term average taken. For land cover data, the areas of 13 biotopes (Table S2, Supporting information) inside each 1-km or tetrad grid cell were summarised from the 25-m resolution UK LandCover 2000 map (Fuller et al. 2002).

Calculating Population Density and Stability

We used our species monitoring data to calculate the mean density and population stability over time for each species at each site. For each species, we used all UKBMS or BBS sites, which had been monitored for 10 or more years between 1994 and 2008, to minimise variation in the duration of population time series that can affect measures of stability (Arino & Pimm 1995); our time series ranged from 10 to 15 years. We also included only time series that consisted of <25% zero counts, to reduce the likelihood of including sites that were newly colonised or at which species went locally extinct during the sampling period (McArdle, Gaston & Lawton 1990; Thomas, Moss & Pollard 1994). For each site’s time series, we calculated a mean density index as the mean of annual abundance values divided by the area sampled by the monitoring route. We also calculated a metric for the stability of each population time series. Our aim was to produce a single metric that reflects the variability of populations about their long-term trajectories independent of biases caused by mean abundance, long-term trends and time-series duration (Pimm & Redfearn 1988; Lepš 1993), and which also correlates with extinction risk. Therefore, we first detrended time series by taking residuals ($\varepsilon_{dt}$) from the equation $N = \alpha + \beta_1 Y + \beta_2 Y^2 + \varepsilon_{dt}$, where $N$ is the annual abundance index and $Y$ is the year of the time series. A quadratic equation was used because it simply, but effectively, captured the variation in abundance associated with long-term trends over the time-series lengths considered. We next calculated the standard deviation of these residuals, to assess their variability ($\mathrm{SD}_{\varepsilon_{dt}}$). This removes the bias from long-term population trends on our measure of variability, but the measure will still be influenced by the mean abundance of the time series, often in a power law relationship (Taylor 1961), and also by time-series duration (Pimm & Redfearn 1988).
We therefore fitted a log–log transformed model between the variability of each time series ($\mathrm{SD}_{\varepsilon_{dt}}$) and the mean abundance at each site ($\check{N}$) and time-series duration ($D$) and took residuals from this model, that is, $\log(\mathrm{SD}_{\varepsilon_{dt}}) = \alpha + \beta_1 \log(\check{N}) + \beta_2 \check{N} + \beta_3 D$. These residual values indicate whether, empirically from within the entire data set, given the abundance at a particular site and the duration of the time series, a population is more or less variable than expected. We define this final measure of instability as residual population variability (RPV). Regressions of RPV against mean abundance and time-series duration confirm it is completely independent of these biases (regression coefficients < 0·001 and P = 1 for all species). Finally, a feature of the RPV measure is that it is centred around zero, so changing the sign converts the variable to a measure of stability (−RPV). Note that we use the term ‘stability’ here in the sense of a lack of inter-annual variability in population density.
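The two-step calculation of RPV described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the use of the sample standard deviation, the centring of the year variable and the simulated counts are all assumptions made for illustration. It uses only the Python standard library and fits both regressions by ordinary least squares.

```python
import math
import random
import statistics


def ols(rows, y):
    """Ordinary least squares via the normal equations; returns coefficients."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def residuals(rows, y):
    """Observed minus fitted values for the least-squares fit of y on rows."""
    beta = ols(rows, y)
    return [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]


def detrended_sd(counts):
    """SD of residuals about a quadratic trend in year.

    Years are centred for numerical stability only; this does not change
    the residuals. The sample SD (n - 1) is an assumption of this sketch.
    """
    mid = (len(counts) - 1) / 2
    rows = [[1.0, t - mid, (t - mid) ** 2] for t in range(len(counts))]
    return statistics.stdev(residuals(rows, counts))


def rpv(series):
    """Residuals of log(SD) regressed on log mean abundance, mean abundance
    and time-series duration, one value per time series."""
    sds = [detrended_sd(s) for s in series]
    means = [statistics.mean(s) for s in series]
    rows = [[1.0, math.log(m), m, float(len(s))] for m, s in zip(means, series)]
    return residuals(rows, [math.log(sd) for sd in sds])


# Simulated annual abundance indices for 12 hypothetical sites (10-15 years each).
rng = random.Random(42)
example = [[20 + 10 * i + t + rng.gauss(0, 3 + i) for t in range(10 + i % 6)]
           for i in range(12)]
scores = rpv(example)
stability = [-v for v in scores]  # changing the sign gives the stability measure
```

Because the second regression includes an intercept, the RPV values sum to zero across sites, which is the property that allows the sign change to be read as a stability score.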

We tested our stability measure to confirm that it related to actual extinction risk of populations. A stability measure (−RPV) was produced for the population time series of each species at each site between the years 1994 and 2008. This was regressed against the extinction risk of populations in 2009. We identified populations as high extinction risk if they had a total abundance index of zero in 2009, that is, no individuals were recorded on the site over the entire year; this does not necessarily imply absolute local extinction, but the absence of any records at a site suggests very low population densities that are at higher risk of complete local extinction. We used a generalised linear mixed effects model with extinction risk as a binary response variable and our measure of stability (−RPV) along with log mean density as explanatory variables. ‘Species’ and ‘Site’ were included as random effects to account for non-independence in the data structure. For both butterflies and birds, there was a significantly higher risk of local extinction for less stable populations (Table 1), indicating that we had successfully produced a derived population parameter that is relevant to population persistence yet independent of mean population density.

Table 1.   Association between local extinction risk and population stability and log mean density for butterflies and birds. Monitoring on a small subset of sites was not continued into 2009, hence these sites were not analysed, slightly reducing the total sample size (total number of species–site combinations = 991 and 9442 for butterflies and birds, respectively). Results show that both larger and more stable populations are less likely to have population counts of zero in 2009
| Group | Population parameter | Coefficient | SE | t | P |
|---|---|---|---|---|---|
| Butterflies | Stability (−RPV) | −0·90 | 0·46 | −1·97 | 0·049* |
| Butterflies | Density (log mean) | −1·03 | 0·11 | −9·18 | <0·001*** |
| Birds | Stability (−RPV) | −1·10 | 0·12 | −9·33 | <0·001*** |
| Birds | Density (log mean) | −1·50 | 0·04 | −41·7 | <0·001*** |

  1. RPV, residual population variability.

Calculating ‘Probability of Occurrence’ Scores from SDMs

The following four SDM techniques were used to predict species occurrence in the landscape around our monitoring sites: one presence–absence regression method (GAM), two presence–absence machine-learning methods (ANN and RF) and a presence-only machine-learning method (MAXENT; see Table 2 legend for method abbreviations). The choice of techniques was motivated by (i) their widespread use within the species distribution modelling community for a variety of purposes (e.g. Dormann et al. 2008; Doswald et al. 2009; Pompe et al. 2009; Seo et al. 2009), (ii) maximising variation between techniques in terms of their methodology and the data they require (regression vs. machine-learning techniques; presence-only vs. presence–absence techniques) and (iii) their robust performance in modelling data akin to those we use here. All techniques have advantages and disadvantages, but the ones we chose can deal well with nonlinear relationships between predictors and response, and have performed well under controlled comparisons (Segurado & Araujo 2004; Elith et al. 2006). GAM techniques do not generally account for complex interactions between predictors and use an AIC model selection procedure to reduce over-fitting to input data. RFs tend to over-fit to data, which is good for interpolating missing values, but poor for extrapolating. ANNs generally require large amounts of data to parameterise adequately, and therefore may perform less well for smaller data sets. Regardless of their individual advantages or disadvantages, these methods are in active applied use, so our analyses give a valid assessment of contemporary analytical techniques.

The first three techniques were implemented using the biomod package in the program R, a commonly used tool for calibrating and testing these models (R Development Core Team, 2009; Thuiller et al. 2009). We generated GAM models with cubic-smooth splines bounded by a degree of smoothness of four for each environmental predictor; a selection procedure using the AIC criterion was used to identify the most parsimonious combination of terms (Akaike 1974). We parameterised ANN models using seven hidden units within a single hidden layer, with a weight decay equal to 0·03. We generated RF models by growing 500 trees with (total number of predictors −1) predictors randomly chosen at each node. MAXENT was implemented as a standalone program (Phillips, Anderson & Schapire 2006). Except for increasing the maximum number of iterations to 5000 to allow model convergence, we used the default settings, allowing relationships between explanatory and response variables to be described by linear, quadratic, product, threshold and hinge relationships.

For each species, all four SDMs were fitted using species detection/non-detection data and the environmental layers described earlier. Butterfly models were fitted using 1-km resolution gridded data, but bird detection/non-detection data were only available at tetrad resolution; hence, bird models were fitted at this resolution. Most models gave reasonable goodness-of-fit when validating predictions against the data used to build the models (mean AUC score across butterfly and bird models was 0·85 ± 0·014; Tables S3 and S4, Supporting information). Each model was then used to produce a probability of occurrence surface across Britain, that is, given the climate and land cover in any particular grid cell, the model produced an estimate of how suitable that cell is likely to be for each species. From this probability of occurrence surface, we calculated the mean of all values in grid cells within a given radius around each monitoring site, to give a measure of local probability of occurrence. Local probability of occurrence estimates were calculated at spatial scales of 2, 5 and 10-km radii around each site, as these are distances at which landscape-scale conservation might feasibly be considered. To standardise across models, values for each model at each spatial scale were centred by subtracting the mean and dividing by the standard deviation to scale to unit variance. The probability of occurrence estimates at different spatial scales were highly correlated (Pearson’s correlation coefficient for probability of occurrence estimates at different spatial scales ranged from 0·83 to 0·97). Therefore, results are qualitatively similar at the different spatial scales and here we focus only on the results at the smallest scale of 2-km radius around sites, because this scale gave the best fit to the population monitoring data.
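The local averaging and standardisation steps can be sketched as follows. This is an illustration only: the rule that a grid cell is included when its centre lies within the radius, the function names and the toy 10 × 10 km surface are assumptions, since the text does not specify how cells on the boundary were handled.

```python
import math
import statistics


def local_po(surface, site, radius_km):
    """Mean probability of occurrence over cells whose centre lies within
    radius_km of the site (inclusion rule assumed for this sketch)."""
    sx, sy = site
    values = [p for (x, y), p in surface.items()
              if math.hypot(x - sx, y - sy) <= radius_km]
    return statistics.mean(values)


def standardise(values):
    """Centre on the mean and scale to unit variance."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]


# Toy 1-km probability of occurrence surface: keys are cell centres (km),
# values increase from west to east. All numbers are made up.
surface = {(x + 0.5, y + 0.5): x / 10 for x in range(10) for y in range(10)}
sites = [(2.0, 5.0), (5.0, 5.0), (8.0, 5.0)]

raw = [local_po(surface, site, 2.0) for site in sites]
scaled = standardise(raw)
```

Standardising within each model and spatial scale, as in the text, puts the scores from different SDMs on a common scale before they are used as explanatory variables.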

Statistical Analysis

First, we tested whether probability of occurrence scores from SDMs could significantly predict population density or stability. Probability of occurrence estimates from the different SDMs were often strongly correlated (Pearson’s correlation coefficient > 0·8); therefore, we fitted separate statistical models, each with probability of occurrence estimates from one of the four different SDMs in turn as the explanatory variable (e.g. eqn 1). Our response variable was either log mean population density or population stability (−RPV), calculated from population monitoring data as detailed earlier. Initially, data for all species and sites were fitted in the same model using a mixed modelling approach with ‘Species’, ‘Site’ and ‘50-km Region’ as random effects. The ‘50-km Region’ random effect accounted for non-independence of data from sites within the same 50-km region. A 50-km region was chosen because we previously fitted models without a Region random effect and found evidence of some weak spatial autocorrelation. From visual inspection of correlograms, this was only apparent for sites <50 km apart.

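$\check{N}_{ij} = \alpha + \beta\,\mathrm{PO}_{ij} + \mathrm{Species}_i + \mathrm{Site}_j + \mathrm{Region}_q + \varepsilon_{ij}$ (eqn 1)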

where $\check{N}_{ij}$ is the log mean population density of species $i$ at site $j$ in 50-km region $q$, and PO is the probability of occurrence at 2-km radius for species $i$ around site $j$, estimated using an SDM. From these models, we used the magnitude of the t-values (effect size divided by standard error) to judge the strength of the association between probability of occurrence scores and population density or stability. Note that, because comparisons are of equal sample size and with the same degrees of freedom, the t-value is a valid way to compare strength of association. Next, we tested whether there were differences in relationships between probability of occurrence scores and population density and stability for birds vs. butterflies. Therefore, to the models above (i.e. one for each SDM), we added an interaction term between species group (birds or butterflies) and probability of occurrence. In many cases, there was a significant interaction, so we next fitted the models above to each species group separately. Spatial autocorrelation in the site-level residuals was tested for with spline correlograms using the ncf package in the program R (R Development Core Team 2007; Bjørnstad 2009). In no cases were significant spatial patterns in the residuals evident.

Finally, we fitted data for each species in a separate statistical model to examine variation in relationships among species. In this instance, for each species, we fitted a linear regression between either log mean density or population stability (−RPV) against the probability of occurrence scores from each SDM (i.e. eight models were fitted in total for each species). Outputs from these models indicated whether each individual species showed a significant trend between probability of occurrence from SDMs and population density or stability from monitoring data. Slope coefficients for the relationships were extracted and plotted as histograms. A one-sample Wilcoxon signed-ranks test was used to determine whether the grouped coefficients for each species group differed significantly from zero.
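As an illustration of this last step, the sketch below computes a least-squares slope and then applies a one-sample Wilcoxon signed-ranks test to a set of slopes. The function names and the slope values are hypothetical. The P-value is obtained by exact enumeration of all sign assignments, which is feasible for 10 species (1024 combinations); zero slopes are dropped, which is an assumption of the sketch.

```python
from itertools import product


def slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def signed_rank_p(values):
    """One-sample Wilcoxon signed-ranks test against zero.

    Returns (W+, exact two-sided P). Ties in absolute value receive
    average ranks; zeros are discarded.
    """
    vals = [v for v in values if v != 0]
    order = sorted(range(len(vals)), key=lambda i: abs(vals[i]))
    ranks = [0.0] * len(vals)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(vals[order[j + 1]]) == abs(vals[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank of the tie group
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, vals) if v > 0)
    # Null distribution of W+: every rank is positive or negative with equal chance.
    null = [sum(r for r, s in zip(ranks, signs) if s)
            for signs in product((0, 1), repeat=len(vals))]
    lower = sum(w <= w_plus for w in null) / len(null)
    upper = sum(w >= w_plus for w in null) / len(null)
    return w_plus, min(1.0, 2 * min(lower, upper))


# Hypothetical slope coefficients for 10 species (made-up values).
species_slopes = [0.9, 0.4, 1.2, 0.3, 0.8, 0.5, 1.1, 0.7, 0.6, 1.0]
w_plus, p_value = signed_rank_p(species_slopes)
```

With 10 species, the smallest attainable two-sided P-value is 2/1024, reached only when every slope has the same sign, so the test is sensitive mainly to the consistency of direction across species rather than to the size of any single slope.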

Results

Multispecies Models

Densities of butterfly and bird species were strongly positively correlated with probability of occurrence scores derived from all four distribution modelling methods. The GAM SDM was best for predicting population density, closely followed by the ANN technique. From an analysis with all species combined, t-values for the relationship between log mean density and probability of occurrence ranged from 38·9 to 59·5, depending on which SDM was used (Fig. 1, Table S5, Supporting information). In contrast, population stability over time was only weakly positively correlated with probability of occurrence scores. From the analysis with all species combined, t-values for the relationship between stability (−RPV) and probability of occurrence ranged from 1·01 to 4·03, depending on which SDM was used (Fig. 1, Table S5, Supporting information).

Figure 1.

The ability of 2-km radius probability of occurrence estimates from species distribution models (SDMs) to predict population density and population stability (−RPV) for 10 bird and 10 butterfly species. For each SDM, absolute t-values are shown from the combined analysis considering all species together. The dashed line is at t = 2, the approximate t-value at which trends are significant at P < 0·05 for studies with a large sample size such as this (n = 12 971 data points for all species and sites combined for each SDM).

When the effect of an interaction term between species group (birds or butterflies) and probability of occurrence on the population response variables was considered, it was evident that there were often differences between butterflies and birds. For all four distribution models, the interaction term was significant (P < 0·05 in all cases). Hence, we repeated our initial analyses treating bird and butterfly species separately. Both species groups showed significant positive correlations between probability of occurrence and mean density in all cases, but the association was much stronger for birds than for butterflies (Table 2, Fig. S1, Supporting information). In contrast, there were no significant interactions between the two species groups in the effect of probability of occurrence on population stability (P > 0·05 in all cases). Probability of occurrence estimated from SDMs was in nearly all cases a poor, non-significant fit to population stability (Table 2, Fig. S1, Supporting information). Only the RF model, and to a lesser extent the ANN model, fitted to the bird distribution data gave a significant association with population stability. In these cases, landscapes estimated as more suitable by the SDMs hosted bird populations that were more stable. In all other cases, the associations were non-significant, but the relationship coefficients all shared the same positive sign, indicating qualitatively similar, but weaker, predictive ability (Table 2, Fig. S1, Supporting information).

Table 2.   Relationships between 2-km probability of occurrence scores produced by four different species distribution models (SDMs) and population density and stability (−RPV) for 10 bird and 10 butterfly species (total number of species–site combinations = 1542 and 11 429, respectively). The two species groups are treated separately in these analyses
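| Species group | SDM | Density: coefficient | Density: SE | Density: t | Stability: coefficient | Stability: SE | Stability: t |
|---|---|---|---|---|---|---|---|
| Butterflies | MAXENT | 0·38 | 0·05 | **7·44** | 0·004 | 0·009 | 0·51 |
| Butterflies | ANN | 0·33 | 0·06 | **5·76** | 0·008 | 0·007 | 1·14 |
| Butterflies | GAM | 1·08 | 0·14 | **7·47** | 0·014 | 0·018 | 0·79 |
| Butterflies | RF | 0·87 | 0·09 | **9·64** | 0·004 | 0·014 | 0·29 |
| Birds | MAXENT | 0·68 | 0·02 | **41·12** | 0·003 | 0·003 | 0·92 |
| Birds | ANN | 0·91 | 0·01 | **63·31** | 0·007 | 0·003 | **2·12** |
| Birds | GAM | 1·21 | 0·02 | **65·78** | 0·004 | 0·003 | 1·37 |
| Birds | RF | 0·60 | 0·01 | **42·83** | 0·012 | 0·003 | **4·03** |

  1. Statistically significant relationships (P < 0·05) are indicated by t-values in bold font.

  2. ANN, artificial neural networks; GAM, generalised additive models; MAXENT, maximum entropy; RF, random forests; RPV, residual population variability.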

Single Species Models

We also fitted each species in a separate statistical model to examine variation in relationships among species. We present results from the RF model, which was the best of the four models for predicting stability (for birds at least; Table 2). For butterflies, eight of the 10 species showed positive relationships between probability of occurrence and density, seven of which were significant (P < 0·05; Fig. 2, panel a). In contrast, no species showed a significant relationship between probability of occurrence and population stability (Fig. 2, panel b). Example relationships between probability of occurrence and density or RPV are shown in Fig. 3. Full tables of species coefficients can be found in Tables S6 and S7 (Supporting information). For birds, all species showed significant positive relationships between probability of occurrence and density (Fig. 2, panel c). In addition, all species showed positive relationships between probability of occurrence and population stability, four of which were significant (Fig. 2, panel d).

Figure 2.

Relationships between probability of occurrence scores and population density or stability for individual bird or butterfly species. Shown are the slope coefficients between probability of occurrence scores (predicted from a random forest model for 2-km radii around sites) and population density (panels a and c) or population stability (panels b and d). Species slope coefficients that are significantly different from zero (vertical dashed line) are shown as black bars. The overall significance of the pooled coefficients in each panel is indicated by asterisks (***P < 0·001; *0·01 < P < 0·05; NS, P > 0·05).

Figure 3.

Example relationship between probability of occurrence scores from a species distribution model (SDM) and (a) log mean population density and (b) population stability (−RPV) for the butterfly species Aricia agestis (Schiff.). This species was chosen because the relationships have approximately intermediate goodness-of-fit and the same qualitative trends generally shown across species. Probability of occurrence was estimated using a random forest SDM to predict mean probability of occurrence at 2-km radius around sites. The dashed line indicates a significant relationship (P < 0·05; see Tables S6 and S7, Supporting information for summary statistics).

The goodness-of-fit of these single species regressions was always low, but was better for the relationship between probability of occurrence and density (R² values ranged from 0·02 to 0·37, mean = 0·13; Table S6, Supporting information) than for the relationship with stability (R² values ranged from <0·001 to 0·040, mean = 0·006; Table S7, Supporting information). Testing the significance of overall trends across species using a one-sample Wilcoxon signed-ranks test on slope coefficients gave qualitatively similar results to our multispecies mixed effects model (asterisks indicate significance of overall trends in Fig. 2).
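As an illustration of the across-species test described above, the following is a minimal sketch of an exact, two-sided, one-sample Wilcoxon signed-ranks test against zero. It is not the authors' code: the function name `signed_rank_test` and the slope values are hypothetical, and the treatment of zeros and ties (zeros dropped, average ranks) is an assumption.

```python
from itertools import product


def signed_rank_test(values):
    """Exact two-sided one-sample Wilcoxon signed-ranks test against zero.

    Returns (w_plus, p_value). Zeros are dropped and tied absolute values
    receive average ranks. All 2**n sign assignments are enumerated, so
    this is only practical for small samples such as 10 species.
    """
    x = [v for v in values if v != 0]
    n = len(x)
    if n == 0:
        raise ValueError("need at least one non-zero value")

    # Rank the absolute values, averaging ranks across ties.
    order = sorted(range(n), key=lambda i: abs(x[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(x[order[j + 1]]) == abs(x[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    # Test statistic: sum of ranks belonging to positive values.
    w_plus = sum(r for r, v in zip(ranks, x) if v > 0)
    expected = sum(ranks) / 2
    observed_dev = abs(w_plus - expected)

    # Under the null, each rank is equally likely to carry either sign.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed_dev - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Hypothetical slope coefficients for 10 species (illustration only).
slopes = [0.9, 1.2, 0.4, 0.7, 1.5, 0.3, 1.1, 0.6, -0.2, -0.1]
w_plus, p_value = signed_rank_test(slopes)
```

Because the test uses only the signs and ranks of the slopes, it asks whether slopes are consistently positive across species without assuming that they are normally distributed.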

Discussion

In this study, we produced a derived population parameter (stability about a long-term population trajectory) that was independent of mean density, yet related to extinction risk. This measure may be a useful indicator of population persistence for applied conservation. We tested whether predictions of occurrence from SDMs built using species occupancy data were associated with population density and stability from independent data sets. We found that all four SDMs we tested produced probability of occurrence surfaces that were strongly positively correlated with mean population density at monitoring sites. In contrast, the stability of populations over time was poorly predicted by the SDMs.

Models fitted to bird data generally produced a better fit between probability of occurrence from the SDMs and population density than models fitted to butterfly data. This might be due to differences in the biology between the two groups, but it could also simply be due to differences in the monitoring methodology or the different spatial resolution at which SDMs were fitted. It is clear, however, that with regard to the ability of SDMs to predict population density or stability, the two groups show qualitatively similar results.

The fact that outputs from SDMs were strongly correlated with population density is encouraging because density is a key factor for population persistence (Pimm, Jones & Diamond 1988). Previous studies have found only equivocal relationships between population density and predictions from SDMs built using detection/non-detection data (Pearce & Ferrier 2001; Nielsen et al. 2005; Jiménez-Valverde et al. 2009). The goodness-of-fit of the single species relationships we tested was not very high (mean R² = 0·13). Hence, for any given probability of occurrence score, we could not predict density with confidence. This is to be expected as many processes that affect mean population density (such as variation in habitat quality within biotope patches) were not included in our SDMs. We can conclude, however, that landscapes that are predicted to be highly suitable by SDMs should, on average, host larger populations. Empirical and theoretical evidence suggests that such populations will be more resilient to demographic and environmental stochasticity (Pimm, Jones & Diamond 1988). The fact that landscapes estimated as more suitable by SDMs also host denser populations makes sense, because they have larger areas of the specific biotopes that are associated with species presence, and the amount of suitable habitat is also a key factor limiting population size. In addition, high probabilities of species occurrence will be estimated in regions that are most climatically suitable. Studies of abundance and distribution often find that in the most climatically suitable areas, at the centre of the range, species populations are denser (Brown 1984; Sagarin & Gaines 2002).

The ability of SDMs to predict population stability was extremely poor, especially for butterflies. For only two model types (RF and ANN) and only for the bird data were the probability of occurrence surfaces produced by the SDM significantly associated with population stability; but, even in these cases, the R² values were very low. For these models, landscapes predicted to have a higher probability of species occurrence also hosted more stable populations. Given that our measure of population stability accounts for the population size at any given site (i.e. by taking residuals from the mean population abundance–variability relationship), this result is unlikely to arise simply because these landscapes support larger populations, with larger populations being more stable than smaller populations (cf. Taylor's power law; Taylor 1961). Instead, the positive relationship between population stability and probability of occurrence from SDMs might be because larger areas of favourable biotopes are less vulnerable to spatially limited disturbance or habitat degradation than smaller patches. Also, increasing habitat area can lead to a surpassing of minimum threshold areas below which fragmentation and isolation have a highly detrimental effect on population persistence (Andrén 1994). Finally, it may be that larger areas of favourable biotopes contain more intrinsic heterogeneity, which increases the stability of populations in the face of environmental stochasticity (Weiss, Murphy & White 1988; Kindvall 1996).
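The idea of a stability measure that is independent of mean density can be sketched as follows. This is an illustrative reconstruction only: the exact variability measure and fitting procedure are defined in the Methods, and here we assume, hypothetically, that variability is the between-year variance of counts, that the abundance–variability relationship is a straight line on log–log axes fitted by ordinary least squares across sites, and that RPV denotes the residual from that line. The function name `stability_scores` and the counts are invented for the example.

```python
import math
from statistics import mean, variance


def stability_scores(site_counts):
    """Return one stability score per site (higher = more stable).

    Each score is minus the residual from an ordinary least-squares fit of
    log(variance) on log(mean) across sites. Requires positive means and
    variances, and at least two sites with different mean counts.
    """
    log_mean = [math.log(mean(counts)) for counts in site_counts]
    log_var = [math.log(variance(counts)) for counts in site_counts]

    # Ordinary least-squares fit of log variance on log mean.
    mx, my = mean(log_mean), mean(log_var)
    sxx = sum((x - mx) ** 2 for x in log_mean)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log_mean, log_var))
    slope = sxy / sxx
    intercept = my - slope * mx

    # Residual = variability beyond that expected for the site's mean
    # abundance; stability is its negative.
    return [-(y - (intercept + slope * x)) for x, y in zip(log_mean, log_var)]


# Hypothetical annual counts at four sites (illustration only).
sites = [
    [10, 12, 9, 11, 10],        # small population, steady
    [10, 20, 3, 15, 4],         # small population, variable
    [100, 110, 95, 105, 98],    # large population, steady
    [100, 180, 40, 150, 38],    # large population, variable
]
scores = stability_scores(sites)
```

Because the scores are residuals from the fitted line, they average zero and are uncorrelated with log mean abundance, so a large population does not score as stable merely because it is large.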

That habitat heterogeneity can be important for maintaining stability may explain the generally poor fit between the probability of occurrence scores from SDMs and population stability. The occurrence and density of species populations are maximised by larger areas of the best habitat types. Marginal habitat types, in contrast, may have little effect on increasing the probability of occurrence or mean density. In the face of climatic extremes, however, marginal habitats may provide refuges for populations. Hence, a diverse range of habitat types, including both the ‘best’ habitats and marginal habitats, may promote population stability and persistence (Piha et al. 2007; Oliver et al. 2010). As an example, thermophilous insect species may prefer short-turf, south-facing hillsides in high-latitude regions in years of average climate (Davies et al. 2006). Hence, area of this habitat type will be the best predictor of occurrence and density (Davies et al. 2005). In rare, exceptionally warm and dry years, however, these habitat types may become unsuitable and populations can only persist if taller turf and cooler topographies are also available (Weiss, Murphy & White 1988; Davies et al. 2006). Hence, the availability of a range of marginal habitat types may be an important predictor of stability, yet these landscape characteristics are not selected by SDMs that only optimise fit to detection/non-detection data.

In addition to the influence of landscape heterogeneity on population stability, the configuration of different habitat types may also be important. Most SDMs do not include habitat configuration as an explanatory variable (just the area of different biotopes); even if they did, the models just optimise fit to species’ detection/non-detection data. As with our discussion of habitat heterogeneity above, landscape structures that promote patch occupancy are likely to be different from those that promote population stability.

An alternative explanation for the weak relationships between population stability and probability of occurrence scores from SDMs may be limitations of the input data. First, detectability of species may vary between habitat types, which would potentially cause errors in our measures of species presence, density and stability and in their inter-relationships (MacKenzie et al. 2006). However, in the light of our familiarity with the recording methodologies, we believe that differential detectability between habitat types is unlikely to be a major problem. A greater problem may be that our SDMs are built using a single ‘time-slice’ of environmental data, which also had some temporal mismatch with the bird monitoring data. In theory, it would be possible to compare the change in probability of occurrence from SDMs built over two or more discrete periods, thereby assessing the effects of temporal change in environmental factors. With this more refined approach, population stability over time might be better predicted using SDMs. It would be worthwhile testing this hypothesis if and when data become available. Currently, however, we have used the best available environmental and monitoring data. Species atlases are produced when most sampling areas are deemed to have sufficient recorder effort, which can take a number of years. In addition, remotely sensed land cover maps are produced at discrete intervals, often depending on funding and processing time, and, because of changes in earth observation methodology, direct comparison between maps is not always feasible. Therefore, researching temporal non-stationarity in species occurrence–environmental relationships was not possible here, but in the future would be aided by better co-ordination between remote sensing and species monitoring schemes, and better comparability between land cover map versions. However, in defence of the approach here, we have used SDMs in the way that the vast majority of researchers use them (because of similar data limitations). Hence, this study is a pragmatic test of the way that SDMs are currently used and whether the outputs are correlated with population parameters from independent monitoring schemes.

To conclude, we find that SDMs built using species detection/non-detection data can produce probability of occurrence estimates that are reasonable predictors of population density. This is reassuring because these models are being used increasingly in applied conservation, and population density, rather than simple species occurrence, is an important parameter of persistence in the face of demographic and environmental stochasticity. Less encouraging is the result that probability of occurrence estimates from SDMs are generally poor predictors of population stability; especially as we show that, in addition to density, population stability is a significant factor reducing the risk of local extinction.

Our results suggest that, to understand species persistence under rapid environmental change, count data from standardised monitoring schemes are an invaluable resource. These data provide additional insights into the factors affecting species’ extinction risks, which cannot easily be inferred from species’ occupancy data. An over-reliance on the results from SDMs built using occupancy data could potentially lead to an exclusive focus on landscape structures that promote patch occupancy and density, but miss features important for population stability. Other landscape metrics that take into account habitat heterogeneity or configuration, and the development of temporally explicit SDMs, may be required to predict population stability and the additional variance in extinction risk that this parameter can explain. We suggest that future work should give more emphasis to identifying the landscape attributes that contribute to population stability, to provide a sound evidence base for management actions to reduce species extinction risk in the face of rapid environmental change.

Acknowledgements

We are grateful to all the volunteers for collecting and submitting data to the UKBMS and BBS monitoring schemes and to the two atlas projects. The UKBMS is funded by a multi-agency consortium led by Defra, and including CCW, JNCC, FC, NE, NERC, NIEA, SNH (see Supporting information for abbreviations). The BBS is a partnership scheme between the BTO, JNCC and RSPB. Work by the BTO was undertaken through a Partnership jointly funded by BTO and JNCC (on behalf of CCW, NE, CNCC and SNH). JNCC and NE contributed funding towards this study. We thank Chris Cheffings at JNCC for helpful discussion.
