Bias from heterogeneous usage of space in spatially explicit capture–recapture analyses

Summary

  1. Royle et al. (Methods in Ecology and Evolution, 2013, 4, 520) proposed a spatially explicit capture–recapture (SECR) model in which an animal's usage of a site, and hence its probability of detection, depends on a function of site-specific covariates normalized using a weighted sum of such values across the animal's home range.
  2. From simulations supposedly based on the model, they drew the conclusion that existing methods will produce ‘extremely biased’ estimates of population size when animals use space selectively. This conclusion is faulty because they simulated data from a different model, omitting the normalization needed to represent selection of resources at the home-range level.
  3. New simulations show that the null SECR estimator of population size is nearly unbiased for low to moderate levels of selective space use when the generating model includes normalization. Including detector-level covariates of detection, as allowed in standard software, nearly eliminates bias due to strongly selective space use, whether or not the generating model includes normalization.

Introduction

In a recent paper, Royle et al. (2013) developed a spatially explicit capture–recapture (SECR) model that explicitly links individual space use and the probability of detecting an animal in a detector (trap) at a particular point. Space use may itself be modelled as a function of spatial covariates using what is termed a ‘resource selection function’ (RSF), and this is an attractive way to jointly model SECR data and more intensive observations from telemetry. Royle et al. (2013) drew the secondary conclusion that ‘...an ordinary [SECR] model with symmetric encounter probability model produces extremely biased estimates of [population size] N when the population of individuals exhibits resource selection...’. Biologists who believe that resource selection is a near-universal phenomenon may reasonably infer that SECR is unreliable except when combined with resource selection models. I show that this inference is mistaken and note that a model with spatial covariates of detection (Royle et al. 2013) may be fitted with existing software.

Model of space use and detection

Royle et al. (2013) use a home-range utilization model that combines a symmetric (radial) decline in usage with selection in response to one or more spatial covariates. For an individual centred in pixel s, and a single resource covariate z(x), the proportion of time (i.e. use) in pixel x determined jointly by the radial and selective components is

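$$\pi(x \mid s) = \frac{\exp\{-\alpha_1 d(x,s)^2 + \alpha_2 z(x)\}}{\sum_{x'} \exp\{-\alpha_1 d(x',s)^2 + \alpha_2 z(x')\}} \qquad \text{(eqn 1)}$$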

where α1 = 1/(2σ²) and α2 are fitted coefficients and d(x, s) is the distance between x and s (Royle et al. 2013, eqn 4; I refer readers to the original for further detail). The cumulative hazard λ(x|s) of detecting the animal at x in a particular time interval, given usage as in eqn 1, is controlled by a further parameter α0: λ(x|s) = exp(α0)π(x|s), and the corresponding probability of detection is

$$P(x \mid s) = 1 - \exp\{-\lambda(x \mid s)\} \qquad \text{(eqn 2)}$$

The denominator in eqn 1 serves to normalize pixel-specific usage values for any individual centred at s; it is essentially the resource surface smoothed using the home-range kernel −α1d(x, s)², and therefore represents the total resource available from any point on the landscape, weighted inversely by distance. The normalized resource selection function is a probability density (i.e. ∑x π(x|s) = 1) (Royle et al. 2013). This property is not essential for modelling detection probability in SECR as the transformation in eqn 2 accepts any positive value of λ, and the intercept of P is controlled by α0.
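As a minimal sketch of the calculation described above, the following Python code evaluates usage and detection probability on a pixel grid, following the verbal description of eqns 1 and 2 (a Gaussian distance term plus α2z(x), divided by its sum over pixels). The function name `usage_and_detection`, the uncorrelated placeholder covariate and the unit pixel spacing are assumptions made for illustration; this is not the code of Royle et al. (2013).

```python
import numpy as np

def usage_and_detection(z, s, alpha0, alpha1, alpha2, normalize=True):
    """Return (pi, P) over all pixels for an animal centred at pixel s.

    z is a 2-D array of covariate values; s is a (row, col) tuple.
    Distances are measured in pixel units (an assumption of this sketch).
    """
    rows, cols = np.indices(z.shape)
    d2 = (rows - s[0]) ** 2 + (cols - s[1]) ** 2   # squared distance d(x, s)^2
    w = np.exp(-alpha1 * d2 + alpha2 * z)          # numerator of eqn 1
    pi = w / w.sum() if normalize else w           # divide by the denominator of eqn 1
    lam = np.exp(alpha0) * pi                      # cumulative hazard of detection
    P = 1.0 - np.exp(-lam)                         # eqn 2
    return pi, P

rng = np.random.default_rng(1)
z = rng.standard_normal((80, 80))   # placeholder covariate, not autocorrelated
sigma = 7.5
pi, P = usage_and_detection(z, (40, 40), alpha0=3.5,
                            alpha1=1 / (2 * sigma**2), alpha2=1.0)
```

With `normalize=True` the usage values sum to one over the grid; with `normalize=False` the same function returns the unnormalized quantity, which eqn 2 still maps to a valid probability.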

Rather, the reason for normalizing is to model the selection of resources as relative to the resources available within the particular home range of each animal. It is critical when modelling resource selection to define availability appropriately (e.g. Johnson 1980; Manly et al. 2002; Buskirk & Millspaugh 2006; Johnson et al. 2008; Forester, Im & Rathouz 2009). Royle et al. (2013, p. 522) emphasize the importance of both normalization and availability without directly connecting the two. For example, they state that eqn 1 (their eqn 4) ‘...recognizes that apparent space usage is a product of both availability (governed here by a Gaussian kernel) and use conditional on availability’. However, they omit normalization without comment when describing the capture–recapture likelihood (p. 524 and their Appendix S1) and in their own R code and capture–recapture simulations (their Appendix S2 and S3).

Without normalization, eqn 1 corresponds to a model with a linear covariate on the log scale that is specific for a detector located at x and depends on s only in the distance term. Individuals with home ranges overlapping x experience a probability of detection dependent only on z(x) and the distance d(x, s), and equidistant animals have the same probability of detection. However, the original model (eqn 1) envisaged that usage of x by equidistant animals would depend also on the mix of resources (other pixels) available to each animal within its home range. An individual in a generally ‘poor’ home range is thus more likely to use a pixel of ‘medium’ quality than an individual in a generally ‘rich’ home range. When combined with positive spatial autocorrelation in pixel quality, this tends to damp realized variation in pixel-specific detection probability: ‘poor’ pixels do not look so bad when the choice is other ‘poor’ pixels, and conversely for ‘rich’ pixels.
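The contrast between the two models can be illustrated with a toy landscape. The sketch below is hypothetical throughout (a two-habitat map, σ = 3 pixels, α2 = 1, and the helper `usage`); it places two animals at equal distances from a ‘medium’ detector pixel, one on the poor side and one on the rich side, and compares their usage of that pixel with and without normalization.

```python
import numpy as np

# Hypothetical landscape: left half poor (z = -1), right half rich (z = +1),
# separated by a one-pixel-wide neutral column (z = 0) that holds the detector.
z = np.ones((41, 41))
z[:, :20] = -1.0
z[:, 20] = 0.0
detector = (20, 20)
alpha1, alpha2 = 1 / (2 * 3.0**2), 1.0   # assumed sigma = 3 pixels

def usage(s, normalize):
    """Usage of the detector pixel by an animal centred at pixel s."""
    rows, cols = np.indices(z.shape)
    d2 = (rows - s[0]) ** 2 + (cols - s[1]) ** 2
    w = np.exp(-alpha1 * d2 + alpha2 * z)
    if normalize:
        w = w / w.sum()          # home-range level normalization
    return w[detector]

s_poor, s_rich = (20, 14), (20, 26)   # both centres are 6 pixels from the detector

# Unnormalized: the two animals use the detector pixel equally.
print(usage(s_poor, False), usage(s_rich, False))
# Normalized: the animal in the poor neighbourhood uses the detector pixel more.
print(usage(s_poor, True), usage(s_rich, True))
```

The two centres are mirror images about the detector column, so any difference in the normalized values arises only from the habitat surrounding each animal.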

The differing patterns of space use represented by models with and without normalization are shown most clearly by focussing on the usage of a single ‘detector’ pixel by animals centred nearby but in differing habitats. I illustrate this with an example based on an autocorrelated habitat variable as in Royle et al. (2013), simulated as a Gaussian random field over a map of 80 × 80 pixels using a spatial scale parameter of five pixels (Fig. 1a). Detection probability was predicted from eqn 2 with α1 = 1/(2 × 7·5²) and α2 = 1. The intercept parameter was varied to achieve similar overall rates of detection (α0 = 3·5 in the normalized example and α0 = −4 in the unnormalized example). Figure 1b shows the asymmetric variation in use probability of a central pixel predicted by the normalized model as a function of the home-range centroid of a focal animal, and Fig. 1c shows the symmetric pattern of use predicted by the unnormalized model.

Unfortunately, normalization makes the SECR detection model unwieldy, and substantial additional processing is required for each evaluation of the likelihood. The formulation also raises problems of parameter interpretation. If we use C for the denominator in eqn 1, then P(x|s) = 1 − exp{−exp[α0 − α1d(x, s)² + α2z(x) − log(C)]}. For constant C, this is a linear model on the complementary log–log scale (Royle & Gardner 2011), with ‘baseline encounter rate’ in the sense of Royle & Gardner (2011) equal to exp[α0 − log(C)]. However, C is not constant, but depends nonlinearly on α1, α2 and the mix of habitat near point s. In consequence, the detection model is no longer linear on the complementary log–log scale, and the baseline encounter rate is not well defined.
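Written out step by step, and writing the denominator as C(s) to make its dependence on the home-range centre explicit (a notational choice made here, not in the original), the argument above is:

```latex
\begin{aligned}
P(x \mid s) &= 1 - \exp\{-\lambda(x \mid s)\},
  \qquad \lambda(x \mid s) = \exp(\alpha_0)\,\pi(x \mid s), \\
\log\{-\log[1 - P(x \mid s)]\} &= \alpha_0 - \alpha_1 d(x,s)^2 + \alpha_2 z(x) - \log C(s), \\
C(s) &= \sum_{x'} \exp\{-\alpha_1 d(x',s)^2 + \alpha_2 z(x')\}.
\end{aligned}
```

The first three terms on the right of the second line are linear in the coefficients; the final term is not, because C(s) itself contains α1 and α2 inside a sum of exponentials that changes with s.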

Effect of normalizing π(x|s) on bias in N̂

I simulated data sets with and without normalization of π(x|s) in the generating model to assess the effect of fitting unnormalized models on bias in N̂. In scenarios with normalization, the value of α0 was increased by log(2πσ²) to maintain approximately the same number of detected individuals; scenarios otherwise followed Royle et al. (2013). Normalization introduced an edge effect (Appendix S2) that was reduced by adding an extra buffer of 1·5σ (three pixels) on all sides of the square region in which animals were simulated. Two models were fitted: a null SECR model and a model with a linear detector-specific covariate on the link scale (SCR and SCR+p(z) in the notation of Royle et al. 2013).
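One way to see why log(2πσ²) is a natural offset: in a featureless landscape (z(x) = 0 everywhere) with unit pixels, the normalizing sum is a discrete approximation to the integral of the Gaussian kernel, which equals 2πσ². The short check below assumes σ = 2 pixels (implied by 1·5σ being three pixels) and a hypothetical grid half-width of 40 pixels, far enough from the edge that truncation is negligible.

```python
import numpy as np

sigma = 2.0                      # pixels; 1.5 * sigma = 3 pixels, as in the text
alpha1 = 1 / (2 * sigma**2)
half = 40                        # assumed half-width of the grid, in pixels
offsets = np.arange(-half, half + 1)
dx, dy = np.meshgrid(offsets, offsets)

# Normalizing sum for an animal at the grid centre when z(x) = 0
C_flat = np.exp(-alpha1 * (dx**2 + dy**2)).sum()

print(C_flat, 2 * np.pi * sigma**2)   # the two values should agree closely
```

Adding log(2πσ²) to α0 therefore approximately cancels the −log(C) term for animals far from the edge of a homogeneous landscape; near the edge the sum is truncated, which is consistent with the edge effect noted above.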

Fitting a null model to data generated without normalizing π(x|s) resulted in an estimated relative bias of N̂ of −16·6% (SE 0·4%) for α2 = 1. Including normalization in the generating model reduced the magnitude of this bias to −5·7% (SE 0·5%). Fitting a model with a detector-level covariate for λ0 resulted in a small positive bias in the unnormalized case (+1·3%, SE 0·4%) that increased slightly when the generating model included normalization (+1·9%, SE 0·5%).
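For readers reproducing such summaries, the sketch below shows the usual way of estimating relative bias and its standard error from a set of replicate estimates. The definition used (mean of (N̂ − N)/N, with SE from the replicate standard deviation) is the conventional one and is assumed here rather than taken from the text; the function name and the input values are made up for illustration and are not results from this study.

```python
import numpy as np

def relative_bias(n_hat, n_true):
    """Estimated relative bias of N-hat and its standard error across replicates."""
    n_hat = np.asarray(n_hat, dtype=float)
    rb = (n_hat - n_true) / n_true               # per-replicate relative error
    se = rb.std(ddof=1) / np.sqrt(rb.size)       # SE of the mean relative error
    return rb.mean(), se

# Made-up estimates for illustration only; not results from the paper.
rb, se = relative_bias([180, 195, 170, 188, 176], n_true=200)
print(f"RB = {100 * rb:.1f}% (SE {100 * se:.1f}%)")
```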

Royle et al. (2013, Appendix S3, Table 2) reported greater bias from a misspecified null model with α2 = 1 (−18·8% to −19·5%) than my average (−16·6%). Their simulations included telemetry data for 2–16 animals and appear to have been based on a single realization of the spatial covariate (Royle et al. 2013; Fig. 1), whereas I drew a new value of the Gaussian random field for each replicate. This is sufficient to explain the discrepancy.

Figure 1.

Example of habitat selection in a heterogeneous landscape. (a) habitat covariate z(x) (green = low quality, white = high quality), (b) normalized selection model (eqn 1) and (c) unnormalized selection model. In (b) and (c), the use probability P(x|s) of a central pixel (‘+’) is shown as a function of the home-range centroid s of a focal individual. The red circle in each plot indicates a set of locations equidistant from the central pixel in which two points are identified: s1 in a low-quality neighbourhood and s2 in a high-quality neighbourhood; under the normalized resource selection model (b), the probability of use of the central pixel is high for s1 and low for s2, whereas the unnormalized model (c) used for SECR by Royle et al. (2013) did not differentiate according to the habitat near s.

Bias and the magnitude of spatial heterogeneity

I have considered the particular scenario of Royle et al. (2013) in which the habitat covariate was a standard Gaussian random field z(x) on the link scale, spatial autocorrelation was controlled by an exponential covariance kernel (V = exp(−rij/w), where rij is the distance between points i and j, and w = 5 pixels), and α2 = 1·0. Under these conditions, the absolute value of the relative bias in N̂ may readily be controlled to <6%, whether or not the generating model uses a normalized π(x|s).
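A covariate of the kind described can be simulated in several ways. The sketch below uses a Cholesky factor of the exponential covariance matrix on a small grid; this is one simple approach chosen for illustration, not necessarily the method used here or by Royle et al. (2013), and the function name `exp_cov_field` and the 20 × 20 grid size are assumptions. The dense covariance matrix grows with the square of the number of pixels, so larger maps would need a more economical algorithm.

```python
import numpy as np

def exp_cov_field(n_side, w, rng):
    """Draw one standard Gaussian random field on an n_side x n_side pixel grid."""
    rows, cols = np.indices((n_side, n_side))
    xy = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
    # Pairwise distances r_ij between pixel centres
    r = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    V = np.exp(-r / w)                       # exponential covariance, unit variance
    L = np.linalg.cholesky(V)                # V = L L^T
    z = L @ rng.standard_normal(n_side**2)   # z ~ N(0, V)
    return z.reshape(n_side, n_side), V

rng = np.random.default_rng(2013)
z, V = exp_cov_field(20, w=5.0, rng=rng)     # w = 5 pixels, as in the text
```

Drawing a fresh field for each replicate, rather than reusing one realization, averages the bias estimate over landscapes.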

The coefficient α2 varies the effect of the spatial covariate z(x) on λ, and hence on spatial heterogeneity in detection (the mean of λ is also affected by α2, even if z(x) is standardized to mean 0, because the link scale is nonlinear). Given standard Gaussian variation in z(x), bias in N̂ is negligible (<2%) under the null model for α2 < 0·3, and large (>10%) only for α2 > 0·6 (Fig. 2). The bias is negligible even for α2 > 0·6 if z(x) is observed and included in the model. We have almost no information regarding the range of plausible values for α2, and the case against even the null SECR model is therefore weak. In their bear example, Royle et al. (2013) standardized the spatial covariate (presumably to mean 0 and SD 1) and obtained estimates of α2 from capture–recapture data alone that did not differ significantly from zero; including data from three telemetered bears gave an estimate of α2 with SE 0·1.
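The parenthetical remark about the mean of λ can be made concrete for the unnormalized model, assuming z(x) is standard normal. The expectation of a log-normal variable gives:

```latex
\begin{aligned}
\lambda(x \mid s) &= \exp\{\alpha_0 - \alpha_1 d(x,s)^2 + \alpha_2 z(x)\}, \qquad z(x) \sim \mathrm{N}(0, 1), \\
\mathrm{E}_z[\lambda(x \mid s)] &= \exp\{\alpha_0 - \alpha_1 d(x,s)^2\}\,\exp(\alpha_2^2 / 2).
\end{aligned}
```

For α2 = 1 the multiplier exp(α2²/2) is about 1·65, so the average hazard at a given distance rises with α2 even though z(x) has mean zero.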

Royle et al. (2013, p. 528) observed that ‘the effect of resource selection by individuals is to induce heterogeneity in encounter probability as a result of heterogeneity in the landscape’. This is especially the case under their unnormalized model when the habitat covariate is autocorrelated on the home-range scale, as then the detectors near an individual's home-range centroid tend to share either high or low values of the habitat covariate. Simulations (Fig. 2) indicate that when the space use of animals is governed by eqn 1 (including home-range scale normalization), the effect is greatly reduced and possibly even eliminated for moderate values of α2 (α2 ≤ 0·3). The normalized model is a more plausible representation of third-order habitat selection (sensu Johnson 1980) as it varies the absolute use of a resource according to the alternatives available in an animal's home range. The simulations of Royle et al. (2013) do not illustrate an effect of resource selection per se, but rather the potential for heterogeneity and bias under a particular, hypothetical and biologically less plausible notion of ‘selection’.

Figure 2.

Relative bias of estimated population size vs site-specific heterogeneity of detection probability α2 (see text). Simulated data (N = 200) were generated with or without individual-level normalization of space use. The fitted SECR model was either null or included a linear spatial covariate (at the detector location) on the log scale. Sampling design followed simulations in Royle et al. (2013); see also Appendix S1 and S2. Average of 500 simulations; bars indicate 95% CI. Unnormalized space use, null model (◯); unnormalized space use, spatial covariate (●); normalized space use, null model (□); and normalized space use, spatial covariate (■).

Comments on usage-based detection model

Coupling of telemetry data and capture–recapture data via a shared resource selection model rests on the assumption that the cumulative hazard of detection is directly proportional to observed space use. The assumption is often reasonable, but I note that proportionality is not inevitable – it may be upset by any spatial variation in behaviour towards detectors (e.g. avoidance of novel objects distant from home-range centre) or by telemetry error. The Royle et al. (2013) model also assumes that the underlying scale of movement (σ) is constant, when it is more likely that animals respond to varying average resource density by changing their scale of movement. This level of response should be considered in future models. However, variable σ should not be expected to cause additional bias in a normalized model because normalization implies perfect compensation in the sense of Efford & Mowat (in press).

Conclusion

Spatially explicit capture–recapture models rely on the idea that the probability of detection declines with the distance between an animal's home-range centre and a detector. However, this distance is clearly not the only source of variation in detection probability, and existing likelihood-based methods (Borchers & Efford 2008) and software (Efford, Dawson & Robbins 2004; Efford 2013) explicitly allow for covariates of detection at the level of individuals, sampling times and detector locations. Royle et al. (2013) proposed a model in which detection was a joint function of spatial covariates at the location of the animal (the smoothed resource surface) and the location of the detector (the pixel-specific resource), but in their own capture–recapture analyses and code they used only a simpler detector-specific covariate model. Seen in this light, their simulations and others presented here provide support for the continued use of existing models, with spatial covariates included as appropriate, and use of their novel resource selection model with SECR data remains untested.

Acknowledgements

I thank David Borchers and the Associate Editor for their helpful comments.
