Integrating resource selection information with spatial capture–recapture



  1. Understanding space usage and resource selection is a primary focus of many studies of animal populations. Usually, such studies are based on location data obtained from telemetry, and resource selection functions (RSFs) are used for inference. Another important focus of wildlife research is the estimation and modeling of population size and density. Recently developed spatial capture–recapture (SCR) models accomplish this objective using individual encounter history data with auxiliary spatial information on location of capture. SCR models include encounter probability functions that are intuitively related to RSFs, but to date, no one has extended SCR models to allow for explicit inference about space usage and resource selection.
  2. In this paper we develop the first statistical framework for jointly modeling space usage, resource selection, and population density by integrating SCR data, such as from camera traps, mist-nets, or conventional catch traps, with resource selection data from telemetered individuals. We provide a framework for estimation based on marginal likelihood, wherein we estimate simultaneously the parameters of the SCR and RSF models.
  3. Our method leads to increases in precision for estimating parameters of ordinary SCR models. Importantly, we also find that SCR models alone can estimate parameters of RSFs and, as such, SCR methods can be used as the sole source for studying space-usage; however, precision will be higher when telemetry data are available.
  4. Finally, we find that SCR models using standard symmetric and stationary encounter probability models may not fully explain variation in encounter probability due to space usage, and therefore produce biased estimates of density when animal space usage is related to resource selection. Consequently, it is important that space usage be taken into consideration, if possible, in studies focused on estimating density using capture–recapture methods.


Spatial capture–recapture (SCR) models are relatively new methods for inference about population density from capture–recapture data using auxiliary information about individual capture locations (Efford 2004; Borchers & Efford 2008; Royle & Young 2008). SCR models posit that N individuals are located within a region denoted $\mathcal{S}$. Each individual has a home range or activity area within which movement occurs during some well-defined time interval, and the center of the animal's activity has Cartesian coordinates $\mathbf{s}_i$ for individuals i = 1,…,N. The population is sampled using J traps with coordinates $\mathbf{x}_j$ for j = 1,…,J, and encounter probability is expressed as a function of the distance between trap location ($\mathbf{x}_j$) and individual activity or home range center ($\mathbf{s}_i$). While SCR models are a relatively recent innovation, their use is already becoming widespread (Efford et al. 2009; Gardner et al. 2010a, b; Kéry et al. 2011; Gopalaswamy et al. 2012; Foster & Harmsen 2012) because they resolve critical problems with ordinary non-spatial methods such as ill-defined area sampled and heterogeneity in encounter probability due to the juxtaposition of individuals with traps (Borchers 2012).

Despite the increasing popularity of SCR models, every application of them has been based on encounter probability models, such as the bivariate normal distribution, that imply symmetric and stationary (invariant to translation) models for home range. While such simple models might be necessitated in practice by sparse data, home range size and shape are often not well represented by stationary distributions because animals select resources that are unevenly distributed in space. Therefore, more complex models are needed to relate the capture process with the way in which individuals utilize space.

In this paper we extend SCR capture probability models to accommodate models of space usage or resource selection by including one or more explicit landscape covariates, which the investigator believes might affect how individual animals use space within their home ranges [this is what Johnson (1980) called third-order selection]. We do this in a way that is entirely consistent with the manner in which parameters of classical resource selection functions (RSFs; Manly et al. 2002) or utilization distributions (UD; Worton 1989; Fieberg & Kochanny 2005; Fieberg 2007) are estimated from animal telemetry data. In fact, we argue that SCR models and RSF/UD models estimated from telemetry are based on the same basic underlying model of space usage. The important distinctions between SCR and RSF studies are that (i) resource selection studies do not result in estimates of population density because models for telemetry data do not allow for modeling of the encounter process (i.e., the sample size of individuals is fixed); and (ii) in SCR studies, encounter of individuals is imperfect (i.e., ‘p < 1’) whereas, with RSF data obtained by telemetry, encounter is perfect. With respect to the latter point, we can think of the RSF and SCR studies as being exactly equivalent either if we have a dense array of trapping devices, or if our telemetry apparatus samples time or space imperfectly. Then, observed use by individuals of trap locations can be modeled precisely as thinned telemetry data. Thus, the key concept of our paper is that, to integrate SCR and RSF data, we formulate both models in a manner which makes them consistent with respect to some underlying space utilization process. Telemetry data produce direct observations of space usage, and SCR data result from a thinning of such data.

The modeling framework we develop here simultaneously resolves three important problems: (i) it generalizes all existing capture probability models for SCR data to accommodate realistic patterns of space usage that result in asymmetric and irregular home ranges; (ii) it allows estimation of RSF parameters directly from SCR data, i.e., absent telemetry data; and (iii) it provides the basis for integrating telemetry data directly into SCR models to improve estimates of model parameters, including density. Our model greatly expands the applied relevance of SCR methods for conservation and management, and for addressing applied and theoretical questions related to animal space usage and resource selection.

In the following section we provide an introduction to ordinary SCR models, followed by a brief discussion of topics in resource selection models necessary to integrate the two types of models. A formal development of the combined likelihood for SCR and telemetry data is given in Section “The combined RSF/SCR likelihood”, followed by an application of the new model to a study of black bears in southwestern New York, and a brief simulation study to evaluate properties of the parameter estimators based on the combined data. We conclude with some general discussion, and directions for future work.

Spatial capture–recapture

A number of distinct observation models have been proposed for SCR studies (Borchers & Efford 2008; Efford et al. 2009; Royle et al. 2009), including Poisson, multinomial, and binomial observation models. Here we focus on the binomial model in which we suppose that the J traps are operated for K periods (e.g., nights), and the observations are individual (i) and trap (j) specific counts $y_{ij}$, which are binomial with sample size K and encounter probabilities $p_{ij}$ which depend on trap locations $\mathbf{x}_j$ and individual activity centers $\mathbf{s}_i$ as described subsequently. The vector of trap-specific counts for individual i, $\mathbf{y}_i = (y_{i1},\ldots,y_{iJ})$, is its encounter history. A standard encounter probability model (Borchers & Efford 2008) is the Gaussian model in which

$$p_{ij} = p_0 \exp\left(-\frac{d_{ij}^2}{2\sigma^2}\right)$$ (eqn 1)

or, equivalently, $\log(p_{ij}) = \alpha_0 - \alpha_1 d_{ij}^2$, where $d_{ij}$ is the Euclidean distance between points $\mathbf{s}_i$ and $\mathbf{x}_j$,

$$d_{ij} = \|\mathbf{x}_j - \mathbf{s}_i\| = \sqrt{(x_{j1} - s_{i1})^2 + (x_{j2} - s_{i2})^2},$$

$\alpha_0 = \log(p_0)$ and $\alpha_1 = 1/(2\sigma^2)$. A similar model is the Gaussian hazard model:

$$\text{cloglog}(p_{ij}) = \alpha_0 - \alpha_1 d_{ij}^2$$ (eqn 2)

where cloglog(u) =  log (− log (1−u)) is the complementary log-log function. Alternative detection models are used, but all are functions of Euclidean distance and so we do not consider them further here.
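The two encounter models can be compared numerically. The following is a minimal sketch, assuming the usual half-normal form p = p0 exp(−d²/(2σ²)) for eqn 1 and the cloglog form with the same linear predictor for eqn 2; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def gaussian_p(d, p0, sigma):
    """Half-normal (Gaussian) encounter probability at distance d (assumed form of eqn 1)."""
    return p0 * math.exp(-d ** 2 / (2.0 * sigma ** 2))

def hazard_p(d, alpha0, sigma):
    """Gaussian hazard model: invert cloglog(p) = alpha0 - d^2 / (2 sigma^2)."""
    eta = alpha0 - d ** 2 / (2.0 * sigma ** 2)
    return 1.0 - math.exp(-math.exp(eta))

# Hypothetical parameter values, not taken from the paper
p0, sigma = 0.2, 2.0
alpha0 = math.log(p0)
for d in (0.0, 2.0, 4.0):
    print(d, round(gaussian_p(d, p0, sigma), 4), round(hazard_p(d, alpha0, sigma), 4))
```

Both functions decline with distance at a rate set by σ; they differ mainly when the baseline rate is large, because the hazard form is bounded by 1 for any value of alpha0.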

The primary motivation behind our work is that, in all previous applications of SCR models, simple encounter probability models based only on Euclidean distance have been used, with estimation based on standard likelihood or Bayesian methods. These methods regard the activity center for each individual i, $\mathbf{s}_i$, as latent variables and remove them from the likelihood either under a model of ‘uniformity’ in which $\mathbf{s}_i \sim \text{Uniform}(\mathcal{S})$ where $\mathcal{S}$ is a spatial region (the ‘state-space’ of s), or a model in which covariates might affect the spatial distribution of individuals (Borchers & Efford 2008). The state-space $\mathcal{S}$ defines the potential values for any activity center s, e.g., a polygon defining available habitat or range of the species under study.

A potential shortcoming of standard SCR models is that the encounter probability model based on Euclidean distance is unaffected by habitat or landscape structure, and it implies that the space used by individuals is stationary and symmetric, which may be unreasonable in many applications. For example, if the common detection model based on a bivariate normal probability distribution function is used, then the implied space usage by all individuals, no matter their location in space or local habitat conditions, is symmetric with circular contours of usage intensity. Subsequently we provide an extension of this class of SCR models that accommodates asymmetric, irregular and spatially heterogeneous models of space usage. Thus, ‘where’ an individual lives on the landscape, and the state of the surrounding landscape, will determine the character of its usage of space. In particular, we suggest encounter probability models that imply irregular, asymmetric and non-stationary home ranges of individuals and that are sensitive to the local landscape in the vicinity of an individual's activity center.

Basic model of space usage

We develop the model here in terms of a discrete landscape purely for computational expediency. This formulation will accommodate the vast majority of actual data sets, as almost all habitat or landscape structure data come to us in the form of raster data. Let $\mathbf{x}_1,\ldots,\mathbf{x}_{nG}$ identify the center coordinates of a set of nG pixels that define a landscape. In SCR studies, a subset of the coordinates x will correspond to trap locations where we might observe individuals whereas, in telemetry studies, animals are observable (by telemetry fixes) at potentially all coordinates.

We will define ‘use of x’ to be the event that an individual animal appeared in some pixel x at any instant in time. As a biological matter, use is the outcome of individuals moving within their home ranges (Aarts et al. 2008; Johnson et al. 2008; Forester et al. 2009; Hooten et al. 2010), i.e., where an individual is at any point in time is the result of some movement process. However, to understand space usage, it is not necessary to entertain explicit models of movement, just to observe the outcomes, and so we don't elaborate further on what could be sensible or useful models of movement.

Let z(x) denote a covariate, such as elevation or habitat type, measured (or defined) for every pixel x. For clarity, we develop the basic ideas here in terms of a single covariate but, in practice, investigators typically have more than one covariate, which poses no additional problems. Suppose that an individual is monitored over some period of time and a fixed number, say R, of use observations are recorded. Let M(x) be the use frequency of pixel x for that individual, i.e., the number of times that individual used pixel x during some period of time. We assume the following probability distribution for the nG×1 vector of use frequencies:

$$\mathbf{M} \sim \text{Multinomial}(R, \boldsymbol{\pi})$$

where π is the nG×1 vector of use probabilities with elements (for each pixel):

$$\pi(\mathbf{x}) = \frac{\exp(\alpha_2 z(\mathbf{x}))}{\sum_{\mathbf{x}'} \exp(\alpha_2 z(\mathbf{x}'))}$$ (eqn 3)

This is the standard RSF model (Manly et al. 2002) used to model telemetry data. The parameter $\alpha_2$ is the effect of the landscape covariate z(x) on the relative probability of use. Thus, if $\alpha_2$ is positive, the relative probability of use increases as the value of the covariate increases. In general, if $\alpha_2 \neq 0$ then use is ‘selective’ because z(x) is used disproportionately to its availability. We define availability below, but first we note that, in practice, we don't get to observe {M(x)} for all individuals but, instead, only for a small subset, say $i = 1,\ldots,N_{tel}$, which we capture and install telemetry devices on. For the telemetered individuals, we assume they behave according to the same RSF model as the population as a whole, which might be justified if individuals are randomly sampled from the population.
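The RSF use probabilities and the multinomial use frequencies can be illustrated with a short sketch. It assumes the exponential RSF form, in which the probability of using a pixel is proportional to exp(coefficient × z(x)); the covariate values, the coefficient (called alpha2 here), and the number of fixes R are hypothetical.

```python
import math
import random

def rsf_probs(z, alpha2):
    """Use probabilities proportional to exp(alpha2 * z) over all pixels."""
    w = [math.exp(alpha2 * zi) for zi in z]
    total = sum(w)
    return [wi / total for wi in w]

z = [-1.0, 0.0, 1.0, 2.0]          # hypothetical covariate values for four pixels
pi = rsf_probs(z, alpha2=1.0)      # hypothetical selection coefficient

# Draw R use observations and tabulate the use frequency M(x) of each pixel
rng = random.Random(1)
R = 100
fixes = rng.choices(range(len(z)), weights=pi, k=R)
M = [fixes.count(g) for g in range(len(z))]
print([round(p, 3) for p in pi], M)
```

Setting alpha2 to 0 makes every pixel equally likely, which is the ‘no selection’ case in which use is proportional to availability.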

An animal will typically not be capable of using all the resources (pixels) in the landscape, and since we cannot know what is available a priori, we need to extend eqn 3 so that availability can be estimated. Let s denote the centroid of an individual's home range and let d(x,s) = ‖x − s‖ be the distance from the activity center s of some individual to pixel x, and let M(x,s) denote the use frequency of pixel x for an individual with activity center s. We modify the space usage model to accommodate that space use will be concentrated around an individual's activity center (Johnson et al. 2008; Forester et al. 2009):

$$\pi(\mathbf{x}|\mathbf{s}) = \frac{\exp(-\alpha_1 d(\mathbf{x},\mathbf{s})^2 + \alpha_2 z(\mathbf{x}))}{\sum_{\mathbf{x}'} \exp(-\alpha_1 d(\mathbf{x}',\mathbf{s})^2 + \alpha_2 z(\mathbf{x}'))}$$ (eqn 4)

where $\alpha_1 = 1/(2\sigma^2)$ describes the rate at which encounter probability declines as a function of distance, d(x,s). In terms of availability, if $\alpha_1 = 0$, then all pixels in the landscape are available, and as $\alpha_1$ increases home range size decreases such that only the pixels close to the home range center are available. From ordinary telemetry data, it would be possible to estimate parameters $\alpha_1$, $\alpha_2$ and also the activity centers s using standard likelihood methods based on the multinomial likelihood (Johnson et al. 2008). We note that this form (eqn 4) arises explicitly as a limiting form of the Gaussian process movement model of Johnson et al. (2008). Conceptually, it recognizes that apparent space usage is a product of both availability (governed here by a Gaussian kernel) and use conditional on availability. Thus, the marginal probability of use is a product of the two components (Johnson et al. 2008; Forester et al. 2009).
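The effect of combining a distance kernel with a covariate term can be seen on a small grid. This is a sketch only: it assumes use probabilities proportional to exp(−d²/(2σ²) + alpha2 × z(x)), and the 5 × 5 landscape, the covariate, and the parameter values are hypothetical.

```python
import math

def use_probs(pixels, z, s, sigma, alpha2):
    """Use probabilities for an animal centered at s: Gaussian kernel times RSF term."""
    alpha1 = 1.0 / (2.0 * sigma ** 2)
    w = [math.exp(-alpha1 * ((x - s[0]) ** 2 + (y - s[1]) ** 2) + alpha2 * zi)
         for (x, y), zi in zip(pixels, z)]
    total = sum(w)                      # normalizing constant over the landscape
    return [wi / total for wi in w]

pixels = [(x, y) for x in range(5) for y in range(5)]   # hypothetical 5 x 5 raster
z = [0.5 * x for (x, y) in pixels]                      # covariate increasing eastward
s = (2, 2)

sym = use_probs(pixels, z, s, sigma=1.0, alpha2=0.0)    # no selection: symmetric kernel
sel = use_probs(pixels, z, s, sigma=1.0, alpha2=1.0)    # selection for high z

west, east = pixels.index((1, 2)), pixels.index((3, 2))
print(round(sym[west], 4), round(sym[east], 4))
print(round(sel[west], 4), round(sel[east], 4))
```

With alpha2 = 0 the pixels one step west and east of the activity center are used equally; with a positive coefficient the eastern pixel is used more, even though both are equally distant.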

Note that eqn 4 resembles standard encounter models used in SCR but with an additional covariate z(x). The main difference between this observation model and the standard SCR model is that the model here includes the normalizing constant $\sum_{\mathbf{x}'} \exp(-\alpha_1 d(\mathbf{x}',\mathbf{s})^2 + \alpha_2 z(\mathbf{x}'))$, which ensures that the use distribution is a proper probability distribution. Thus we are able to characterize the probability of encounter in terms of both distance from activity center and space use. Note that, under this model for space usage or resource selection, if there are no covariates, or if $\alpha_2 = 0$, then the probabilities π(x|s) are directly proportional to the SCR model for encounter probability. For example, setting $\alpha_2 = 0$ implies that the probability of use for pixel x is:

$$\pi(\mathbf{x}|\mathbf{s}) \propto \exp(-\alpha_1 d(\mathbf{x},\mathbf{s})^2)$$

Therefore, for whatever model we choose for p(x,s) in an ordinary SCR model, we can modify the distance component in the RSF function in eqn 4 accordingly to be consistent with that model, by choosing π(x|s) according to

$$\pi(\mathbf{x}|\mathbf{s}) \propto \exp\{\log p(\mathbf{x},\mathbf{s}) + \alpha_2 z(\mathbf{x})\}$$

As an illustration of space usage patterns under this model, we simulated a covariate that represents variation in habitat structure (Fig. 1) such as might correspond to habitat quality. This was simulated by using a kriging interpolator of spatial noise. Space usage patterns for eight individuals in this landscape are shown in Fig. 2, simulated with $\alpha_1 = 1/(2\sigma^2)$, σ = 2, and a fixed nonzero coefficient $\alpha_2$ on z(x). These space usage densities – ‘home ranges’ – exhibit clear non-stationarity in response to the structure of the underlying covariate, and they are distinctly asymmetrical. We note that if $\alpha_2$ were set to 0, the eight home ranges shown here would resemble bivariate normal kernels with σ = 2.

Figure 1.

A typical habitat covariate reflecting habitat quality or hypothetical utility of the landscape to a species under study. Activity centers for eight individuals are shown with black dots.

Figure 2.

Space usage patterns of eight individuals under a space usage model that contains a single covariate (shown in Fig. 1). Plotted value is the multinomial probability $\pi_{ij}$ for pixel j under the model in eqn 4.

Another interesting thing to note is that the activity centers are not typically located in the pixel of highest use or even the centroid of usage. That is, the observed ‘average’ location is not an unbiased estimator of s under the model in eqn 4. This occurs when (and, in this case, because) second-order selection – the processes determining where activity centers are located on the landscape – is independent of third-order selection, which we recognize may not be the case in practice. In general, we could fit a model in which Pr(s) is also affected by landscape covariates z(x) (Borchers & Efford 2008). This generality is tangential to our methodological focus, although we fit such a model in section “Application: New York Black Bear Study”.
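The gap between the activity center and the centroid of use is easy to reproduce. The sketch below assumes the same kernel-times-covariate form of the use distribution discussed above and a hypothetical 9 × 9 landscape whose covariate increases eastward; none of the values come from the paper.

```python
import math

def use_probs(pixels, z, s, sigma, alpha2):
    """Use probabilities proportional to exp(-d^2 / (2 sigma^2) + alpha2 * z)."""
    alpha1 = 1.0 / (2.0 * sigma ** 2)
    w = [math.exp(-alpha1 * ((x - s[0]) ** 2 + (y - s[1]) ** 2) + alpha2 * zi)
         for (x, y), zi in zip(pixels, z)]
    total = sum(w)
    return [wi / total for wi in w]

def centroid(pixels, pi):
    """Expected location under the use distribution (the long-run average fix)."""
    cx = sum(p * x for p, (x, y) in zip(pi, pixels))
    cy = sum(p * y for p, (x, y) in zip(pi, pixels))
    return cx, cy

pixels = [(x, y) for x in range(9) for y in range(9)]   # hypothetical raster
z = [0.5 * x for (x, y) in pixels]                      # hypothetical covariate
s = (4, 4)                                              # true activity center

flat = centroid(pixels, use_probs(pixels, z, s, sigma=1.0, alpha2=0.0))
tilted = centroid(pixels, use_probs(pixels, z, s, sigma=1.0, alpha2=1.0))
print(flat, tilted)
```

Without selection the centroid coincides with s; with selection for the covariate the centroid is pulled toward higher covariate values, so averaging telemetry fixes would misplace the activity center.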

Poisson use model

A natural way to motivate this specific model of space usage is to assume that individuals make a sequence of random resource selection decisions so that the outcomes M(x) (for all x) are marginally independent Poisson random variables:

$$M(\mathbf{x}) \mid \mathbf{s} \sim \text{Poisson}(\lambda(\mathbf{x}|\mathbf{s}))$$

$$\log \lambda(\mathbf{x}|\mathbf{s}) = a_0 - \alpha_1 d(\mathbf{x},\mathbf{s})^2 + \alpha_2 z(\mathbf{x})$$

In this case, the number of visits to any particular cell is affected by the covariate z(x) but has a baseline rate ($\exp(a_0)$) related to the amount of movement occurring over some time interval. This is an equivalent model to the multinomial model (eqn 4) in the sense that, if we condition on the total sample size $R = \sum_{\mathbf{x}} M(\mathbf{x})$, then the vector of use frequencies {M(x)} for an individual with activity center s has a multinomial distribution with probabilities

$$\pi(\mathbf{x}|\mathbf{s}) = \frac{\lambda(\mathbf{x}|\mathbf{s})}{\sum_{\mathbf{x}'} \lambda(\mathbf{x}'|\mathbf{s})}$$

which is the same as eqn 4 because $\exp(a_0)$ cancels from the numerator and denominator of the multinomial cell probabilities and thus this parameter is not relevant to understanding space usage. The Poisson formulation also emphasizes that the observed use frequencies are the product of two components related to availability (the distance term) and use conditional on availability [the term related to z(x); Forester et al. 2009]. Note that if use frequencies are summarized over the $N_{tel}$ telemetered individuals for each pixel, then a standard Poisson regression model for the resulting ‘quadrat counts’ is reasonable. This is the model underlying ‘Design 1’ in Manly et al. (2002).
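The cancellation of the baseline rate can be checked numerically. This sketch assumes a log-linear Poisson mean with an intercept (called a0 here), a squared-distance term, and a covariate term; the squared distances, covariate values, and coefficients are hypothetical.

```python
import math

def poisson_means(d2, z, a0, alpha1, alpha2):
    """Expected use frequency of each pixel under the log-linear Poisson model."""
    return [math.exp(a0 - alpha1 * d + alpha2 * zi) for d, zi in zip(d2, z)]

def normalize(v):
    total = sum(v)
    return [x / total for x in v]

d2 = [0.0, 1.0, 4.0, 9.0]       # hypothetical squared distances from the activity center
z = [0.3, -0.2, 1.1, 0.5]       # hypothetical covariate values

lam_low = poisson_means(d2, z, a0=-1.0, alpha1=0.125, alpha2=0.8)
lam_high = poisson_means(d2, z, a0=2.0, alpha1=0.125, alpha2=0.8)

# Conditioning on the total count gives multinomial cell probabilities
pi_low, pi_high = normalize(lam_low), normalize(lam_high)
print([round(p, 4) for p in pi_low])
print([round(p, 4) for p in pi_high])
```

The two sets of cell probabilities agree even though the expected number of visits differs by a factor of exp(3), which is why the baseline rate carries no information about space usage.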

Random thinning

Suppose our sampling is imperfect so that we only observe a smaller number of telemetry fixes than actual use frequencies. We express this sampling (or ‘thinning’) by assuming the observed number of uses is a binomial random variable based on a sample of size M(x):

$$m(\mathbf{x}) \mid M(\mathbf{x}) \sim \text{Binomial}(M(\mathbf{x}), \phi_0)$$

where $\phi_0$ represents the thinning rate, essentially the sampling effectiveness of the device. Then, the marginal distribution of the new random variable m(x) is Poisson but with mean

$$E[m(\mathbf{x}) \mid \mathbf{s}] = \phi_0 \lambda(\mathbf{x}|\mathbf{s}) = \exp(\log \phi_0 + a_0 - \alpha_1 d(\mathbf{x},\mathbf{s})^2 + \alpha_2 z(\mathbf{x}))$$

Thus, the space-usage model (RSF) for the thinned counts m(x) is the same as the space-usage model for the original variables M(x). This is because if we remove M(x) from the conditional model by summing over its possible values, then the vector of thinned use frequencies m (i.e., for all pixels) is also multinomial with cell probabilities

$$\pi(\mathbf{x}|\mathbf{s}) = \frac{\phi_0 \lambda(\mathbf{x}|\mathbf{s})}{\sum_{\mathbf{x}'} \phi_0 \lambda(\mathbf{x}'|\mathbf{s})}$$

and so the constants $\phi_0$ and $\exp(a_0)$ cancel from both the numerator and denominator. Thus, the underlying RSF model applies to the true unobserved count frequencies M and also those produced by a random thinning or sampling process, m.
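The statement that binomial thinning of a Poisson count yields another Poisson count is a standard result, and the missing step is short. In the derivation below, λ denotes the Poisson mean of M(x) and φ₀ the thinning probability; both symbols are notation assumed here for illustration.

```latex
\begin{aligned}
\Pr(m) &= \sum_{M=m}^{\infty} \frac{e^{-\lambda}\lambda^{M}}{M!}\binom{M}{m}\phi_0^{m}(1-\phi_0)^{M-m} \\
       &= \frac{e^{-\lambda}(\phi_0\lambda)^{m}}{m!}\sum_{k=0}^{\infty}\frac{[\lambda(1-\phi_0)]^{k}}{k!} \\
       &= \frac{e^{-\lambda}(\phi_0\lambda)^{m}}{m!}\,e^{\lambda(1-\phi_0)}
        = \frac{e^{-\phi_0\lambda}(\phi_0\lambda)^{m}}{m!}
\end{aligned}
```

The second line substitutes k = M − m, so the thinned count is Poisson with mean φ₀λ, and the thinning probability enters only as a multiplicative constant on every pixel.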

In summary, if we conduct a telemetry study of $N_{tel}$ individuals, the observed data are the nG×1 vectors of use frequencies for each individual. We declare these data to be ‘resource-selection data’ which are typical of the type used to estimate resource-selection functions (RSFs; Manly et al. 2002). In fact, the situation we have described here, in which we obtain a random sample of use locations and a complete census of available locations, is referred to as ‘Design 2’ by Manly et al. (2002).

Resource selection in SCR models

The key to combining RSF data with SCR data is to work with this underlying resource utilization process and formulate SCR models in terms of that process. Imagine that we have a sampling device, such as a camera trap, in every pixel. If the device operates continually then it is no different from a telemetry instrument. If it operates intermittently or does not expose the entire area of each pixel then a reasonable model for this imperfect observation is the “thinned” binomial model given above. For data that arise from SCR studies, the frequency of use for each pixel where a trap is located serves as an intermediate latent variable that we don't observe. From a design standpoint, the main difference between SCR studies and telemetry is that, for SCR data, we do not have sampling devices in all locations (pixels) in the landscape. Rather, the data are only recorded at a subsample of them, the trap locations, which we identify by the specific coordinates $\mathbf{x}_1,\ldots,\mathbf{x}_J$.

So we imagine that the hypothetical perfect data from a camera trapping study are the counts m(x) only at the specific trap locations $\mathbf{x}_j$, and for all individuals in the population i = 1,2,…,N (usually math formula). We denote the individual- and trap-specific counts by $m_{ij}$ for the individual with activity center $\mathbf{s}_i$ and trap location $\mathbf{x}_j$. In practice, many (perhaps most) of the $m_{ij}$ frequencies will be 0, corresponding to individuals not captured in certain traps. We then construct our SCR encounter probability model based on the view that these frequencies $m_{ij}$ are latent variables. In particular, under the SCR model with binary observations, we observe a random variable $y_{ij} = 1$ if the individual i visited the pixel containing trap j and was detected. We imagine that the observable variable $y_{ij}$ is related to the latent variable $m_{ij}$, being the event $m_{ij} > 0$. Therefore, with $\phi_0$ the thinning rate and $\lambda(\mathbf{x}_j|\mathbf{s}_i)$ the Poisson mean use frequency of the trap pixel,

$$\Pr(y_{ij} = 1 \mid \mathbf{s}_i) = \Pr(m_{ij} > 0 \mid \mathbf{s}_i)$$

$$= 1 - \exp(-\phi_0 \lambda(\mathbf{x}_j|\mathbf{s}_i))$$

This is the complementary log–log link relating $p_{ij}$ to $\log \lambda(\mathbf{x}_j|\mathbf{s}_i)$, setting $p_{ij} = \Pr(y_{ij} = 1 \mid \mathbf{s}_i)$:

$$\text{cloglog}(p_{ij}) = \log(\phi_0) + a_0 - \alpha_1 d(\mathbf{x}_j,\mathbf{s}_i)^2 + \alpha_2 z(\mathbf{x}_j)$$

$$= \alpha_0 - \alpha_1 d(\mathbf{x}_j,\mathbf{s}_i)^2 + \alpha_2 z(\mathbf{x}_j)$$

and we collect the constants so that $\alpha_0 = \log(\phi_0) + a_0$ is the baseline encounter rate which includes the constant intensity of use by the individual and also the baseline rate of detection, conditional on use.
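The link between the latent thinned counts and the binary encounter can be verified with a few lines of arithmetic. The sketch assumes a thinned Poisson count with mean equal to a thinning probability times a log-linear use rate; the names phi0 and a0 and all numeric values are hypothetical.

```python
import math

def cloglog(p):
    """Complementary log-log transform."""
    return math.log(-math.log(1.0 - p))

# Hypothetical values for one individual-trap pair
phi0, a0 = 0.3, 1.0              # thinning probability and baseline log use rate
alpha1, alpha2 = 0.125, -0.5     # distance and covariate coefficients
d2, z = 4.0, 0.8                 # squared distance to the trap and covariate at the trap

lam = math.exp(a0 - alpha1 * d2 + alpha2 * z)    # mean use frequency of the trap pixel
p = 1.0 - math.exp(-phi0 * lam)                  # Pr(at least one thinned use)

alpha0 = math.log(phi0) + a0                     # constants collected into one intercept
lhs = cloglog(p)
rhs = alpha0 - alpha1 * d2 + alpha2 * z
print(round(p, 4), round(lhs, 6), round(rhs, 6))
```

Because only the sum log(phi0) + a0 appears, SCR data alone cannot separate the intensity of use from the detection rate conditional on use.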

The combined RSF/SCR likelihood

To construct the likelihood for SCR data when we have auxiliary covariates on space usage or direct information on space usage from telemetry data, we regard the two samples (SCR and RSF) as independent of one another. In practice, this might not always be the case but it may be reasonable when the telemetry data come from a previous study or if telemetered individuals cannot be recognized. In cases where we can match some individuals between the two samples, regarding them as independent should only entail a minor loss of efficiency because we are disregarding more precise information on a small number of activity centers. Moreover, we believe that in practice the two samples will rarely be completely reconcilable, so the independence formulation is the most realistic.

Regarding the two data sets as being independent, our approach here is to form the likelihood for each set of observations as a function of the same underlying parameters and then combine them. In particular, let $L_{scr}(\alpha_0, \alpha_1, \alpha_2, N)$ be the likelihood for the SCR data in terms of the basic encounter probability parameters and the total (unknown) population size N, and let $L_{rsf}(\alpha_1, \alpha_2)$ be the likelihood for the RSF data based on telemetry which, because the sample size of such individuals is fixed, does not depend on N. Assuming independence of the two datasets, the likelihood is the product of these two pieces:

$$L_{rsf+scr}(\alpha_0, \alpha_1, \alpha_2, N) = L_{scr}(\alpha_0, \alpha_1, \alpha_2, N) \times L_{rsf}(\alpha_1, \alpha_2)$$

Additional technical details for formulating each component of the likelihood are given in Appendix S1, Supporting information. An R (R Development Core Team 2006) function for obtaining the maximum likelihood estimates (MLEs) of model parameters is given in Appendix S2, Supporting information.

We note that we adopt the ‘binomial form’ of the likelihood which contains population size N as an explicit parameter, which applies to the landscape defined by $\mathcal{S}$. Given $\mathcal{S}$, density is computed as $D = N/\text{area}(\mathcal{S})$. In our simulation study below we report N, as the two are equivalent summaries of the data once $\mathcal{S}$ is defined. Borchers & Efford (2008) develop a likelihood based on a further level of marginalization, in which N is removed from the likelihood by averaging over a Poisson prior for N.
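The product structure of the combined likelihood described in this section can be sketched in code. This is an illustration under stated assumptions, not the function from Appendix S2: activity centers are marginalized over a uniform prior on pixel centers, the SCR part uses the binomial form with a cloglog encounter model, multinomial coefficients and the n! constant are dropped, and all names and toy data are hypothetical.

```python
import math

def logsumexp(v):
    m = max(v)
    return m + math.log(sum(math.exp(x - m) for x in v))

def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def log_binom(y, K, p):
    return (math.lgamma(K + 1) - math.lgamma(y + 1) - math.lgamma(K - y + 1)
            + y * math.log(p) + (K - y) * math.log1p(-p))

def scr_loglik(theta, Y, K, traps, pixels, z):
    alpha0, log_sigma, alpha2, log_n0 = theta
    alpha1 = 1.0 / (2.0 * math.exp(log_sigma) ** 2)
    n, n0 = len(Y), math.exp(log_n0)          # captured and uncaptured individuals
    P = []                                     # P[g][j]: center at pixel g, trap j
    for s in pixels:
        row = []
        for j in traps:
            eta = alpha0 - alpha1 * dist2(pixels[j], s) + alpha2 * z[j]
            p = -math.expm1(-math.exp(eta))    # inverse cloglog
            row.append(min(max(p, 1e-12), 1.0 - 1e-12))
        P.append(row)
    log_prior = -math.log(len(pixels))         # uniform prior on s

    def marginal(y):                           # log of history probability, s summed out
        return logsumexp([log_prior + sum(log_binom(yj, K, pj) for yj, pj in zip(y, row))
                          for row in P])

    ll = math.lgamma(n + n0 + 1.0) - math.lgamma(n0 + 1.0)
    ll += sum(marginal(y) for y in Y)
    return ll + n0 * marginal([0] * len(traps))

def rsf_loglik(theta, M, pixels, z):
    _, log_sigma, alpha2, _ = theta            # alpha0 and N play no role here
    alpha1 = 1.0 / (2.0 * math.exp(log_sigma) ** 2)
    log_prior = -math.log(len(pixels))
    ll = 0.0
    for m in M:                                # m: use frequency per pixel, one animal
        terms = []
        for s in pixels:
            eta = [-alpha1 * dist2(x, s) + alpha2 * zx for x, zx in zip(pixels, z)]
            log_norm = logsumexp(eta)
            terms.append(log_prior + sum(mx * (e - log_norm) for mx, e in zip(m, eta)))
        ll += logsumexp(terms)
    return ll

def joint_nll(theta, Y, K, traps, M, pixels, z):
    """Negative log of the product of the SCR and RSF likelihoods."""
    return -(scr_loglik(theta, Y, K, traps, pixels, z) + rsf_loglik(theta, M, pixels, z))

# Toy data on a hypothetical 4 x 4 landscape; traps are given as pixel indices
pixels = [(x, y) for x in range(4) for y in range(4)]
z = [0.1 * (x - y) for (x, y) in pixels]
traps = [0, 5, 10, 15]
Y, K = [[1, 0, 0, 0], [0, 2, 1, 0]], 3
M = [[0] * 16]
M[0][5], M[0][6] = 3, 2
theta = (-1.0, 0.0, 0.5, math.log(5.0))        # alpha0, log(sigma), alpha2, log(n0)
print(joint_nll(theta, Y, K, traps, M, pixels, z))
```

In use, `joint_nll` would be handed to a numerical optimizer; parameterizing population size through the log of the number of uncaptured individuals keeps N above the number captured.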

Application: New York black bear study

We applied our integrated SCR+RSF model to data from a study of black bears in a region of c. 4600 km² in southwestern New York (Sun et al., in preparation). The focus of the study was to (i) understand recent population growth and range expansion by black bears into areas with higher levels of human density and agriculture than the traditional range, and (ii) evaluate resource selection and movements in relation to landscape characteristics. A noninvasive, genetic, mark-recapture study was conducted to estimate density, and a concurrent telemetry study was conducted in the same study area to understand patterns of landscape connectivity and space usage.

Hair snares (n = 103) were set from 6 June to 9 July, 2011 and were used to obtain DNA to identify individual bears. Hair snares consisted of a strand of barbed wire c. 30 m in length set around trees at 50 cm above flat ground, in forested areas (Woods et al. 1989), distributed throughout the study area (Fig. 3). Hair snares were baited and scented and checked weekly for hair. Hair samples were genetically analyzed using seven microsatellite markers. See Sun (in preparation) for details of the genetic analysis. The study yielded captures of 33 individuals and a total of 14 recaptures (27 individuals captured one time only; three individuals captured twice; one individual each three and four times). Extra trap recaptures included three individuals captured in two traps, one individual in each of three and four traps. We used data from three radio-telemetered individual bears (2M, 1F) from the same time period as the SCR data. Radio fixes were obtained approximately once per hour and a total of 1948 fixes on the three individuals were obtained. We thinned these hourly fixes to once per 10 h to approximate the data as independent selection decisions, producing 195 telemetry locations used in the RSF component of the model. We note that the model does not require that locations in geographic space be independent, but rather that the selection outcomes are independent draws from the assumed multinomial use distribution. We imagine that active selection should generally result in the appearance of dependent movement. We used the covariate elevation in the model, derived from a one arc-second digital elevation model (USGS National Elevation Dataset, accessed June 2012). This is shown in Fig. 4 (on a standardized scale) which also shows the locations of each capture (multiple captures at a trap location are dithered by adding random noise).
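The thinning of the telemetry record described above amounts to keeping every tenth fix. As a minimal sketch, with integer indices standing in for the actual fixes (which are not reproduced here):

```python
# Placeholder indices for the 1948 approximately hourly fixes
fixes = list(range(1948))

# Keep one fix in every ten, i.e. roughly one location per 10 h
thinned = fixes[::10]
print(len(thinned))  # 195
```

The count matches the 195 telemetry locations used in the RSF component; how the retained fixes were divided among the three bears is not stated in the text.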

Figure 3.

Black bear study area in southwest New York, USA (left panel). Locations of 103 hair snares (right panel).

Figure 4.

Elevation (standardized) and location of bear captures (multiple captures at same trap off-set by random noise).

We fitted a sequence of models based on the Gaussian hazard model (eqn 2), including an ordinary SCR model with no covariates or telemetry data, the SCR model with elevation affecting either encounter probability or density D(x), and models that use telemetry data. We have not discussed modeling covariate effects on density, but such models are described by Borchers & Efford (2008) and we have not provided any novel treatment of that modeling aspect here. The full list of models is as follows:

  • Model 1: SCR – ordinary SCR model
  • Model 2: SCR+p(z) – ordinary SCR model with elevation as a covariate on baseline encounter probability.
  • Model 3: SCR+D(z) – ordinary SCR model with elevation as a covariate on density only.
  • Model 4: SCR+p(z)+D(z) – ordinary SCR model with elevation as a covariate on both baseline encounter probability and density.
  • Model 5: SCR+p(z)+RSF – SCR model including data from 3 telemetered individuals.
  • Model 6: SCR+p(z)+RSF+D(z) – SCR model including telemetered individuals and with elevation as a covariate on density.

The first four models can be compared with one another by AIC because they are fitted to the same SCR data. The last two models can be compared with each other but not with the first four, because they include telemetry data. The results of fitting these six models (parameter estimates and standard errors) are shown in Table 1. We provide a full R script for fitting all of these models to simulated data in Appendices S1–S3, Supporting information.

Table 1. Summary of model-fitting results for the black bear study. Parameter estimates are: $\alpha_0$, the intercept of the encounter probability model, and σ, the scale parameter of the half-normal hazard rate encounter model. The SCR data are based on n = 33 individuals, and the telemetry data are based on three individuals. Standard errors (SE) for each estimate are given in parentheses; n/a marks an empty cell.

Model      $\alpha_0$   log(σ)      $\alpha_2$   $\log(n_0)$   β          −loglik
SCR+p(z)   −2·8600      −1·1170     0·1750       4·1400        n/a        122·7380
SCR        −2·7290      −1·1220     n/a          4·1100        n/a        122·9900
SE         (0·3454)     (0·1404)    n/a          (0·3618)      n/a        n/a
SE         (0·3526)     (0·1394)    n/a          (0·3575)      (0·4083)   n/a
SCR+RSF    −3·0680      −0·8140     −0·2810      3·8840        n/a        1271·7390
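As a worked example of the AIC comparison, the sketch below uses the two negative log-likelihoods that can be read from Table 1 for the models without telemetry data. The parameter counts are an assumption made here (intercept, log σ, and the log number of uncaptured individuals for SCR, plus the elevation coefficient for SCR+p(z)); they are not stated in the table.

```python
def aic(neg_loglik, n_params):
    """Akaike information criterion from a negative log-likelihood."""
    return 2.0 * neg_loglik + 2.0 * n_params

# (negative log-likelihood as read from Table 1, assumed number of parameters)
models = {
    "SCR": (122.9900, 3),
    "SCR+p(z)": (122.7380, 4),
}

for name, (nll, k) in models.items():
    print(name, round(aic(nll, k), 3))
```

Under these assumed counts the elevation effect on encounter probability alone lowers the negative log-likelihood by about 0·25, which does not offset the penalty for the extra parameter.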

Among models 1–4, the models without telemetry data, we see that the two models with elevation affecting density are preferred, and there is a large positive response to elevation. This is consistent with the visual pattern apparent in Fig. 4, where we see individual captures favoring high elevation sites. We also see a negative effect of elevation on space usage (the parameter $\alpha_2$). It is interesting that the sign of the estimate of $\alpha_2$ changes from positive to negative when we add elevation as a covariate on density. Thus, the effect of elevation on density appears to have masked its effect on space usage. The estimate of N for the 4600 km² state-space is about 103 bears (exp(4·26) + 33).
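The arithmetic behind the population estimate, and the corresponding average density, can be reproduced directly. The sketch assumes that 4·26 is the estimated log of the number of uncaptured individuals and that the state-space area is the c. 4600 km² quoted for the study region.

```python
import math

n_captured = 33
log_n0 = 4.26                 # assumed: log of the number of uncaptured bears
area_km2 = 4600.0             # approximate state-space area from the text

N_hat = math.exp(log_n0) + n_captured
density_per_100km2 = N_hat / area_km2 * 100.0
print(round(N_hat, 1), round(density_per_100km2, 2))
```

This gives an average of a little over two bears per 100 km² across the state-space; the density map in Fig. 5 redistributes that average according to elevation.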

In the two models that include the additional telemetry data, a couple of points stand out. Clearly, the elevation effect on density is important, reducing the negative log-likelihood by 5 units. The effects of elevation on density and space usage are roughly consistent with Model 4, which did not use telemetry data. Furthermore, the standard errors (SE) of those two parameter estimates are reduced considerably when the model uses telemetry data, as is the SE for estimating  log (σ). The SE for estimating math formula is only improved incrementally compared to the models without telemetry data. We used the best model, SCR+p(z)+RSF+D(z), to produce a map of density (Fig. 5) which shows clearly the pattern induced by elevation.

Figure 5.

Predicted density of black bears (per 100 km²) in southwestern New York study area.

We also produced a map (Fig. 6) to illustrate the effect of elevation on space usage. This shows the probability of using a pixel x relative to one of mean elevation, and of the same distance from an individual's activity center.
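The quantity mapped in Fig. 6 can be written down explicitly. At a fixed distance from the activity center the distance terms cancel, so the probability of use relative to a pixel of mean elevation depends only on the covariate. The sketch assumes elevation is standardized to mean 0 and, purely for illustration, uses the coefficient −0·2810 read from the SCR+RSF row of Table 1 rather than the estimate from the best model.

```python
import math

def relative_use(z, alpha2, z_ref=0.0):
    """Use probability of a pixel with covariate z relative to one with z_ref,
    both at the same distance from the activity center."""
    return math.exp(alpha2 * (z - z_ref))

alpha2 = -0.2810   # illustrative value taken from Table 1 (SCR+RSF row)
for z in (-1.0, 0.0, 1.0):
    print(z, round(relative_use(z, alpha2), 3))
```

With a negative coefficient, pixels above mean elevation are used less than a mean-elevation pixel at the same distance, and pixels below it are used more.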

Figure 6.

Relative probability of use of pixel x compared to a pixel of mean elevation, at a constant distance from the activity center.
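A map of this kind can be sketched as follows. The sketch assumes that the probability of use is proportional to exp(alpha2 * z(x) − d²/(2σ²)), where d is the distance from the activity center, so that the distance term cancels when two pixels at the same distance are compared. The function name, the raster values, and the coefficient are hypothetical.

```python
import numpy as np

def relative_use(z, alpha2):
    """Use probability of each pixel relative to a pixel of mean elevation
    at the same distance from the activity center.

    Assumes use is proportional to exp(alpha2 * z - d**2 / (2 * sigma**2)),
    so the distance term cancels in the ratio.
    """
    z = np.asarray(z, dtype=float)
    return np.exp(alpha2 * (z - z.mean()))

# Hypothetical standardized elevation raster and a hypothetical coefficient.
elev = np.array([[-1.0, 0.0],
                 [0.5, 0.5]])
rel = relative_use(elev, alpha2=-0.28)
print(rel)
```

With a negative coefficient, pixels below mean elevation have relative use above 1 and pixels above mean elevation have relative use below 1.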

Simulation study

We carried out a simulation study using the landscape shown in Fig. 1, based on populations of size N = 100 and N = 200 individuals with activity centers distributed uniformly over the landscape. Standard SCR data were generated, along with data from 0, 2, 4, 8, 12, and 16 telemetered individuals to assess the improvement in precision as sample size increases. For all cases we observed 20 independent telemetry fixes per individual, assuming individuals were using space according to an RSF model with the same parameters as those generating the SCR data. More details of the simulation study are given in Appendix S3, Supporting information.
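The telemetry part of such a simulation might look like the sketch below. It is not the code used for the study: the raster size, the covariate values, the parameter values, and the multinomial form of the RSF (use proportional to exp(alpha2 * z(x) − d²/(2σ²))) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 20 x 20 raster with a standardized covariate z(x).
side = 20
gx, gy = np.meshgrid(np.arange(side), np.arange(side))
pixels = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
z = rng.normal(size=len(pixels))

alpha2, sigma = 1.0, 2.0          # assumed RSF coefficient and scale parameter
n_telemetered, n_fixes = 4, 20    # 20 fixes per individual, as in the text

def use_probs(center):
    """Multinomial use probabilities over all pixels for one activity center."""
    d2 = ((pixels - center) ** 2).sum(axis=1)
    log_w = alpha2 * z - d2 / (2 * sigma ** 2)
    w = np.exp(log_w - log_w.max())   # subtract the max for numerical stability
    return w / w.sum()

# Activity centers distributed uniformly over the landscape.
centers = rng.uniform(0, side - 1, size=(n_telemetered, 2))

# Pixel-use frequencies: one multinomial draw of n_fixes per individual.
fix_counts = np.array([rng.multinomial(n_fixes, use_probs(c)) for c in centers])
print(fix_counts.shape, fix_counts.sum(axis=1))
```

Each row of `fix_counts` holds the number of fixes of one individual in each pixel, which is the form of data an independent-fix RSF likelihood would use.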

We simulated 500 data sets for each scenario and, for each data set, we fitted three models: (i) the SCR only model, in which the telemetry data were not used; (ii) the integrated SCR/RSF model, which combined all of the data for jointly estimating model parameters; and (iii) the RSF only model, which used the telemetry data alone (and therefore math formula and N are not estimable parameters).

The results are tabulated in Appendices S1–S3, Supporting information. In terms of RMSE for estimating N, we found that, generally, there is about a 5% reduction in RMSE when we have at least two telemetered individuals. Although there is a lot of Monte Carlo (MC) error in the RMSE quantities, the reduction is as much as 10% as the sample size of captured individuals increases under the higher N = 200 setting. This incremental improvement in RMSE of math formula makes sense because, while the telemetry data provide considerable information about the structural parameters of the model, they provide no information about mean p, i.e. math formula, which comes only from the SCR data. Thus estimating N benefits only slightly from the addition of telemetry data.
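The summaries used in this comparison can be computed as in the sketch below. The estimates are made-up numbers chosen only to exercise the functions; they are not results from the simulation study, and the function names are illustrative.

```python
import numpy as np

def rmse(estimates, truth):
    """Root mean squared error of a set of estimates around the true value."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean((estimates - truth) ** 2)))

def relative_bias(estimates, truth):
    """Percent bias of the mean estimate relative to the true value."""
    return 100.0 * (float(np.mean(estimates)) - truth) / truth

def percent_reduction(rmse_base, rmse_new):
    """Percent reduction in RMSE of a new estimator relative to a baseline."""
    return 100.0 * (rmse_base - rmse_new) / rmse_base

# Made-up estimates of N from two estimators (true N = 100), for illustration.
true_N = 100
scr_only = [90, 112, 105, 95]
scr_rsf = [92, 110, 104, 96]

r_base, r_new = rmse(scr_only, true_N), rmse(scr_rsf, true_N)
print(r_base, r_new, percent_reduction(r_base, r_new), relative_bias(scr_rsf, true_N))
```

Because RMSE is itself estimated from a finite number of simulated data sets, small differences between estimators should be read with the Monte Carlo error in mind.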

The MLE of the RSF parameter math formula exhibits negligible or no bias under both the SCR only and SCR/RSF estimators. The largest improvement from the use of telemetry data comes in estimating the parameter σ. We found that there is as much as a 50–60% reduction in RMSE of estimating math formula when the telemetry data are used in the combined estimator. Improvement due to adding telemetry data diminishes as the expected sample size increases, and so telemetry data do less to improve the precision of math formula and math formula for N = 200 than for N = 100. This is because the SCR data alone are informative about both of those parameters.

The results as they concern likelihood estimation of N suggest that there is not a substantial benefit to having telemetry data. Estimators ‘SCR only’ and ‘SCR/RSF’ both appear approximately unbiased for N = 100 and N = 200, and for any sample size of telemetered individuals. The RMSE is improved by only 5–10% with the addition of telemetry information. However, we find that there is substantial bias in math formula if we use the misspecified model that contains no resource selection component. That is, if we leave the covariate z(x) out of the model and incorrectly fit a model with a symmetric and spatially constant encounter model, we see (Appendices S1–S3, Supporting information) about 20% bias in the estimates of N in our limited simulation study. As such, accounting for resource selection is important, even though, when it is accounted for, telemetry data improve the estimator only incrementally. In addition, we found that telemetry data are relatively more important for smaller sample sizes, for example, when N = 100 and encounter probability is lower, which generated data sets with an average size of 37 individuals (see Appendices S1–S3, Supporting information).


How animals use space is of fundamental interest to ecologists, and important in the conservation and management of many species. Normally, space usage is studied using telemetry data and models referred to as resource selection functions (RSFs) (Manly et al. 2002). Although SCR models are increasingly being used for estimating density (Efford 2004; Borchers & Efford 2008; Royle 2008; Efford et al. 2009; Royle et al. 2009; Gardner et al. 2010a, b; Kéry et al. 2011; Sollmann et al. 2011; Gopalaswamy et al. 2012; Mollet et al. 2012), until now, they had not been used to understand space usage. However, it is intuitive that space usage should affect encounter probability and thus it should be highly relevant to density estimation in SCR applications. Despite this, essentially all published applications of SCR models to date have been based on simplistic encounter probability models that are symmetric and do not vary across space. One exception is Royle et al. (2012), who developed SCR models that use ecological distance metrics (‘least-cost path’) instead of normal Euclidean distance. Here we developed an SCR model in terms of a basic underlying model of space or resource use that is consistent with existing views of RSFs (Manly et al. 2002).

In developing the SCR model in terms of an underlying model of space usage, we achieve a number of useful extensions of existing SCR and RSF methods:

  1. We have shown how to integrate classical RSF data from telemetry with SCR data based on individual encounter histories obtained by classical arrays of encounter devices or traps. This leads to an improvement in our ability to estimate density, and also an improvement in our ability to estimate parameters of the RSF. Thus, the combined model is both an extension of standard SCR models and an extension of standard RSF models. As many animal population studies have auxiliary telemetry information, the ability to incorporate such information into SCR studies has enormous applicability and benefits. While adding RSF data to SCR data may increase precision of the MLE of N only incrementally, the effect can be more substantial in sparse data sets and, generally, RSF data produce relatively large gains in precision in the MLE of σ.
  2. We have shown that one can estimate RSF model parameters directly from SCR data alone. While further exploration of this point is necessary, it does establish clearly that SCR models are explicit models of space usage. Because capture–recapture studies are, arguably, more widespread than telemetry studies alone, this greatly broadens the utility and importance of data from those studies.
  3. It is also now clear that one of the important parameters of SCR models, that controlling ‘home range radius’, can be directly estimated from telemetry data alone. The combined RSF+SCR model yields large improvements in estimation of σ. As a practical matter, this suggests we could estimate σ entirely from data extrinsic to the SCR study, which might provide great freedom in the design of SCR studies. For example, traps could be spaced far enough apart to generate relatively few (even no) spatial recaptures, but dramatically increase the coverage of the population, i.e., the observed sample size of captured individuals relative to N.
  4. Finally, we found that an ordinary SCR model with a symmetric encounter probability model produces substantially biased estimates of N (about 20% in our limited simulation study) when the population of individuals exhibits resource selection, consistent with bias induced by heterogeneity in p in ordinary model math formula (Dorazio & Royle 2003). This is because the effect of resource selection by individuals is to induce heterogeneity in encounter probability as a result of heterogeneity in the landscape. As such, it is important to account for space usage when important covariates are known to influence space usage patterns.

Use of telemetry data in capture–recapture studies has been suggested previously. For example, White & Shenk (1999) and Ivan (2012) suggested using telemetry data to estimate the ‘probability that an individual is exposed to sampling’, but their estimator requires that individuals are sampled in proportion to this unknown quantity, which seems impossible to achieve in many studies. In addition, they do not directly integrate the telemetry data with the capture–recapture model so that common parameters are jointly estimated. Sollmann et al. (in revision) used telemetry data to estimate directly the parameter σ from the bivariate normal SCR model in order to improve estimates of density. They recognized that SCR models are models of space usage, but their model did not include an explicit resource selection component.

Black bear application

Resource selection can be described in hierarchical orders (Johnson 1980), from selection of a geographical area (first-order selection), through selection of a home range within a study area (second-order), to selection of resources within that home range (third-order). Animals may select resources at different scales as a result of variability in the distribution of resources on the landscape (Mayor et al. 2009). Indeed, black bears make habitat selection decisions at multiple spatial scales, and decisions made at the second-order can differ from those at the third-order (Lyons et al. 2003; Sadeghpour & Ginnett 2011). As a result of multi-scale resource selection, we can expect that the modeled covariates (elevation in our example) may affect density and space usage differently. We suggest that density is operating at the second-order and is largely related to the spacing of individuals and their associated home ranges across the landscape. On the other hand, our RSF was defined based on selection of resources within the home range (third-order). Because density and our third-order RSF were at different spatial scales, there is no expectation that the modeled covariate describing space usage (elevation) would influence each in a similar manner. Consistent with our positive relationship between elevation and density, the distribution of a black bear population in the central Appalachian Mountains was positively associated with elevation (Frary et al. 2011). At the third-order, however, we observed a negative effect of elevation on space usage. Our study was conducted during summer, and seasonal shifts in elevation have been widely documented in black bears, often attributed to seasonal variation in food availability (Reynolds & Beecham 1980; Graber & White 1983). The negative relationship between elevation and space usage during the summer could be attributable to either access to food resources at lower elevations, or access to river and stream corridors. Within their home ranges, black bears selected areas with high stream densities (Fecske et al. 2002), and in our study area, lower elevations were associated with river corridors, which likely provided bears cooler conditions during the heat of summer.

Extensions and future work

We developed a formal analysis framework here based on marginal likelihood (Borchers & Efford 2008). In principle, Bayesian analysis does not pose any unique challenges for this new class of models, although we expect some loss of computational efficiency due to the increased number of times the components of the likelihood would need to be evaluated. We imagine that some problems would benefit from a Bayesian formulation, however. For example, an open population model that allows for recruitment and survival over time (Gardner et al. 2010a) is convenient to develop in the BUGS language. Similarly, information on unmarked individuals has been incorporated using Bayesian formulations of SCR models (Chandler & Royle in press; Sollmann et al. in revision) but, so far, not using likelihood methods.

Bayesian analysis might have an advantage in situations where the landscape is characterized by a very fine covariate raster, or even continuous covariates, because the individual activity centers can be updated in the MCMC algorithm by evaluating the likelihood conditional on a single candidate value of s for each individual. Conversely, evaluation of the marginal likelihood becomes tedious and memory intensive as the size of the raster increases, and so some effort has to be made to efficiently calculate the likelihood in such cases (e.g., see Warton & Shepherd 2001). Independent of its effect on integration, raster size is itself an important practical concern. Whenever we have explicit spatial covariates, it is possible that selection is occurring at a much finer resolution than is required to effectively calculate the integrated likelihood over the state-space of s. In this case, too coarse a raster will likely cause biased parameter estimates (having an effect analogous to measurement error in regression, we suspect). Too fine a raster, however, increases computing and memory requirements. Choice of raster size or spatial resolution is thus both a fundamental scientific question and very much a practical computing issue.
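The computing cost described here can be seen in a minimal sketch of the marginal likelihood for a single individual, in which the activity center s is summed out over the pixels of a discrete state-space. The complementary log-log encounter model, the uniform prior on s, the parameter names, and the tiny example are assumptions made for illustration, not the exact formulation used in the paper.

```python
import numpy as np

def marginal_loglik_individual(y, K, traps, z_traps, pixels, alpha0, sigma, alpha2):
    """Log marginal likelihood of one encounter history, up to a constant.

    y       : (J,) number of captures of the individual at each of J traps
    K       : number of sampling occasions
    traps   : (J, 2) trap coordinates; z_traps : (J,) covariate at the traps
    pixels  : (G, 2) candidate activity centers (the discrete state-space)
    """
    # Squared distance from every candidate center to every trap: (G, J).
    # This array is what grows with the number of raster pixels G.
    d2 = ((pixels[:, None, :] - traps[None, :, :]) ** 2).sum(axis=2)
    # Assumed complementary log-log encounter model with a covariate effect.
    eta = alpha0 - d2 / (2 * sigma ** 2) + alpha2 * z_traps[None, :]
    p = np.clip(1.0 - np.exp(-np.exp(eta)), 1e-12, 1 - 1e-12)
    # Binomial log-likelihood conditional on each candidate center (binomial
    # coefficients omitted because they do not depend on the parameters).
    cond = (y * np.log(p) + (K - y) * np.log1p(-p)).sum(axis=1)   # (G,)
    # Uniform prior on s, then sum over pixels using the log-sum-exp trick.
    log_terms = cond - np.log(len(pixels))
    m = log_terms.max()
    return float(m + np.log(np.exp(log_terms - m).sum()))

# Tiny hypothetical example: 4 candidate pixels, 2 traps, 5 occasions.
pixels = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
traps = np.array([[0.0, 0.0], [1.0, 1.0]])
ll = marginal_loglik_individual(np.array([2, 0]), 5, traps, np.zeros(2),
                                pixels, alpha0=-1.0, sigma=1.0, alpha2=0.0)
print(ll)
```

The G × J distance array must be evaluated for every individual at every likelihood evaluation, which is why a finer raster raises both memory use and run time, whereas an MCMC update only needs one row of it per individual.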

We developed the model in a discrete landscape which regarded potential trap locations and the covariate z(x) as being defined on the same set of points. In practice, trap locations may have been chosen independent of the definition of the raster and this does not pose any challenge or novelty to the model as we developed it. In that case, the covariate(s) need to be defined at each trap location. The model should be applicable also to covariates that are naturally continuous (e.g., distance-based covariates) although, in practice, it will usually be sufficient to work with a discrete representation of such covariates.

In our formulation of the combined likelihood for RSF and SCR data, we assumed the data from capture–recapture and telemetry studies were independent of one another. This implies that whether or not an individual enters into one of the data sets has no effect on whether it enters into the other data set. We cannot foresee situations in which violation of this assumption should be problematic or invalidate the estimator under the independence assumption. In some cases it might so happen that some individuals appear in both the RSF and SCR data sets. In this case, ignoring that information should entail only an incremental decrease in precision because a slight bit of information about an individual's activity center is disregarded.

We used an RSF model for telemetry data that is most suitable for independent observations of resource selection. This would certainly be reasonable if telemetry fixes are made far apart in time, or if the telemetry data are thinned. We note that this independence assumption is not an assumption of independent movement outcomes and, in general, active resource selection should lead to dependence in geographic location, regardless of how far apart in time the telemetry locations are. We do not know of a test of independence of selection outcomes. However, use of the independence model for non-independent data is probably only a minor problem for estimating density or other model parameters because we expect that the pixel use frequencies should remain unbiased in this case (see footnote 1). We imagine that precision should be over-stated for the parameters of the RSF model because the sample size does not reflect the dependence of the observations. In general, however, it will be desirable to incorporate more general (or explicit) models of movement into the framework proposed here, so that SCR models can be used to improve inferences about animal movement, and because more explicit models may improve inferences about density obtained by capture–recapture studies. As we noted, our specific model of independence corresponds to a limiting case of the Gaussian process movement model (Johnson et al. 2008), but including the general RSF movement model for correlated data from Johnson et al. (2008) should not pose any difficulty in terms of constructing the combined SCR+RSF likelihood (but would contain one additional parameter).
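Thinning of telemetry fixes, mentioned above as one way to approximate independence, can be sketched as a simple greedy filter on fix times. The function name, the minimum gap, and the example times are hypothetical, and the choice of gap is a judgment call rather than something prescribed by the model.

```python
def thin_fixes(times, min_gap):
    """Return indices of fixes to keep so that consecutive kept fixes are at
    least min_gap apart. Times must be sorted ascending, in a consistent unit."""
    kept = []
    last = None
    for i, t in enumerate(times):
        if last is None or t - last >= min_gap:
            kept.append(i)   # keep this fix and measure the next gap from it
            last = t
    return kept

# Hypothetical fix times in hours; keep fixes at least 24 hours apart.
hours = [0, 1, 2, 30, 31, 60, 95]
kept = thin_fixes(hours, min_gap=24)
print(kept)
```

Thinning reduces the number of fixes entering the RSF likelihood, so it trades some apparent precision for a sample size that better reflects the information in the data.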


We conclude that the key benefit of our combined SCR/RSF model is its ability to integrate realistic patterns of space usage directly into SCR models and avoid bias in estimating N and, secondarily, that we are able to obtain RSF information from SCR data alone. Therefore, our new class of integrated SCR/RSF models allows investigators to model how the landscape and habitat influence movement and space usage of individuals within their home range, using non-invasively collected capture–recapture data or capture–recapture data augmented with telemetry data. This should improve our ability to understand, and study, aspects of space usage and it might, ultimately, aid in addressing conservation-related problems such as reserve or corridor design. It should also greatly expand the relevance and utility of SCR beyond simply its use for density estimation.


Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the US Government. Funding and support for the black bear study was provided by the New York State Department of Environmental Conservation.

  1. As a technical matter, we think regular movement models should exhibit an ergodic property analogous to standard Markov chains, time-series models and related dynamical systems.