Keywords:

  • kernel density estimation;
  • space use;
  • spatial statistics;
  • utilization distribution

Summary

  1. Analyses based on utilization distributions (UDs) have been ubiquitous in animal space use studies, largely because they are computationally straightforward and relatively easy to employ. Conventional applications of resource utilization functions (RUFs) suggest that estimates of UDs can be used as response variables in a regression involving spatial covariates of interest.
  2. It has been claimed that contemporary implementations of RUFs can yield inference about resource selection, although to our knowledge, an explicit connection has not been described.
  3. We explore the relationships between RUFs and resource selection functions from a heuristic and simulation perspective. We investigate several sources of potential bias in the estimation of resource selection coefficients using RUFs (e.g. the spatial covariance modelling that is often used in RUF analyses).
  4. Our findings illustrate that RUFs can, in fact, serve as approximations to RSFs and are capable of providing inference about resource selection, but only with some modification and under specific circumstances.
  5. Using real telemetry data as an example, we provide guidance on which methods for estimating resource selection may be more appropriate and in which situations. In general, if telemetry data are assumed to arise as a point process, then RSF methods may be preferable to RUFs; however, modified RUFs may provide less biased parameter estimates when the data are subject to location error.

Introduction

Resource utilization function (RUF) analyses (Marzluff et al. 2004) are widely employed in the study of animal space use and enjoy the advantages of being relatively intuitive and comparatively easy to implement. Based on the estimation of an individual-based ‘utilization distribution’ (UD; e.g. Millspaugh et al. 2006), RUF analyses are commonly intended to obtain inference about the relationship between an animal or population's use of space and the underlying environmental niche. This desired inference is a critical component in the field of ecology (Krebs 1978). In linking the UD to the underlying environment, RUF analyses go beyond home range and core area estimation (e.g. Wilson et al. 2010) and also relate to resource selection analyses, where desired inference pertains to whether the use of resources is disproportionate to those available (Manly et al. 2002). One potential advantage of the RUF approaches is that they may improve selection inference when the telemetry data are subject to measurement error (Millspaugh et al. 2006).

Resource utilization function analyses hold an appeal because of their simplicity, but the specific connection in how they relate to resource selection functions (RSFs) has not been described. In this paper, we attempt to reconcile RUF analysis with RSF analysis and examine several sources of potential bias in doing so. We begin by describing the RUF and RSF analyses as they are traditionally employed. We then explore several potential sources of bias that could affect RUF analyses and use simulation to demonstrate our findings. Finally, we suggest a few simple diagnostics that could be employed in selection analyses and illustrate them using a real data set pertaining to the spatial ecology of mountain lions (Puma concolor) in Colorado, USA.

Resource utilization functions

The conventional perspective in animal space use studies is that the UD is a spatial probability distribution that gives rise to a spatial point process (i.e. the observed telemetry locations). That is, one assumes there is a surface over a spatial domain (inline image) of interest that specifies the likelihood (f) an animal will occur at any given location (s) in the domain. Thus, for a finite set of times at which an animal's location is observed, say t = 1,…,T, we have a statistical model for location where inline image.

The RUF procedure outlined by Marzluff et al. (2004) assumes that the probability distribution f (i.e. the UD) then depends on the underlying environment X (i.e. f(s)≡f(s|X,β)) and adopts a two-stage estimation approach for the coefficients β. The first step in the analysis is concerned with estimating the UD (with say, inline image), while the second stage links the UD to a set of underlying covariates X.

To estimate the UD, a wide variety of density estimation techniques can be employed to find inline image based on the telemetry data (inline image); however, we will focus on kernel density estimation (KDE), because (i) this is a commonly applied technique familiar to many animal ecologists and (ii) Marzluff et al. (2004) employed this approach in their seminal paper on the topic. It should be noted, however, that many of the following results would apply to RUFs based on any form of UD estimation technique.

In KDE, one takes a nonparametric approach to estimating f whereby for any location of interest inline image in the spatial domain inline image, the estimate of the UD is as follows:

  • display math(eqn 1)

where inline image, k represents the kernel (which we assume to be Gaussian) and the parameters inline image and inline image are bandwidth parameters that control the diffuseness of the kernel (Venables & Ripley 2002, Chapter 5). There are various ways to choose the bandwidth parameters, and these are well described in the literature (e.g. Silverman 1986). In practice, the UD, inline image, is estimated for a large but finite set of points (or grid cells, i = 1,…,m) in the spatial domain inline image for the purposes of graphical display or further use in a RUF model.
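As a minimal sketch of the kernel estimator described above, assuming a product of Gaussian kernels with one bandwidth per coordinate and a unit-square domain: the function names (`gaussian_kernel`, `kde_ud`, `ud_on_grid`) and the bandwidth arguments `b1` and `b2` are illustrative choices, not taken from the original analysis.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel k."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde_ud(s, locations, b1, b2):
    """Product-kernel estimate of the UD at the point s = (s1, s2).

    locations: list of (s1, s2) telemetry locations.
    b1, b2: bandwidths for the two coordinates (assumed to be given).
    """
    total = 0.0
    for l1, l2 in locations:
        total += gaussian_kernel((s[0] - l1) / b1) * gaussian_kernel((s[1] - l2) / b2)
    return total / (len(locations) * b1 * b2)

def ud_on_grid(locations, b1, b2, m_side=20):
    """Evaluate the estimated UD at the cell centres of an
    m_side x m_side grid covering the unit square (an assumed domain)."""
    centres = [(i + 0.5) / m_side for i in range(m_side)]
    return [[kde_ud((c1, c2), locations, b1, b2) for c2 in centres]
            for c1 in centres]
```

Larger bandwidths give a smoother surface; the grid values can then serve as the response in a second-stage regression.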

Consider, as an illustration, the situation where there is a single covariate of interest x and telemetry locations are simulated from inline image (Fig. 1). In this case, the coefficients were chosen to provide a positive relationship between the covariate and the UD (i.e. inline image, where inline image only has an effect on the total number of observed telemetry locations T). Figure 1 depicts a large-scale spatial pattern in the covariate where the telemetry data are constrained to the unit square region shown; this constraint serves as the ‘home range’ and could take any shape, but the rectangular shape is used here for display purposes only. We will show that the spatial pattern in the covariate, which is only a function of the spatial arrangement of the landscape, will prove to be an important factor in the spatially explicit models that follow.

A conventional RUF analysis typically proceeds by fitting a linear model with inline image as the response variable and inline image, a p × 1 vector, representing the covariates (i.e. environmental resources) at location inline image. That is, the second stage of the RUF analysis for an individual involves fitting the regression model:

  • display math(eqn 2)

for i=1,…,m and inline image, where the regression coefficients β control the linear relationship between the environmental covariates and the UD, and inline image corresponds to an intercept parameter that is not typically interpreted.
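The second stage can be sketched for the single-covariate case as an ordinary least squares fit of the gridded UD estimate on the covariate. This is only an illustration assuming independent errors; the name `ols_fit` and its arguments are hypothetical.

```python
def ols_fit(x, y):
    """Least squares fit of y_i = b0 + b1 * x_i + error.

    x: covariate value in each grid cell.
    y: response in each grid cell (e.g. the estimated UD).
    Returns (b0, b1). Assumes x is not constant.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx           # slope: estimated effect of the covariate
    b0 = ybar - b1 * xbar    # intercept, not typically interpreted
    return b0, b1
```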

At the individual level, RUF analysis provides inference about the regression coefficients β in terms of significance and possibly subset selection, thereby illuminating the potential environmental influences on space use. In a population-level analysis, where telemetry data exist for multiple individuals (say, inline image for j = 1,…,J individuals) one would index the regression coefficients inline image such that they are labelled for each individual. Then, the focus shifts towards the expectation or variance in coefficient estimates inline image among individuals; for example, we may be interested in learning about inline image for all j = 1,…,J animals. In this latter case, the individual becomes the sample unit and the sample size J most heavily influences the uncertainty concerning inline image.

In implementing the RUF approach described previously, Marzluff et al. (2004) wisely noticed that there may be lurking forms of dependence in the regression errors inline image. They posited that such forms of dependence might arise from the smoothing induced by the KDE approach for estimating the UD (eqn 1) [in addition to other possible sources of latent autocorrelation such as missing covariates in eqn 2]. Marzluff et al. (2004) proposed a geostatistical approach (Cressie 1993) that involves modelling the covariance structure between the errors inline image in a spatially explicit manner. A simple geostatistical model for the RUF analysis is the exponential spatial model given by:

  • display math(eqn 3)

where the numerator in the exponential refers to the Euclidean distance between cell i and cell l, and the denominator ϕ is a range parameter that controls the decay in the spatial structure of ɛ with distance. The two variance components inline image (nugget) and inline image (sill) account for the variance associated with a non-spatially structured and spatially structured source of error, respectively. In matrix notation, the model for the errors is then often expressed as ɛ ∼ N(0,Σ), where inline image and the inline image element of the covariance matrix Σ is equal to (eqn 3). Often, the covariance matrix is written as inline image.
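A covariance matrix of this kind can be built as in the sketch below, which assumes the standard nugget-plus-exponential form: the nugget is added on the diagonal only, and the sill is multiplied by exp(−distance/ϕ). The argument names `nugget`, `sill` and `phi` are placeholders for the symbols in the text, and the exact parameterization of the original equation may differ.

```python
import math

def exp_cov_matrix(coords, nugget, sill, phi):
    """Covariance matrix for errors at the given cell centres.

    coords: list of (s1, s2) cell-centre coordinates.
    nugget: variance of the non-spatially structured error.
    sill:   variance of the spatially structured error.
    phi:    range parameter controlling decay with distance.
    """
    m = len(coords)
    cov = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for l in range(m):
            d = math.hypot(coords[i][0] - coords[l][0],
                           coords[i][1] - coords[l][1])
            cov[i][l] = sill * math.exp(-d / phi)
            if i == l:
                cov[i][l] += nugget  # nugget applies on the diagonal only
    return cov
```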

The conventional procedure used to fit geostatistical models to continuous spatial data involves a multi-step process of first (i) fitting the linear regression model assuming independent errors, then (ii) characterizing the spatial structure in the residuals using variogram estimation (Cressie 1993), and finally (iii) using generalized (or weighted) least squares (GLS) to estimate the regression coefficients (β) while taking into account the correlated errors. Other approaches such as maximum likelihood can also be used, but for simplicity, we retain the GLS method in our simulations.
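Step (ii) of this procedure can be illustrated with a generic empirical semivariogram of the regression residuals. The sketch below uses the classical estimator (half the mean squared difference between residual pairs, binned by distance) as a stand-in; it is not the geoR implementation, and the function name and binning arguments are assumptions made for illustration.

```python
import math

def empirical_semivariogram(coords, resid, bin_width, n_bins):
    """Binned semivariogram of residuals.

    coords: list of (s1, s2) cell-centre coordinates.
    resid:  residuals from the independent-error regression.
    Returns a list of (bin midpoint distance, semivariance) for
    every distance bin that contains at least one pair.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    m = len(coords)
    for i in range(m):
        for l in range(i + 1, m):
            d = math.hypot(coords[i][0] - coords[l][0],
                           coords[i][1] - coords[l][1])
            b = int(d // bin_width)
            if b < n_bins:
                sums[b] += (resid[i] - resid[l]) ** 2
                counts[b] += 1
    return [((b + 0.5) * bin_width, sums[b] / (2.0 * counts[b]))
            for b in range(n_bins) if counts[b] > 0]
```

A parametric model such as the exponential form would then be fitted to these points before the GLS step.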

Resource selection functions

Resource selection is the differential use of resources given those resources available. In describing the conventional approach for estimating RSFs (e.g. Manly et al. 2002; Johnson et al. 2006), we note that most recent applications of RSFs take a weighted distribution approach where the probability distribution of use inline image can be expressed as an updated distribution of availability inline image given the RSF g(x,β), which is usually expressed in an exponential form as g(x,β) = exp(xβ) (although other functional forms are possible, e.g. Lele & Keim 2006). This equivalence between use and the updated version of availability can be written as:

  • display math(eqn 4)

Because the distribution of use inline image is not observed directly, a maximum likelihood approach can be taken to maximize a product over the right-hand side of eqn 4 with respect to β:

  • display math(eqn 5)

Various tricks can be employed to maximize eqn 5 without having to analytically solve the integral in the denominator (e.g. Johnson et al. 2006; Lele 2009). The most common approach involves taking a ‘background’ sample (sometimes referred to as an availability sample) of locations from inline image and labelling those as zeros in a binary response vector with the ones corresponding to the observed telemetry locations. A logistic regression is then fit to the binary data using the covariates at all of the used and available locations. Under certain conditions, the parameter estimates inline image have been shown to be equivalent to those obtained by maximizing eqn 5. Incidentally, Warton and Shepherd (2010) and Aarts et al. (2012) have recently shown that maximizing eqn 5 is equivalent to maximizing the likelihood of an inhomogeneous spatial point process for the purpose of estimating β. Furthermore, Aarts et al. (2012) show that the required maximization can be achieved using a Poisson generalized linear model (GLM), with an offset term corresponding to availability.
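For orientation, the weighted-distribution relationship and the likelihood built from it are commonly written in the form below. The symbols f_u (use), f_a (availability) and x(s_t) (covariates at the t-th telemetry location) are notation assumed here for illustration, so the original display equations may differ in detail.

```latex
f_u(\mathbf{x}) =
  \frac{g(\mathbf{x},\boldsymbol\beta)\, f_a(\mathbf{x})}
       {\int g(\mathbf{x},\boldsymbol\beta)\, f_a(\mathbf{x})\, d\mathbf{x}},
\qquad
L(\boldsymbol\beta) = \prod_{t=1}^{T}
  \frac{g(\mathbf{x}(\mathbf{s}_t),\boldsymbol\beta)\, f_a(\mathbf{x}(\mathbf{s}_t))}
       {\int g(\mathbf{x},\boldsymbol\beta)\, f_a(\mathbf{x})\, d\mathbf{x}},
\qquad
g(\mathbf{x},\boldsymbol\beta) = \exp(\mathbf{x}'\boldsymbol\beta).
```

The integral in the denominator is the quantity that the background sample approximates.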

To fit the Poisson GLM, one bins the telemetry locations into a large set of grid cells spanning the spatial domain inline image, and the resulting response variable inline image (for i = 1,…,m grid cells) consists of cell counts where the model is expressed as inline image, and a log link is used to model the intensities inline image:

  • display math(eqn 6)

where if the availability weights inline image are all equal (i.e. even availability within the region inline image), then this procedure becomes a regular Poisson log-linear regression of the cell counts on the covariates without weights. In what follows, we set all inline image; however, if inline image are set to be the area of the grid cells, then inline image can be interpreted as the average number of points per unit area.
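The binning and Poisson log-linear fit can be sketched as follows for one covariate with all availability weights equal to one (so the offset vanishes). The analysis in the text used R's `glm`; this pure-Python Newton-Raphson routine is a stand-in for illustration, and the names `bin_counts` and `fit_poisson_glm` are hypothetical.

```python
import math

def bin_counts(locations, m_side):
    """Count telemetry locations in each cell of an m_side x m_side
    grid on the unit square (assumed domain); cells are row-major."""
    counts = [0] * (m_side * m_side)
    for s1, s2 in locations:
        i = min(int(s1 * m_side), m_side - 1)
        j = min(int(s2 * m_side), m_side - 1)
        counts[i * m_side + j] += 1
    return counts

def fit_poisson_glm(x, y, n_iter=50, tol=1e-10):
    """Maximum likelihood for log(lambda_i) = b0 + b1 * x_i.

    x: covariate per cell; y: count per cell.
    Assumes x is not constant and at least one count is positive.
    """
    b0 = math.log(max(sum(y) / len(y), 1e-8))
    b1 = 0.0
    for _ in range(n_iter):
        lam = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector
        g0 = sum(yi - li for yi, li in zip(y, lam))
        g1 = sum((yi - li) * xi for yi, li, xi in zip(y, lam, x))
        # Fisher information (2 x 2)
        h00 = sum(lam)
        h01 = sum(li * xi for li, xi in zip(lam, x))
        h11 = sum(li * xi * xi for li, xi in zip(lam, x))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1
```

Here `b1` plays the role of the selection coefficient, while `b0` absorbs the overall number of locations.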

Reconciling RUFs and RSFs

From one perspective, some might argue that the big difference between the RUF and RSF analyses is that a UD (i.e. inline image) is estimated prior to fitting the RUF model, whereas in the RSF approach, the UD is implicitly estimated as a function of the spatial covariates (i.e. inline image) based on the data directly. On the other hand, even though it may not be obvious, the Poisson regression employed to fit the RSF is also estimating the UD first as a 2-D spatial histogram (i.e. inline image, for i = 1,…,m) at the scale of the underlying grid. In this sense, the grain size (i.e. inline image) of the cells in the grid over which the telemetry locations are summed is equivalent to the bandwidth parameters in the KDE for the RUF approach. That is, if inline image increases, then inline image becomes a smoother process over inline image (similar to increasing the bandwidth in the KDE). In both cases, as the smoothness in the estimated point process density increases, it yields a more biased density estimate; however, it also decreases the variance; therefore, the choice in the amount of smoothing to apply involves some notion of optimality.

Perhaps, a bigger concern is how the RUF fitting procedure affects the estimation of selection coefficients β, as these coefficients are typically our main focus. When population-level inference is desired, some have argued that the uncertainty associated with our knowledge of inline image for individual animals j = 1,…,J, is a minor concern compared with the sample size of individuals J (e.g. Otis & White 1999); for this reason, we focus only on bias in the estimation of inline image at the individual-level herein. That is, individual-level bias will have the biggest and most dubious effect on population-level inference when the number of telemetered individuals is large; thus, it is our focus here.

In an examination of RUFs and RSFs, we discovered the following important differences between methods when used to estimate resource selection:

  1. The use of inline image instead of inline image in conventional RUFs.
  2. The characterization of availability via the choice of inline image.
  3. The marginal smoothing induced by the UD KDE.
  4. The pattern of covariates in the spatial RUF.
  5. The possibility of location error in telemetry data.

We discuss each of these items in turn, providing some insight into how they play a role in the estimation of resource selection, and we also suggest some modifications for reconciling RUFs and RSFs.

The use of inline image instead of log inline image in conventional RUFs

Based on the assumptions of the Poisson point process model, the density f and intensity λ are related by: inline image, where the denominator is the expected number of points in the study area inline image. Thus, the Poisson intensities inline image governing the point process (i.e. telemetry data) are proportional to the densities inline image being modelled in the RUF analysis. That is,

  • display math(eqn 7)

where the ‘const’ term is related to the number of telemetry locations T in the data set. Thus, because of the two-stage fitting procedure in the RUF analysis, it would be considered an approximation to the RSF analysis if the log transformation were applied to the estimated density function inline image (at least in terms of estimating β). That is, if the second stage (eqn 2) of the RUF model were modified such that

  • display math(eqn 8)

where inline image implicitly includes ‘−log(const)’, then the main difference between the RSF (eqn 6) and the RUF (eqn 8) would be the Poisson instead of Gaussian error, respectively. Furthermore, from a practical perspective, the log transformation expands the support of the response variable in the RUF model (eqn 8) from the positive to the real numbers. Thus, in the remainder of the article, we refer to eqn 8 as the RUF model and examine its properties.
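The step from intensity to log density can be written out as follows, assuming equal availability weights so that the log intensity is linear in the covariates. The symbols c (the integrated intensity, i.e. the ‘const’ term) and β₀* (the shifted intercept) are notation assumed here, not taken from the original equations.

```latex
\begin{aligned}
f(\mathbf{s}_i) &= \frac{\lambda(\mathbf{s}_i)}{\int_{\mathcal{S}} \lambda(\mathbf{s})\, d\mathbf{s}}
                 = \frac{\lambda(\mathbf{s}_i)}{c}, \\
\log f(\mathbf{s}_i) &= \log \lambda(\mathbf{s}_i) - \log c
                      = (\beta_0 - \log c) + \mathbf{x}_i'\boldsymbol\beta, \\
\log \hat f(\mathbf{s}_i) &= \beta_0^{*} + \mathbf{x}_i'\boldsymbol\beta + \varepsilon_i,
\qquad \beta_0^{*} = \beta_0 - \log c .
\end{aligned}
```

Only the intercept changes under the log transformation, which is why the slope coefficients β remain comparable between the two approaches.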

The characterization of availability via the choice of inline image

Recall that resource selection is the degree of use given resource availability. If RUFs are approximations to RSFs, then how does availability play a role in RUF analyses? A surprising amount of variation in the estimation of β can be observed by simply changing how the background sample is taken. This background sample provides a Monte Carlo approximation of the integral in the weighted distribution (eqn 4, and associated point process model), and the spatial extent of the integral (inline image) is what controls availability in the RSF under the assumption of uniform availability in that region. Millspaugh et al. (2006) recommend defining inline image based on the UD itself. We agree that areas outside of the natural availability to the individual animal should not be considered in RSF analyses. The region of potential space use or home range is typically thought to be a function of external and/or internal biological forces either constraining (e.g. territorial behaviour) or attracting (e.g. central place foragers) movement. Thus, assuming uniform availability over inline image, both RSF and RUF analyses account for availability simply by limiting the spatial support of the response variable in the model in question. If one takes a Poisson GLM approach to fitting a RSF, then the extent of the grid over which the telemetry locations are counted acts as the spatial support in the model and, in the case of the RUF (eqn 8), it is the grid over which the UD is estimated. Given that overly conservative availability extents can cause a dramatic bias in the results, the recommendation by Millspaugh et al. (2006) to use a large isopleth of the estimated UD is sensible.

The marginal smoothing induced by the UD KDE

As Marzluff et al. (2004) point out, there is an inherent marginal (i.e. not explicitly considering the covariates) smoothing that is induced in the estimation (eqn 1) of the point process density inline image based on the telemetry locations inline image. It is not easy to see how this smoothing manifests itself when the log UD (eqn 8) is used as a response variable in the RUF model because of the complex nature of the KDE procedure. However, we can write out a heuristically similar model that is based on smoothing the response variable directly. In this case, to simplify the notation, let y represent a non-smoothed representation of the log UD, then suppose the log UD is generated as inline image. Now, if we apply a linear smoother to the log UD (Wy) that is based on a weighting of the y at all locations, then using the properties of a multivariate normal distribution, we have the correct model for the smoothed log UD:

  • display math(eqn 9)

Using similar notation, the RUF model (eqn 8) is akin to inline image. Thus, if the log UD is obtained via marginal smoothing of the point process, the RUF model (eqn 8) is misspecified. In fact, a more appropriate specification would be similar to that presented in eqn 9. The problem is that we do not know the exact form of the smoother matrix W, and it will vary with the choice of marginal density estimator.
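The multivariate normal property invoked here is the standard result for linear transformations. Assuming the unsmoothed log UD has independent Gaussian errors with variance σ² (a notational assumption, since the original symbols are not recoverable), the contrast between the correct and the fitted model can be written as:

```latex
\mathbf{y} \sim \mathrm{N}(\mathbf{X}\boldsymbol\beta,\ \sigma^2\mathbf{I})
\;\Longrightarrow\;
\mathbf{W}\mathbf{y} \sim \mathrm{N}(\mathbf{W}\mathbf{X}\boldsymbol\beta,\ \sigma^2\mathbf{W}\mathbf{W}'),
\qquad \text{whereas the RUF fit assumes} \qquad
\mathbf{W}\mathbf{y} \sim \mathrm{N}(\mathbf{X}\boldsymbol\beta,\ \sigma^2\mathbf{I}).
```

The smoother therefore enters both the mean and the covariance of the correct model, while the fitted model ignores it in both places.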

The effect of using the misspecified RUF model (eqn 8) on the estimation of the selection coefficients β is that the smoothing operator will be applied to the log UD but not to the mean field Xβ, hence inducing a bias in inline image. This implies that regardless of whether ordinary least squares (OLS) or GLS is used to estimate β, we will obtain biased selection coefficients. A possible remedy for this situation, because the exact form of W is unknown, is to try to induce a similar operator on Xβ by simply smoothing the covariates X before fitting the model; this yields the model

  • display math(eqn 10)

This would not yield the correct model (eqn 9), but it would be an improvement. If the post hoc smoother inline image could be written as a linear smoother, then it could also be easily employed in the covariance structure yielding the model inline image. Alternatively, one could assume that a second-order covariance matrix estimated from the data in the geostatistical sense would serve as an approximation. This latter modification would yield:

  • display math(eqn 11)

a model quite similar to the spatial RUF proposed by Marzluff et al. (2004), but with covariates smoothed to the same degree as the log UDs.
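One way to smooth a gridded covariate before the regression is a row-normalized Gaussian kernel smoother, sketched below. The choice of smoother, the function name `smooth_covariate` and the `bandwidth` argument are assumptions for illustration; the text does not specify the form of the post hoc smoother.

```python
import math

def smooth_covariate(coords, x, bandwidth):
    """Apply a linear kernel smoother to a covariate on a grid.

    coords:    list of (s1, s2) cell-centre coordinates.
    x:         covariate value in each cell.
    bandwidth: Gaussian kernel standard deviation (assumed given).
    Each smoothed value is a weighted average of all cell values,
    with weights that sum to one.
    """
    smoothed = []
    for a1, a2 in coords:
        w = [math.exp(-0.5 * ((a1 - c1) ** 2 + (a2 - c2) ** 2) / bandwidth ** 2)
             for c1, c2 in coords]
        total = sum(w)
        smoothed.append(sum(wi * xi for wi, xi in zip(w, x)) / total)
    return smoothed
```

The smoothed values would replace the raw covariate in the log UD regression, with the bandwidth chosen to match the degree of smoothing in the UD estimate.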

The pattern of covariates in the spatial RUF

The previous sections show, at least heuristically, how a modified version of the RUF analysis could be considered as an approximation of the RSF analysis. Continuing the example using our simulated data from Fig. 1, we fit the linear model in eqn 8 assuming independent errors and then estimated the variogram and modelled it using the exponential form of spatial structure (eqn 3) previously discussed. The resulting variogram fit (Fig. 2) indicated that residual autocorrelation exists in our data even though it was simulated based on the relationship with the covariate alone. As Marzluff et al. (2004) suggest, this residual autocorrelation is likely due to the smoothing induced by the KDE of the UD. Because latent spatial autocorrelation exists in our simulated data, we would be wise to account for it so that we may obtain accurate inference about the parameters β in the RUF.

We make a slight modification to the specification of the spatially explicit RUF model such that, using matrix notation, we now have:

  • display math(eqn 12)

where each of the vectors is concatenated over all cells in inline image, and the original error vector inline image is now split into two pieces ɛ = Hz+η; the first (i.e. Hz) controlling the spatial dependence and the second (i.e. η) accounting for any unstructured error. In fact, the spatially correlated errors arise from a normal distribution inline image where the precision matrix τQ is the inverse of the former covariance matrix (i.e. inline image) and the unstructured errors η are independent and identically normal such that inline image. This reparameterization makes it easier to illustrate how second-order spatial dependence can impose a bias on the estimates of β.

From eqn 12, it is apparent that the model contains two sets of covariates (i.e. X and H). This implies that the covariates (i.e. columns) in H are spatial maps that may influence the log UD depending on a new set of regression coefficients z. It can be shown that these ‘spatial maps’, acting as unobserved covariates, are actually eigenvectors of the aforementioned Q in the precision matrix, where Q=HΛH′ (Clayton 1993; Paciorek 2010). In other words, the spatial structure imposed by the geostatistical model (eqn 3) implies that there is an entire set of covariates in our model aside from those measured environmental variables X! The parameters z then act as regression coefficients that control the relative importance of the latent covariates in H for predicting the log UD. Further, it can be shown that inline image are random effects, where Λ is the diagonal eigenvalue matrix resulting from the spectral decomposition of Q. The subtle but important consequence of having additional covariates H in the model is that they may be collinear with the known environmental covariates X. This is potentially a big problem that is well described in the statistical literature (e.g. Clayton 1993; Reich et al. 2006; Hodges & Reich 2010; Paciorek 2010), although it has received little attention in the ecological literature.
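The link between the random effects z and the spatially correlated error Hz can be checked with a short derivation. It assumes that H is orthogonal, that Λ has positive diagonal entries and that z is Gaussian with precision τΛ; these distributional forms are assumptions consistent with the description above rather than a transcription of the original equations.

```latex
\mathbf{Q} = \mathbf{H}\boldsymbol\Lambda\mathbf{H}', \quad
\mathbf{H}'\mathbf{H} = \mathbf{H}\mathbf{H}' = \mathbf{I}, \quad
\mathbf{z} \sim \mathrm{N}\!\left(\mathbf{0},\ (\tau\boldsymbol\Lambda)^{-1}\right)
\;\Longrightarrow\;
\mathrm{Var}(\mathbf{H}\mathbf{z})
  = \mathbf{H}(\tau\boldsymbol\Lambda)^{-1}\mathbf{H}'
  = (\tau\,\mathbf{H}\boldsymbol\Lambda\mathbf{H}')^{-1}
  = (\tau\mathbf{Q})^{-1}.
```

Eigenvectors paired with small eigenvalues receive large prior variance for their coefficients, so those smooth, large-scale columns of H are the ones most able to compete with a large-scale covariate in X.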

In our continued example with the simulated data shown in Fig. 1, we have computed the implied spatial covariate matrix H based on the variogram fit in Fig. 2 and illustrate the correlation with our covariate x using a few of the most important eigenvectors in H (Fig. 3). These three eigenvectors represent the second, third and fourth most important spatial patterns implied by the autocorrelation (Fig. 2) in the residuals of our simulated data. As implied spatial covariates in H, each indicates an absolute correlation of approximately 0·5 with our simulated covariate x.

Figure 1. (a) Spatial covariate x, (b) simulated telemetry locations inline image, for t = 1,…,400, and (c) the log transformed KDE representing the estimated UD based on the simulated data.

Figure 2. Semi-variogram (points) and weighted least squares fit (line) of the exponential covariance model (eqn 3) resulting from the residuals of the linear regression using our simulated data.

Figure 3. (a) Second most important eigenvector in H; correlation with x is 0·503 (b) third most important eigenvector in H; correlation with x is −0·504, and (c) fourth most important eigenvector in H; correlation with x is 0·468.

Several potential modifications have been suggested to alleviate the bias induced by collinearity between the covariates and spatially correlated errors (e.g. Reich et al. 2006; Hodges & Reich 2010; Hughes & Haran 2013); however, each of them would ‘correct’ the selection coefficient estimates such that they are exactly equal to those from the non-spatial model fit using OLS. The suggested modifications (i.e. spatially restricted regression) have the additional effect of appropriately adjusting the variance of the estimators, but because we are primarily concerned with the bias here, we refer the interested reader to the literature cited above.

The effect of location error in telemetry data

Hepinstall et al. (2004) hint that RUF methods were developed as an ad hoc procedure for fitting point process models. Before it was recognized that resource selection parameters could be estimated using readily available GLM fitting software, the required integration in the point process likelihood (eqn 4) made it challenging to fit point process models directly. If RSF methods are now just as accessible as RUFs, given their relationship, then what, if anything, do we gain when analysing telemetry data using the RUF approach? In this light, Millspaugh et al. (2006) claim that the RUFs are better able to handle measurement error in the telemetry data (i.e. with less bias). It seems reasonable that the marginal smoothing would help account for noise in the data; thus, using simulation, we evaluate this claim in the following section.

Data analysis

Simulation study

We constructed a large simulation study to empirically verify the differences among the various methods for estimating resource selection. In doing so, we used a range of covariates (scaled to have mean zero and variance 1 on a 20 × 20 regular grid) from small-scale to large-scale, we varied the sample size from 25 to 400 independent telemetry points resulting from the intensity surface defined by inline image, we varied the bandwidth in the KDE of the UD, and we used 3 different levels of measurement error by adding Gaussian noise to the simulated telemetry locations (with a variance of 0, 0·05 and 0·1, respectively). Further, we used a range of selection coefficient values from 0 to 2 and compared each of the following estimation procedures where the modified RUFs incorporate a degree of covariate smoothing that best improves the model fit (via inline image), and when we use the term ‘spatial’ here we are referring to the explicit spatially structured covariance version of the model:

  1. PGLM: Poisson GLM described in Section ‘Resource selection functions’ and eqn 6.
  2. NSRUF: non-spatial RUF described in Section ‘The use of inline image instead of log inline image in conventional RUFs’ and eqn 8 assuming independent and identically distributed errors inline image.
  3. SRUF: spatial RUF described in Section ‘The use of inline image instead of log inline image in conventional RUFs’ and eqn 8 assuming correlated errors following the model (eqn 3).
  4. NSMRUF: non-spatial modified RUF described in Section ‘The marginal smoothing induced by the UD KDE’ and eqn 10 assuming independent and identically distributed errors inline image.
  5. SMRUF: spatial modified RUF described in Section ‘The marginal smoothing induced by the UD KDE’ and eqn 11 assuming correlated errors following the model (eqn 3).

All analyses were carried out using the R Statistical Computing Environment (R Core Team 2012; with the R function ‘glm’ and the functions ‘variog’ and ‘variofit’ from the ‘geoR’ package; Ribeiro & Diggle 2001). A subset of the results from the simulation study is presented in Table 1. The biases reported in Table 1 were approximated using 1000 simulations of point processes with a sample size of 100, inline image, large-scale covariates (i.e. range of spatial structure was approximately two-thirds of the maximum distance in the spatial domain), the plug-in bandwidth for the KDE estimate of the UD and over three different levels of location error in the data (i.e. none, small and moderate). To maintain the same sample size in each realization of the point process, we used an inflated value for inline image and then thinned the simulated points. An alternative simulation approach would be to choose inline image such that the desired sample size was merely the expected number of points (i.e. inline image), but this would not maintain a constant sample size across simulations.
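A fixed-size sample from a log-linear intensity surface can be sketched as below. Note the substitution: instead of the inflate-and-thin procedure described above, this sketch draws exactly n grid cells with probability proportional to exp(β₁x) and places each point uniformly within its cell, which likewise holds the sample size constant. The function names, the unit-square grid and the isotropic Gaussian location error are assumptions for illustration.

```python
import math
import random

def simulate_locations(x, beta1, n, m_side, rng):
    """Draw n locations on the unit square from intensity
    proportional to exp(beta1 * x) on an m_side x m_side grid.

    x:   covariate per cell, row-major (length m_side ** 2).
    rng: a random.Random instance, for reproducibility.
    """
    weights = [math.exp(beta1 * xi) for xi in x]
    cells = rng.choices(range(len(x)), weights=weights, k=n)
    locs = []
    for c in cells:
        i, j = divmod(c, m_side)
        # Jitter uniformly within the selected cell
        locs.append(((i + rng.random()) / m_side,
                     (j + rng.random()) / m_side))
    return locs

def add_location_error(locs, sd, rng):
    """Add independent Gaussian noise with standard deviation sd
    to both coordinates of every location."""
    return [(s1 + rng.gauss(0.0, sd), s2 + rng.gauss(0.0, sd))
            for s1, s2 in locs]
```

Repeating the simulation and estimation many times and averaging the difference between each estimate and the true coefficient gives a Monte Carlo approximation of the bias.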

Table 1. The first three columns display the results of the simulation study showing the bias incurred when estimating the resource selection coefficient inline image using each of the methods under varying amounts of location error. The small and moderate location error corresponds to an additive symmetrical error with standard deviation of inline image and inline image of the maximum distance in the spatial domain, respectively. Bias values close to zero indicate the method is relatively unbiased for estimating resource selection. The last column shows the resource selection parameter estimates for the mountain lion data under the different methods. It is important to note that the last column displays the estimates themselves, whereas the previous three columns represent bias values

          Bias, by amount of location error     Estimate
Method    None       Small      Moderate        Mountain lion (inline image)
PGLM      −0·008     −0·246     −0·402          −0·242
NSRUF     −0·288     −0·343     −0·428          −0·109
SRUF      −0·931     −0·943     −0·963           0·001
NSMRUF    −0·007     −0·077     −0·182          −0·317
SMRUF     −0·103     −0·175     −0·361          −0·145

The results presented in Table 1 hold generally across the full range of simulations performed and are representative of a broad range of scenarios. Overall, the most obvious pattern we notice is that the SRUF is the most biased method for estimating resource selection across all scenarios; we attribute this to two sources of bias: the marginal smoothing of the UD and the potential spatial confounding. The NSMRUF performs the best across all scenarios shown in Table 1; however, it was not unbiased in all simulations (not shown here), but it was always the second best method compared with the PGLM. In cases where there is measurement error, the NSMRUF and SMRUF do quite well in terms of bias, although no method stays completely unbiased when location error is present. The SRUF and SMRUF appear to pick up an additional source of bias that does not affect the non-spatial estimation procedures. Based on the literature and the high degree of correlation with the second-order spatial error (which was nearly always greater than 0·6, indicating collinearity), we suspect this additional bias may be caused by spatial confounding (e.g. Hodges & Reich 2010). Overall, these simulations support the arguments made in Section ‘Reconciling RUFs and RSFs’ concerning the differences between methods and possible sources of bias.

Mountain lion data

To illustrate a diagnostic approach for performing a resource selection analysis using non-simulated data, we consider an individual mountain lion and a single covariate. The telemetry data consist of global positioning system (GPS) locations at a fairly regular fix interval of approximately 3 h, collected in an ongoing Colorado Parks and Wildlife (CPW) monitoring effort. We focused on a single individual (# AF50, an adult female) to demonstrate a potential diagnostic procedure for determining the best resource selection approach for inference. We thinned the original data, keeping only those points greater than 10 days apart to alleviate any concerns due to temporal autocorrelation (e.g. Swihart & Slade 1985). We used the topographical covariate of solar exposure (i.e. modified Beers' aspect transform; Beers et al. 1966) for the analyses, as this is a potentially important resource on the Colorado Front Range for these large carnivores (Fig. 4).
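The two preprocessing steps described above can be sketched as follows. This is only an illustration under stated assumptions: `thin_fixes` uses a greedy rule (keep a fix when it is more than 10 days after the last kept fix), which is one possible reading of the thinning described, and `beers_aspect` is the standard Beers-type form cos(reference − aspect) + 1 with a 45° reference. The modified transform used in the analysis is not specified here and may differ. All names and the synthetic fixes are hypothetical.

```python
import math
from datetime import datetime, timedelta

def thin_fixes(fixes, min_gap=timedelta(days=10)):
    """Keep a fix only if it is more than min_gap after the last kept fix.

    fixes is an iterable of (timestamp, x, y) tuples.
    """
    kept = []
    for t, x, y in sorted(fixes):
        if not kept or t - kept[-1][0] > min_gap:
            kept.append((t, x, y))
    return kept

def beers_aspect(aspect_deg, reference_deg=45.0):
    """Beers-type aspect index in [0, 2]; equals 2 at the reference aspect."""
    return math.cos(math.radians(reference_deg - aspect_deg)) + 1.0

# Synthetic fixes every 3 h for 30 days (assumption, not the AF50 data).
start = datetime(2012, 1, 1)
fixes = [(start + timedelta(hours=3 * i), float(i), 0.0) for i in range(241)]
thinned = thin_fixes(fixes)
```

With a 3 h fix interval, a rule like this retains only a small fraction of the original locations, so the trade-off between independence and sample size is worth checking before fitting any of the models.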

Figure 4. The (a) mountain lion KDE log UD, (b) spatial covariate x (exposure), (c) smoothed covariate, and (d) spatial eigenvector that correlates most strongly with the covariate (correlation: −0·5).

In using each of the methods to estimate resource selection, we found great variability in the point estimates of the resource selection coefficient (Table 1; far right column). The SRUF behaved much as it did in our simulations, producing an estimate far from any of the others (and positive), while the SMRUF estimate seemed to improve (but was still not equivalent to those from the other methods). There also appeared to be a slight difference between the estimates using the NSMRUF and the PGLM, both of which performed well in our simulations where no location error was present. Given that these data were GPS telemetry locations, we would not expect location inaccuracy at the scale of our covariates. However, because the relationship between exposure and the UD may not be causal (which is not explored in our simulations), there may be missing covariates that could help explain resource selection. As a substitute for measurement error, this type of misspecification error may be accounted for in the RUF (but not in the RSF), although this is mostly speculative. It should be noted that the NSMRUF, SMRUF and PGLM all indicate a negative effect of exposure on selection, implying that this individual is selecting for more protected aspects.

Both functions are trivial to estimate, but with the added smoothing for the covariates in the NSMRUF and SMRUF, the GLM is slightly more straightforward. It is clear that using the spatial models for this data set is not advised because of the additional bias. Further simulation based on the exposure covariate and a range of coefficient values encompassing those estimated here could provide additional guidance as to whether the RUF or RSF provides less biased resource selection inference. However, in this scenario, with no obvious source of measurement error, we would choose the RSF point process model (i.e. PGLM) for inference.

Conclusion

In examining the properties of both RUFs and RSFs, we find that the RSF is generally preferred because it is slightly easier to implement and yields unbiased inference about selection coefficients when no measurement error exists in the telemetry data. However, when there is location uncertainty in the data, a modified version of the RUF can outperform the traditional RSF by estimating selection coefficients with less bias. This advantage was mentioned by Millspaugh et al. (2006) but was not demonstrated, nor was the RUF reconciled with the RSF to provide inference about the same coefficients.

The residuals resulting from RUF models will typically indicate latent spatial autocorrelation, and normally it would be a good idea to account for this; however, when using large-scale covariates, there is a high likelihood of multicollinearity between the covariates and second-order spatial structure (i.e. spatial eigenvectors), inducing a bias in the resource selection coefficients. Thus, the spatial RUFs do not seem to provide valid inference about resource selection, at least in the conventional sense and in the range of scenarios we simulated.

Overall, it is evident that the original RUFs do, in fact, attempt to model some form of resource selection but that, without some modification, the coefficients obtained could not be expected to be comparable with those arising from fitting an RSF. Perhaps the biggest finding we offer, aside from the potential spatial confounding induced by the second-order structure in the SRUF and SMRUF, is that the RUF approach can be modified (NSMRUF) such that it is a better estimator of resource selection (in terms of bias) than the traditional RSF when the data are subject to measurement error. This may be valuable in the analysis of VHF or ARGOS satellite telemetry data (which usually have more location uncertainty than GPS data). Although we have focused on simpler models herein, an alternative framework could be constructed to explicitly account for any measurement error when making inference about resource selection (e.g. Johnson et al. 2008).

Finally, as a reminder, we note that the general RUF approach requires a two-stage procedure in which the 'response' variable (i.e. the KDE) is first estimated using the original data and is then statistically linked to covariates in a second-stage analysis. As with all two-stage analyses, a potential shortcoming of the approach is that the uncertainty associated with the estimated density surface in the first stage is not accommodated in the second-stage analysis. One way to remedy this would be to employ a bootstrapping, data augmentation or multiple imputation procedure to help account for any uncertainty in the KDE; however, at that point, one could argue that the RUF method may have lost its simple and straightforward appeal.
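The bootstrapping remedy mentioned above can be sketched as follows. This is a toy illustration under stated assumptions, not the analysis performed here: it uses a 1-D transect, a fixed-bandwidth Gaussian KDE, and a simple non-spatial regression of the log KDE on a single covariate as the second stage. The function names, the bandwidth value and the simulated locations are all hypothetical.

```python
import math
import random
import statistics

def kde(points, grid, bandwidth):
    """Gaussian kernel density estimate of 1-D points evaluated on grid."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points) / norm
        for g in grid
    ]

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_stage_slope(points, grid, covariate, bandwidth):
    """Stage 1: KDE on the grid. Stage 2: regress log KDE on the covariate."""
    log_ud = [math.log(d) for d in kde(points, grid, bandwidth)]
    return ols_slope(covariate, log_ud)

def bootstrap_slopes(points, grid, covariate, bandwidth, n_boot, rng):
    """Repeat both stages on locations resampled with replacement."""
    slopes = []
    for _ in range(n_boot):
        resample = [rng.choice(points) for _ in points]
        slopes.append(two_stage_slope(resample, grid, covariate, bandwidth))
    return slopes

rng = random.Random(1)
grid = [i / 20 for i in range(21)]                    # grid cells on [0, 1]
covariate = list(grid)                                # toy covariate: position
points = [rng.betavariate(2, 4) for _ in range(60)]   # simulated locations

point_estimate = two_stage_slope(points, grid, covariate, 0.15)
slopes = bootstrap_slopes(points, grid, covariate, 0.15, 200, rng)
boot_sd = statistics.stdev(slopes)  # spread reflects first-stage uncertainty
```

Because the KDE is recomputed for every resample, the spread of `slopes` carries the first-stage uncertainty into the second stage, at the cost of refitting both stages many times.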

Acknowledgements

Funding for this project was provided by Colorado Parks and Wildlife (#1201). The use of trade names or products does not constitute endorsement by the U.S. Government.

References

  • Aarts, G., Fieberg, J. & Matthiopoulos, J. (2012) Comparative interpretation of count, presence-absence and point methods for species distribution models. Methods in Ecology and Evolution, 3, 177–187.
  • Beers, T., Dress, P. & Wensel, L. (1966) Aspect transformation in site productivity research. Journal of Forestry, 64, 691–692.
  • Clayton, D., Bernardinelli, L. & Montomoli, C. (1993) Spatial correlation in ecological analysis. International Journal of Epidemiology, 22, 1193–1202.
  • Cressie, N.A.C. (1993) Statistics for Spatial Data. John Wiley & Sons, Inc., New York, USA.
  • Hepinstall, J.A., Marzluff, J.M., Handcock, M.S. & Hurvitz, P. (2004) Incorporating resource utilization distributions into the study of resource selection: dealing with spatial autocorrelation. Resource Selection Methods and Applications (ed. S. Huzurbazar), pp. 12–19. Omnipress, Madison, WI.
  • Hodges, J.S. & Reich, B.J. (2010) Adding spatially-correlated errors can mess up the fixed effect you love. The American Statistician, 64, 325–334.
  • Hughes, J. & Haran, M. (2013) Dimension reduction and alleviation of confounding for spatial generalized linear mixed models. Journal of the Royal Statistical Society: Series B, Statistical Methodology, 75, 139–159.
  • Johnson, C.J., Nielson, S.E., Merrill, E.H., McDonald, T.L. & Boyce, M.S. (2006) Resource selection functions based on use-availability data: theoretical motivation and evaluation methods. Journal of Wildlife Management, 70, 347–357.
  • Johnson, D.S., Thomas, D.L., Ver Hoef, J.M. & Christ, A. (2008) A general framework for the analysis of animal resource selection from telemetry data. Biometrics, 64, 968–976.
  • Krebs, C. (1978) Ecology: The Experimental Analysis of Distribution and Abundance. Harper & Row Publishers Inc., New York, USA.
  • Lele, S.R. & Keim, J.L. (2006) Weighted distributions and estimation of resource selection probability functions. Ecology, 87, 3021–3028.
  • Lele, S.R. (2009) A new method for estimation of resource selection probability function. The Journal of Wildlife Management, 71, 122–127.
  • Manly, B.F.J., McDonald, L.L., Thomas, D.L., McDonald, T.L. & Erickson, W.P. (2002) Resource Selection by Animals. Kluwer Academic Publishers, Dordrecht.
  • Marzluff, J.M., Millspaugh, J.J., Hurvitz, P. & Handcock, M.S. (2004) Relating resources to a probabilistic measure of space use: forest fragments and Steller's jays. Ecology, 85, 1411–1427.
  • Millspaugh, J.J., Nielson, R.M., McDonald, L.L., Marzluff, J.M., Gitzen, R.A., Rittenhouse, C.D., Hubbard, M.W. & Sheriff, S.L. (2006) Analysis of resource selection using utilization distributions. Journal of Wildlife Management, 70, 384–395.
  • Otis, D.L. & White, G.C. (1999) Autocorrelation of location estimates and the analysis of radiotracking data. Journal of Wildlife Management, 63, 1039–1044.
  • Paciorek, C.J. (2010) The importance of scale for spatial-confounding bias and precision of spatial regression estimators. Statistical Science, 25, 107–125.
  • R Core Team. (2012) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
  • Reich, B.J., Hodges, J.S. & Zadnik, V. (2006) Effects of residual smoothing on the posterior of the fixed effects in disease-mapping models. Biometrics, 62, 1197–1206.
  • Ribeiro, P.J. & Diggle, P.J. (2001) geoR: a package for geostatistical analysis. R-NEWS, 1, 15–18.
  • Silverman, B.W. (1986) Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.
  • Swihart, R.K. & Slade, N.A. (1985) Testing for independence of observations in animal movements. Ecology, 66, 1176–1184.
  • Venables, W.N. & Ripley, B.D. (2002) Modern Applied Statistics with S, 4th edn. Springer, New York.
  • Warton, D.I. & Shepherd, L.C. (2010) Poisson point process models solve the “pseudo-absence problem” for presence-only data in ecology. The Annals of Applied Statistics, 4, 1383–1402.
  • Wilson, R.R., Hooten, M.B., Strobel, B.N. & Shivik, J.A. (2010) Accounting for individuals, uncertainty, and multi-scale clustering in core area estimation. Journal of Wildlife Management, 74, 1343–1352.