Resource utilization function (RUF) analyses are appealing because of their simplicity, but their specific relationship to resource selection functions (RSFs) has not been described. In this paper, we attempt to reconcile RUF analysis with RSF analysis and examine several sources of potential bias in doing so. We begin by describing the RUF and RSF analyses as they are traditionally employed. We then explore several potential sources of bias that could affect RUF analyses and use simulation to demonstrate our findings. Finally, we suggest a few simple diagnostics that could be employed in selection analyses and illustrate them using a real data set pertaining to the spatial ecology of mountain lions (Puma concolor) in Colorado, USA.
Resource utilization functions
The conventional perspective in animal space use studies is that the UD is a spatial probability distribution that gives rise to a spatial point process (i.e. the observed telemetry locations). That is, one assumes there is a surface over a spatial domain ($\mathcal{S}$) of interest that specifies the likelihood (f) that an animal will occur at any given location (s) in the domain. Thus, for a finite set of times at which an animal's location is observed, say t = 1,…,T, we have a statistical model for location $\mathbf{s}_t$ where $\mathbf{s}_t \sim f(\mathbf{s})$.
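The point-process view above can be illustrated with a minimal sketch that draws telemetry-like locations from a UD discretized on a grid. The function name `sample_locations`, the grid and the bell-shaped surface are hypothetical choices made only for illustration; they are not taken from the analysis described here.

```python
import numpy as np

def sample_locations(ud, cell_centers, T, rng):
    """Draw T locations from a gridded UD.

    ud           : array of non-negative UD values, one per grid cell
    cell_centers : (m, 2) array of cell-center coordinates, same cell order as ud
    """
    p = np.asarray(ud, dtype=float).ravel()
    p = p / p.sum()                      # normalize to cell probabilities
    idx = rng.choice(p.size, size=T, p=p)
    return cell_centers[idx]

# Hypothetical UD: a single bell-shaped activity center on the unit square.
xs = np.linspace(0.0, 1.0, 20)
ys = np.linspace(0.0, 1.0, 20)
gx, gy = np.meshgrid(xs, ys)
centers = np.column_stack([gx.ravel(), gy.ravel()])
ud = np.exp(-((gx - 0.3) ** 2 + (gy - 0.6) ** 2) / (2 * 0.1 ** 2))

rng = np.random.default_rng(0)
s = sample_locations(ud, centers, T=500, rng=rng)  # (500, 2) simulated locations
```

Sampling cell centers ignores within-cell variation, which is adequate for a sketch but would coarsen locations on a sparse grid.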
The RUF procedure outlined by Marzluff et al. (2004) assumes that the probability distribution f (i.e. the UD) depends on the underlying environment X (i.e. f(s)≡f(s|X,β)) and adopts a two-stage estimation approach for the coefficients β. The first stage of the analysis is concerned with estimating the UD (with, say, $\hat{f}$), while the second stage links the estimated UD to a set of underlying covariates X.
To estimate the UD, a wide variety of density estimation techniques can be employed to find $\hat{f}$ based on the telemetry data ($\mathbf{s}_t$, for t = 1,…,T); however, we will focus on kernel density estimation (KDE), because (i) it is a commonly applied technique familiar to many animal ecologists and (ii) Marzluff et al. (2004) employed this approach in their seminal paper on the topic. It should be noted, however, that many of the following results would apply to RUFs based on any UD estimation technique.
In KDE, one takes a nonparametric approach to estimating f whereby, for any location of interest $\mathbf{s}_i$ in the spatial domain $\mathcal{S}$, the estimate of the UD is as follows:
$$\hat{f}(\mathbf{s}_i) = \frac{\sum_{t=1}^{T} k\left(\frac{s_{1,i}-s_{1,t}}{b_1}\right) k\left(\frac{s_{2,i}-s_{2,t}}{b_2}\right)}{T\, b_1 b_2} \qquad \text{(eqn 1)}$$
where $\mathbf{s}_i \equiv (s_{1,i}, s_{2,i})'$, k represents the kernel (which we assume to be Gaussian) and the parameters $b_1$ and $b_2$ are bandwidth parameters that control the diffuseness of the kernel (Venables & Ripley 2002, Chapter 5). There are various ways to choose the bandwidth parameters, and these are well described in the literature (e.g. Silverman 1986). In practice, the UD estimate, $\hat{f}(\mathbf{s}_i)$, is computed for a large but finite set of points (or grid cells, i = 1,…,m) in the spatial domain for the purposes of graphical display or further use in a RUF model.
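As a hedged illustration of the kernel estimator, the sketch below evaluates a product Gaussian kernel on a grid of m points. The names `kde_ud`, `b1` and `b2`, the simulated locations and the fixed bandwidth values are assumptions made for the example; in practice, the bandwidths would be chosen by one of the methods referenced above.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel k."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kde_ud(grid, locs, b1, b2):
    """Product-kernel estimate of the UD at each grid point.

    grid   : (m, 2) array of evaluation points
    locs   : (T, 2) array of telemetry locations
    b1, b2 : bandwidths for the first and second coordinate
    """
    grid = np.asarray(grid, dtype=float)
    locs = np.asarray(locs, dtype=float)
    T = locs.shape[0]
    u1 = (grid[:, None, 0] - locs[None, :, 0]) / b1   # (m, T) scaled offsets
    u2 = (grid[:, None, 1] - locs[None, :, 1]) / b2
    return (gaussian_kernel(u1) * gaussian_kernel(u2)).sum(axis=1) / (T * b1 * b2)

# Hypothetical telemetry locations scattered around one activity center.
rng = np.random.default_rng(1)
locs = rng.normal(loc=[0.0, 0.0], scale=[1.0, 0.5], size=(200, 2))

# Evaluation grid (m = 61 * 61 points) covering the locations.
xs = np.linspace(-6.0, 6.0, 61)
ys = np.linspace(-6.0, 6.0, 61)
gx, gy = np.meshgrid(xs, ys)
grid = np.column_stack([gx.ravel(), gy.ravel()])

f_hat = kde_ud(grid, locs, b1=0.4, b2=0.2)

# Sanity check: a density should integrate to roughly one over the domain.
cell_area = (xs[1] - xs[0]) * (ys[1] - ys[0])
total_mass = f_hat.sum() * cell_area
```

The gridded values `f_hat` are what would serve as the response in the second stage of a RUF analysis.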
At the individual level, RUF analysis provides inference about the regression coefficients β in terms of significance and possibly subset selection, thereby illuminating the potential environmental influences on space use. In a population-level analysis, where telemetry data exist for multiple individuals (say, for j = 1,…,J individuals), one would index the regression coefficients such that they are labelled $\boldsymbol{\beta}_j$ for each individual. Then, the focus shifts towards the expectation or variance in coefficient estimates among individuals; for example, we may be interested in learning about $E(\boldsymbol{\beta}_j)$ for all j = 1,…,J animals. In this latter case, the individual becomes the sample unit and the sample size J most heavily influences the uncertainty concerning $E(\boldsymbol{\beta}_j)$.
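To make the role of J concrete, the following sketch summarizes a set of individual-level coefficient estimates by their mean, among-individual standard deviation and standard error. It is a simple unweighted moment summary, which is an assumption of this example rather than a prescribed method, and the coefficient values are hypothetical.

```python
import numpy as np

def population_summary(beta_hat):
    """Summarize individual-level coefficients, treating animals as sample units.

    beta_hat : (J, p) array, one row of coefficient estimates per individual
    Returns the mean, among-individual SD and standard error of the mean.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    J = beta_hat.shape[0]
    mean = beta_hat.mean(axis=0)
    sd = beta_hat.std(axis=0, ddof=1)    # sample SD among individuals
    se = sd / np.sqrt(J)                 # shrinks as the number of animals grows
    return mean, sd, se

# Hypothetical estimates for J = 4 animals and p = 2 covariates.
beta_hat = np.array([
    [0.5, -1.0],
    [0.7, -1.2],
    [0.6, -0.8],
    [0.4, -1.0],
])
mean, sd, se = population_summary(beta_hat)
```

Because the standard error scales with the square root of J, adding telemetry fixes per animal does little to reduce population-level uncertainty compared with adding animals.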
The conventional procedure used to fit geostatistical models to continuous spatial data involves a multi-step process of first (i) fitting the linear regression model assuming independent errors, then (ii) characterizing the spatial structure in the residuals using variogram estimation (Cressie 1993), and finally (iii) using generalized (or weighted) least squares (GLS) to estimate the regression coefficients (β) while taking into account the correlated errors. Other approaches such as maximum likelihood can also be used, but for simplicity, we retain the GLS method in our simulations.
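The three steps can be sketched as follows under simplifying assumptions: an exponential variogram with no nugget, a crude grid search over candidate ranges in place of a formal variogram fit, and simulated data. All function names and parameter values are hypothetical and serve only to show how the steps connect.

```python
import numpy as np

def pairwise_dist(coords):
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def ols(X, y):
    """Step (i): least squares assuming independent errors."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def empirical_variogram(D, resid, n_bins=12):
    """Step (ii): binned semivariance of the residuals up to half the maximum distance."""
    i, j = np.triu_indices_from(D, k=1)
    h = D[i, j]
    sq = 0.5 * (resid[i] - resid[j]) ** 2
    edges = np.linspace(0.0, h.max() / 2.0, n_bins + 1)
    which = np.digitize(h, edges) - 1
    keep = [b for b in range(n_bins) if np.any(which == b)]
    centers = np.array([h[which == b].mean() for b in keep])
    gamma = np.array([sq[which == b].mean() for b in keep])
    return centers, gamma

def fit_exponential(centers, gamma, candidate_ranges):
    """Fit gamma(h) = sill * (1 - exp(-h / range)) by grid search over the range."""
    best = None
    for r in candidate_ranges:
        basis = 1.0 - np.exp(-centers / r)
        sill = (basis @ gamma) / (basis @ basis)   # least-squares sill for this range
        sse = ((gamma - sill * basis) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, sill, r)
    return best[1], best[2]

def gls(X, y, Sigma):
    """Step (iii): GLS via whitening with the Cholesky factor of Sigma."""
    L = np.linalg.cholesky(Sigma)
    return ols(np.linalg.solve(L, X), np.linalg.solve(L, y))

# Simulated data: one covariate and exponentially correlated errors.
rng = np.random.default_rng(42)
n = 150
coords = rng.uniform(0.0, 10.0, size=(n, 2))
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
D = pairwise_dist(coords)
Sigma_true = np.exp(-D / 2.0) + 1e-8 * np.eye(n)
y = X @ np.array([1.0, 2.0]) + np.linalg.cholesky(Sigma_true) @ rng.standard_normal(n)

beta_ols = ols(X, y)
centers, gamma = empirical_variogram(D, y - X @ beta_ols)
sill, range_hat = fit_exponential(centers, gamma, np.linspace(0.25, 8.0, 32))
Sigma_hat = sill * np.exp(-D / range_hat) + 1e-8 * sill * np.eye(n)
beta_gls = gls(X, y, Sigma_hat)
```

Whitening with the Cholesky factor gives the same coefficients as the closed-form GLS estimator while avoiding an explicit inverse of the covariance matrix.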