Addressing challenges when studying mobile or episodic species: hierarchical Bayes estimation of occupancy and use

Authors

  • Rua S. Mordecai,

    1. Warnell School of Forestry and Natural Resources, University of Georgia, Athens, GA 30602, USA
    • Present address: U.S. Fish & Wildlife Service, South Atlantic Landscape Conservation Cooperative, Raleigh, NC 27699–1701.

  • Brady J. Mattsson,

    Corresponding author
    1. Warnell School of Forestry and Natural Resources, University of Georgia, Athens, GA 30602, USA
      Correspondence author. E-mail: bmattsson@usgs.gov
    • U.S. Geological Survey, Patuxent Wildlife Research Center, Laurel, MD 20708, USA

  • Caleb J. Tzilkowski,

    1. National Park Service, Eastern Rivers and Mountains Network, Forest Resources Building, University Park, PA 16802, USA
  • Robert J. Cooper

    1. Warnell School of Forestry and Natural Resources, University of Georgia, Athens, GA 30602, USA


Summary

1. Understanding the distribution and ecology of episodic or mobile species requires us to address multiple potential biases, including spatial clustering of survey locations, imperfect detectability and partial availability for detection. These challenges have been addressed individually by previous modelling approaches, but there is currently no extension of the occupancy modelling framework that accounts for all three problems while estimating occupancy (ψ), availability for detection (i.e. use; θ) and detectability (P).

2. We describe a hierarchical Bayes multi-scale occupancy model that simultaneously estimates site occupancy, use, and detectability, while accounting for spatial dependence through a state-space approach based on repeated samples at multiple spatial or temporal scales. As an example application, we analyse the spatiotemporal distribution of the Louisiana waterthrush Seiurus motacilla with respect to catchment size and availability of potential prey based on data collected along Appalachian streams of southern West Virginia, USA. In spring 2009, single observers recorded detections of Louisiana waterthrush (henceforth, waterthrush) within 75 m of point-count stations (i.e. sites) during four 5-min surveys per site, with each survey broken into 1-min intervals.

3. Waterthrushes were widely distributed (ψ range: 0·6–1·0) and were regularly using (θ range: 0·4–0·6) count circles along forested mountain streams. While accounting for detection biases and spatial dependence among nearby sampling sites, waterthrushes became more common as catchment area increased, and they became more available for detection as the per cent of the benthic macroinvertebrates that were of the orders Ephemeroptera, Plecoptera or Trichoptera (EPT) increased. These results lend some support to the hypothesis that waterthrushes are influenced by instream conditions as mediated by watershed size and benthic macroinvertebrate community composition.

4.  Synthesis and applications. Although several available modelling techniques provide estimates of occupancy at one scale, hierarchical Bayes multi-scale occupancy modelling provides estimates of distribution at two scales simultaneously while accounting for detection biases and spatial dependencies. Hierarchical Bayes multi-scale occupancy models therefore hold significant potential for addressing complex conservation threats that operate at a landscape scale (e.g. climate change) and probably influence species distributions over multiple scales.

Introduction

Much ecological research seeks to understand drivers of species distributions across space and time. Examples include studies of metapopulation ecology (Hanski 1994), population viability (Beissinger & Westphal 1998), community composition and dynamics (Mordecai, Cooper & Justicia 2009; Zipkin, Dewan & Royle 2009), resource selection (MacKenzie 2006) and disease spread (Thompson 2007). Modelling distribution of species based on presence–absence data using occupancy models offers flexibility in addressing such diverse questions with relatively simple sampling designs that account for detectability (MacKenzie & Royle 2005). Understanding the distribution and ecology of episodic or mobile species, however, requires us to address multiple challenges related to sampling biases (Pollock et al. 2004; Kéry & Schmidt 2008; Kéry et al. 2009). In particular, when studying vocal species, three challenges arise: (i) individuals may be more detectable in acoustically favourable environments (Pacifici et al. 2008; Mattsson & Marshall 2009), (ii) individuals periodically become unavailable for detection within a sample unit (Farnsworth et al. 2002; Diefenbach et al. 2007; Rota et al. 2009) and (iii) spatial clustering of survey locations may induce spatial dependence among nearby points (for review see Campomizzi et al. 2008).

The first challenge (imperfect detection) can be addressed by simultaneously estimating occupancy and detection probabilities based on repeated detection/non-detection data (MacKenzie et al. 2002; Mattsson & Marshall 2009). If unaddressed, variation in detectability can produce misleading inferences regarding species distribution (Williams, Nichols & Conroy 2002; Gu & Swihart 2004). Detection bias has been recognized and addressed in several applications that investigate distributions of species (Wintle et al. 2005; O’Connell et al. 2006; Bailey et al. 2007; Kéry & Schmidt 2008).

A second challenge is that periodic unavailability for detection due to species movement or phenology may violate the closure assumption of occupancy models and may generate biased estimates of patch occupancy (Pollock et al. 2004; Kéry & Schmidt 2008). The robust design is a sampling design comprised of nested primary and secondary surveys and allows application of models that account for (i) variation in detectability during each secondary survey and (ii) violation of the closure assumption among primary surveys (Pollock 1982). In addition to providing a means to account for potential biases, the robust design offers an opportunity to distinguish occupancy at two nested scales (Fig. 2). At a coarser scale, we can estimate the probability that a site is usable (i.e. that a species may use the site), which we define here as occupancy (ψ). Given species occupancy at the coarser scale, we can then estimate the probability that a species uses a site during each primary survey, which we define here as use (θ). Taken together, this multi-scale modelling approach allows examination of species distribution at two scales simultaneously.

Figure 2.

 Comparison of example single-scale (a) and multi-scale (b) occupancy designs with either temporal or spatial replication of subsamples. A square represents a site, surveys take place at sites (here, point count circles), and ψi is the probability that site i is occupied by a species. In single-scale occupancy design (a), P is the probability of detecting that species during a subsample (e.g. minute of point count) given the site is occupied. In multi-scale occupancy design (b), θ is the probability that the species uses the site on a specific survey given the site is occupied, and P is the probability of detecting that species during a subsample given the site is used during a specific survey.

Multi-scale occupancy models may be fit to detection data collected using the robust design, and they simultaneously provide estimates of occupancy, use and detection (Mordecai 2007; Nichols et al. 2008). As such, these models may be particularly useful for investigators that are interested in examining patch occupancy across multiple primary surveys and during each of >2 primary surveys (e.g. days or weeks). Use (i.e. availability for detection) and detection are often separated when estimating local species abundance (for review see Johnson 2008), and such estimates can be provided by generalized Horvitz–Thompson estimators (Pollock et al. 2004; Diefenbach et al. 2007). In contrast, distinguishing patterns in species distribution (i.e. occupancy) from use and detection has received little attention (Mordecai 2007; Nichols et al. 2008).

The third, and perhaps the least addressed, challenge is that survey locations are often clustered and conspecifics may aggregate or occupy areas covering multiple sampling locations, which induces spatial dependence and therefore underestimation of variation among nearby sample units (Sauer, Link & Royle 2005). A solution to this dependence is to apply a random effect that references a coarser, aggregate sampling unit when predicting distribution at finer spatial scales (Royle & Dorazio 2006; Royle et al. 2007). Although this may be accomplished through maximum-likelihood estimation and linear mixed modelling, hierarchical Bayes models offer flexible and robust approaches to modelling distributions of species based on sparse detections while accounting for spatiotemporal dependencies and detectability (Royle & Dorazio 2008, pp. 106–124). Applying a hierarchical Bayes approach to multi-scale occupancy models offers a robust and extensible solution for dealing with multiple challenges of studying nested patterns of distribution or resource use by mobile or episodic species.

Here, we describe a multi-scale site occupancy model that integrates existing occupancy modelling approaches by simultaneously estimating site occupancy (ψ) and use (θ) while accounting for detectability (P) and spatial dependence through the use of random effects. In particular, this model addresses challenges to studying episodic or mobile species by employing a Bayesian state-space modelling approach and is an extension of existing multi-scale occupancy models that assume no spatial dependence among sample units (Mordecai 2007; Nichols et al. 2008). We first demonstrate that multi-scale occupancy models are a generalization of single-scale occupancy models, and then we describe sampling designs necessary to simultaneously estimate occupancy and temporal or spatial patterns of use while accounting for detectability. We then present an analysis based on bird data collected in southern West Virginia as part of a long-term monitoring programme administered by the National Park Service (NPS). In particular, we examine occupancy and temporal patterns of use by the Louisiana waterthrush Seiurus motacilla Vieillot, a riparian obligate passerine, based on catchment area and a measure of benthic macroinvertebrate community composition. Finally, we discuss the importance and potential extensions of hierarchical Bayes multi-scale occupancy modelling for addressing many questions in ecology, management and conservation biology.

Materials and methods

Study area and field protocol

As part of the NPS Inventory and Monitoring Programme in the Eastern Rivers and Mountains Network, a long-term streamside bird monitoring programme was developed, in part, to monitor the distribution of the Louisiana waterthrush (henceforth, waterthrush; Mattsson & Marshall 2010) which has been demonstrated to be an indicator of biotic integrity in headwater streams (Mattsson & Cooper 2006; Mulvihill, Newell & Latta 2008). This riparian obligate warbler consumes primarily benthic macroinvertebrates along stream margins, and their mostly linear territories typically extend 250–300 m along stream networks in this region (Mattsson et al. 2009). Watershed conditions such as catchment area, topography and land cover can affect the community composition of potential prey for waterthrushes (Klemm et al. 2002; Roy et al. 2003; King et al. 2005), which in turn may affect waterthrush distribution. Waterthrush monitoring is one element of a larger ‘Vital Signs’ monitoring effort in the network that includes monitoring of water quality and benthic macroinvertebrates (Marshall & Piekielek 2007).

From a total of 80 candidates, 28 tributary watersheds (2nd–3rd Strahler stream order; Strahler 1952) were selected for waterthrush monitoring in two National Parks (i.e. New River Gorge National River and Gauley River National Recreation Area) of southern West Virginia (37°57′ N, 81°4′ W; Fig. 1; Mattsson & Marshall 2010). These parks are characterized by steep, forested 1st–3rd order drainages that flow into larger rivers that bisect each park. Watersheds within these parks were selected using a stratified randomization based on underlying features, including watershed size, geology and land ownership. Watersheds were delineated and catchment area (i.e. the area of land that drains to a focal point in the landscape, also known as watershed area) within each watershed was estimated based on a 10-m digital elevation model (US Geological Survey 2004) using ArcGIS 9·1 (ESRI 2005). Once delineated, the number of cells that flow into any focal point within a watershed was converted into a measure of catchment area for that point. Some candidate watersheds were excluded from monitoring due to logistical limitations (e.g. safe access).

Figure 1.

 Louisiana waterthrush survey locations in New River Gorge National River (NERI) and Gauley River National Recreation Area (GARI), WV. As illustrated in the transect map at Arbuckle Creek, NERI (inset), each transect contained five point-count stations and could have included as many as four Louisiana waterthrush territories.

Within each selected watershed, a 1-km streamside transect was established within a predetermined range of catchment areas (i.e. 1–99·9 km2). As such, transects were established along reaches that were perennial and wadeable. If >1 km of stream was available, then a series of four adjacent 250-m segments were selected at random. A 1-km streamside transect was expected to contain up to four mostly linear waterthrush territories (Fig. 1), based on territory mapping of colour-banded waterthrushes in this region (Mulvihill, Newell & Latta 2008). Along each 1-km transect, a point-count station was established every 250 m, totalling five stations per transect and 140 points throughout the two parks. Detectability of waterthrush pairs is particularly high during the first month following fledging of nests (Mattsson & Cooper 2006). As such, each transect was visited twice from 23 May to 19 June 2009 to coincide with the peak period of waterthrush fledgling care. Transect visits were 4–20 days apart for any given transect.

During each visit day, one of four observers traversed a transect twice (i.e. upstream and downstream), conducted 5-min point counts at each station during both passes, and recorded per-minute detections (aural or visual) of waterthrush adults or young within an estimated 75 m of the point-count station. This resulted in two levels of temporal replication (Fig. 2). Before conducting any transect surveys, observers underwent ≥5 days of training that focused on improving accuracy in estimating distances to waterthrushes within 75 m using both aural and visual cues. Each point-count station was therefore sampled four times (four passes over two days), and there were five subsamples (five 1-min intervals) per sample (i.e. 4 × 5 = 20 subsamples per point throughout the season). Due to travel time between sites and the limited daily period of waterthrush vocal activity, it was more reasonable for an observer to conduct two passes per transect visit than it would have been for that observer to conduct surveys along two transects per day. To account for observer variability with respect to detection of waterthrushes while minimizing the number of days between visits, one observer conducted counts on both passes along a transect on a given day, and another observer conducted counts on both passes along a transect on the second day.

Early spring is the season when benthic macroinvertebrate communities are typically most diverse (Huryn, Wallace & Anderson 2008); consequently, benthic macroinvertebrates were sampled from 26 of 28 transects during March 2009. This period also coincides with waterthrush territory establishment in the region (Mattsson et al. 2009). The benthic macroinvertebrate sampling protocol was based on methods developed for the US Geological Survey (Moulton et al. 2000, 2002). For more details on sampling methods, see Tzilkowski, Weber & Ferreri (2009). Substrate disturbance sampling, with a 0·25 m2 template and Slack sampler (500 μm mesh), was used to collect subsamples from five riffles throughout each transect. Stream conditions (i.e. substrate, water velocity and depth) were measured and kept consistent among riffle subsamples. These subsamples were composited into one sample for each stream transect, preserved in 95% ethanol and transported to the laboratory. Fixed-count subsamples of 240–360 individuals were identified to genus for all taxa, except for chironomid midges and oligochaete worms, using standard dichotomous keys (Peckarsky et al. 1990; Merritt, Cummins & Berg 2008). For the analysis, the percentage of individuals belonging to the insect orders Ephemeroptera, Plecoptera or Trichoptera (henceforth, % EPT) was calculated for each sample, as this metric is related to waterthrush distribution in other parts of its range (Mattsson & Cooper 2006).

Hierarchical Bayes multi-scale occupancy model

We first illustrate how a single-season, single-scale occupancy model (MacKenzie et al. 2002) is generalized to a single-season, multi-scale occupancy model (Mordecai 2007; Nichols et al. 2008) based on the sampling design for examining waterthrush distribution patterns. In doing so, we largely follow the theory and notation of MacKenzie et al. (2002). Suppose that each transect were visited during only a single day, and a 5-min survey was repeated twice per day at each point-count station (i.e. site) following a temporal replication sampling design (Fig. 2). A single-season occupancy model could then be applied to estimate the probability that a waterthrush occupied a site that day (ψ) and the minute-by-minute probability of detecting the waterthrush (P), given the site is occupied. For a given site-visit day, using 1 to denote a detection and 0 a non-detection to create a detection history, if we only detected a waterthrush during the third minute of the first pass (i.e. a detection history of 00100 00000), then we could conclude that the species occupied the site. Alternatively, if we never detected a waterthrush at the site (i.e. 00000 00000), then either (i) the site was occupied but the species was not detected or (ii) the site was not occupied. As such, minute-by-minute detection/non-detection data provide information to estimate both ψ and P.
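The two cases above can be made concrete with a short likelihood calculation. The following is a minimal Python sketch, assuming constant ψ and P across sites and minutes; the function name and the parameter values are hypothetical and chosen only for illustration.

```python
def single_scale_likelihood(history, psi, p):
    """Probability of one site's detection history under a single-scale model.

    history: list of 0/1 outcomes, one per subsample (here, ten 1-min intervals).
    psi: probability that the site is occupied (assumed constant).
    p: per-subsample detection probability given occupancy (assumed constant).
    """
    # Probability of the observed sequence if the site is occupied.
    occupied_term = 1.0
    for y in history:
        occupied_term *= p if y == 1 else (1.0 - p)
    # An unoccupied site can only produce an all-zero history.
    unoccupied_term = 1.0 if not any(history) else 0.0
    return psi * occupied_term + (1.0 - psi) * unoccupied_term


# Detection only in the third minute of the first pass (00100 00000).
h_detected = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
# No detections on either pass (00000 00000).
h_empty = [0] * 10

print(single_scale_likelihood(h_detected, psi=0.8, p=0.3))
print(single_scale_likelihood(h_empty, psi=0.8, p=0.3))
```

For the all-zero history, both the "occupied but missed" and "not occupied" terms contribute, which is why repeated subsamples are needed to separate ψ from P.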

In reality, however, transects were visited on two different days (Fig. 3a). A multi-scale occupancy model can therefore be used to estimate (i) the probability that a waterthrush occupied a site (ψ) at least once from the start of the first survey to the end of the final survey of that site, (ii) the probability of use (θ) by a waterthrush during an individual point-count survey given the site is occupied and (iii) the probability of detecting a waterthrush (P) during an individual survey given the site was used. For example, if we only detect a waterthrush on the third minute of the first survey (00100 00000 00000 00000), then we could assert that the species: (i) occupied the site, (ii) used the site during the first survey and (iii) either used the site during the subsequent surveys and was not detected or did not use the site during these subsequent surveys. Alternatively, suppose no individuals were detected during any survey of the site (i.e. 00000 00000 00000 00000). In this case, a waterthrush either: (i) did not occupy the site; (ii) occupied but did not use the site (i.e. species was unavailable for detection) during any survey or (iii) was not detected despite occupying and using the site during either survey. Minute-by-minute detection/non-detection data during surveys that are repeated on multiple days therefore provide information to estimate not only ψ and P, but also θ. Note that a single-scale occupancy model may also be fit to such a data set, and its performance could be directly compared with that of a multi-scale occupancy model. This is analogous to the case where single-season and dynamic occupancy models can be fit to the same data set (MacKenzie et al. 2003).
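The three-way interpretation of each detection history can be written as a likelihood in which use is nested within occupancy. The Python sketch below assumes constant ψ, θ and P and uses hypothetical names and parameter values; it is an illustration of the structure described above, not the code used in the analysis.

```python
def multi_scale_likelihood(history, psi, theta, p):
    """Probability of one site's detection history under a multi-scale model.

    history: list of surveys, each a list of 0/1 subsample outcomes.
    psi: probability of occupancy; theta: probability of use during a survey
    given occupancy; p: per-subsample detection probability given use.
    All three are assumed constant here for simplicity.
    """
    occupied_term = 1.0
    for survey in history:
        # Probability of this survey's outcomes if the site was used.
        used_term = 1.0
        for y in survey:
            used_term *= p if y == 1 else (1.0 - p)
        # If the site was not used, only an all-zero survey is possible.
        unused_term = 1.0 if not any(survey) else 0.0
        occupied_term *= theta * used_term + (1.0 - theta) * unused_term
    # An unoccupied site can only produce an all-zero history.
    all_zero = not any(any(survey) for survey in history)
    return psi * occupied_term + (1.0 - psi) * (1.0 if all_zero else 0.0)


# 00100 00000 00000 00000
h_one_detection = [[0, 0, 1, 0, 0], [0] * 5, [0] * 5, [0] * 5]
# 00000 00000 00000 00000
h_no_detection = [[0] * 5 for _ in range(4)]

print(multi_scale_likelihood(h_one_detection, psi=0.8, theta=0.5, p=0.3))
print(multi_scale_likelihood(h_no_detection, psi=0.8, theta=0.5, p=0.3))
```

Setting θ = 1 removes the use level and returns the single-scale likelihood over all 20 subsamples, which is one way to see that the multi-scale model generalizes the single-scale model.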

Figure 3.

 Effect of catchment area and % EPT (in benthic macroinvertebrate community) on probability of occupancy (a) and of daily use (b) by the Louisiana waterthrush within 75-m radius count circles along streams in two national parks of southern West Virginia during spring of 2009. Per cent EPT was held at 70% for panel (a), and catchment area was held at 20 km2 in panel (b). Dashed lines represent 95% BCIs.

We formulated the multi-scale occupancy model as a state-space model (Royle & Kéry 2007) that comprises two submodels, including a state process model for the latent or partially observed processes of use and occupancy, and an observation model for the repeated detections themselves. The state process model is composed of two equations, starting with the binary site occupancy state:

image

followed by the binary use state, which is conditional on the respective site occupancy state:

image

where, under a temporal replication sampling design (Fig. 3a), i indexes the N sites and j indexes the V surveys. Therefore, a species occupies a site according to a Bernoulli trial with parameter ψ, and the species uses (i.e. is available for detection at) the site during a survey according to another Bernoulli trial with parameter θ. The observation model, which is conditional on the state of use, is denoted as follows:

image

where k indexes S subsamples, y is a three-dimensional array of 1’s or 0’s representing detections or non-detections of a species for each site-survey-subsample combination, and P is the corresponding three-dimensional array of detection probabilities for each site-survey-subsample combination. Thus, if the species uses an occupied site during a survey, then the species is detected during that survey according to a Bernoulli trial with parameter P.
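One way to write the three Bernoulli statements described above is sketched below. The latent-state symbols z (occupancy) and u (use) are assumed names that the surrounding text does not define; the parameters ψ, θ and P, the data array y, and the indices i, j and k follow the definitions given in the text.

```latex
\begin{aligned}
z_i &\sim \mathrm{Bernoulli}(\psi_i), & i &= 1, \dots, N \\
u_{ij} \mid z_i &\sim \mathrm{Bernoulli}(z_i \, \theta_{ij}), & j &= 1, \dots, V \\
y_{ijk} \mid u_{ij} &\sim \mathrm{Bernoulli}(u_{ij} \, P_{ijk}), & k &= 1, \dots, S
\end{aligned}
```

Multiplying each success probability by the parent state enforces the nesting: use is impossible at an unoccupied site, and detection is impossible during a survey in which the site is not used.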

To demonstrate how sampling design dictates interpretations of occupancy, use, and detectability, we refer again to the waterthrush sampling design where N = 140 point-count stations are surveyed during V = 4 surveys, and detection/non-detection data are collected during S = 5 successive 1-min counts during each survey. In this case, parameters could be interpreted as follows: (i) occupancy (ψ) is the probability that a site is usable, i.e. that a waterthrush may use the site; (ii) use (θ) is the probability that a waterthrush uses the site by vocalizing during a survey given a site is occupied; and (iii) detection (P) is the probability of detecting a waterthrush during a survey given that a waterthrush uses the site that day. It is therefore possible for a waterthrush to occupy a site but not use that site during the four surveys.
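To illustrate how the nested states generate data under this design, the following Python sketch simulates a detection array with 140 sites, 4 surveys and 5 subsamples. The function name, the seed and the values of ψ, θ and P are hypothetical and are not estimates from the study; covariates and random effects are omitted.

```python
import random


def simulate_detections(n_sites, n_surveys, n_subsamples, psi, theta, p, seed=1):
    """Simulate a nested list y[i][j][k] of 0/1 detections.

    Occupancy is drawn once per site, use once per survey at occupied sites,
    and detection once per subsample during surveys in which the site is used.
    """
    rng = random.Random(seed)
    y = []
    for _ in range(n_sites):
        occupied = rng.random() < psi
        site = []
        for _ in range(n_surveys):
            used = occupied and (rng.random() < theta)
            survey = [int(used and (rng.random() < p)) for _ in range(n_subsamples)]
            site.append(survey)
        y.append(site)
    return y


# Dimensions follow the waterthrush design; parameter values are illustrative only.
y = simulate_detections(140, 4, 5, psi=0.8, theta=0.5, p=0.4)
sites_with_detection = sum(1 for site in y if any(any(s) for s in site))
print(sites_with_detection, "of", len(y), "simulated sites had at least one detection")
```

Because a site can be occupied yet unused during all four surveys, the share of simulated sites with a detection will tend to fall below ψ, which is the bias the model is designed to correct.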

This design, where the replication is temporal (Fig. 3a), focuses on the frequency with which a waterthrush uses a site (or is available for detection). It is important to note that the model structure is easily adaptable for questions focused on spatial patterns in addition to temporal patterns of use among plots within sites (Fig. 3b). In particular, V would instead represent the number of plots per site, and S would represent the number of temporally replicated surveys per plot. Interpretations of occupancy, use, and detection are therefore contingent on the sampling design (MacKenzie & Royle 2005). Spatial replication, however, may introduce Markovian dependence due to animal movements and require approaches that accommodate such dependence (Hines et al. 2010).

Covariates and missing data can be easily incorporated into the state-space multi-scale occupancy model as they have for other state-space occupancy models (Royle & Dorazio 2008). Effects of site-level covariates on ψ, θ, and P, patch or survey-level covariates on θ and P, and subsample-level covariates on P can be modelled using the logit transformation, where Y is the response parameter of interest (i.e. either ψ, θ or P), X is the covariate information and B is the vector of logistic model coefficients for estimation:

$$\operatorname{logit}(Y) = XB$$

Under some standard sample designs, sites are nested within larger spatial units (henceforth, aggregates) to improve sampling efficiency or accommodate logistical constraints (Bibby & Burgess 2000; Sauer, Fallon & Johnson 2003; Newson et al. 2008). Spatial dependence due to nested or clustered distribution of species among nearby sites, unless taken into account, may yield biased estimates of distribution (for review see Dormann et al. 2007). In addition to covariates, a random intercept for aggregates (e.g. transects each comprised of multiple survey sites; β0i) can be incorporated into the model to account for this dependence:

image

where a indexes aggregates, i indexes sites and the model may contain any number of fixed effects, indexed by r. In the hierarchical Bayes analysis, prior distributions are defined such that aggregate intercept values share a common mean and variance (Royle & Dorazio 2006; Howell, Peterson & Conroy 2008):

image

This variance then represents variation among aggregates or the level of spatial dependence.
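A sketch of the random-intercept structure described above is given below. The subscript a(i) for the aggregate containing site i, the number of fixed effects R, and the hyperparameter symbols for the common mean and variance are assumed notation; Y, X, the coefficients β and the indices a, i and r follow the text.

```latex
\operatorname{logit}(Y_i) = \beta_{0a(i)} + \sum_{r=1}^{R} \beta_r X_{ri},
\qquad
\beta_{0a} \sim \mathrm{Normal}\!\left(\mu_{\beta_0}, \sigma^2_{\beta_0}\right)
```

Under this form, a variance near zero implies that all aggregates share nearly the same intercept (little spatial dependence), whereas a large variance implies that sites within the same aggregate resemble each other much more than sites in different aggregates.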

Model assumptions

Obtaining accurate estimates via the multi-scale occupancy model presented here requires important assumptions. Unlike single-scale, single-season occupancy models (MacKenzie et al. 2002), multi-scale occupancy models allow for the possibility that species become occasionally unavailable for detection at a site. For example, a waterthrush may move into or out of a count circle as it passes along its streamside territory. First, similar to dynamic occupancy models (MacKenzie et al. 2003), multi-scale occupancy models assume that sites are closed (i.e. availability for detection remains constant) during each primary survey. For example, waterthrushes do not move into or out of a count circle throughout a 5-min point-count survey. Secondly, species are identified correctly upon detection, or no species are misidentified. This assumption may be relaxed to account for false positives, as it has been for single-scale occupancy modelling (Royle & Link 2006). Thirdly, covariates must be included to account for any detection biases, such as differences among observers, temporal variation in species perceptibility (e.g. singing rates), sampling effort and environmental conditions (Mattsson & Marshall 2009). Finally, parameters must be incorporated to account for any dependencies of detections among sites (e.g. spatially clustered or large territories), surveys (e.g. temporally clustered availability for detection) or subsamples (e.g. observer expectation bias or temporally clustered availability for detection).

Waterthrush analysis

We investigated patterns of waterthrush occupancy and use along tributaries by fitting a hierarchical Bayes multi-scale occupancy model to temporally replicated detection/non-detection data (Fig. 2). We assume that use (θ) represents availability for detection during a point count, as waterthrushes move along their ca. 250-m territories throughout the day. Waterthrushes are thus only available for detection when they are present within the 75-m detection radius (henceforth, count circle) during a 5-min count. Therefore, occupancy (ψ) is the probability that a point is occupied by ≥1 waterthrushes at least once during the study period, θ is the probability that ≥1 waterthrushes use the count circle during a particular pass given the point is occupied, and P is the probability of detecting a waterthrush during one of the 5 min given they use that count circle on a particular pass. A count circle, therefore, is a site in a traditional, single-season occupancy model.

Occupancy, use, and detectability of waterthrushes in count circles may depend on watershed-scale attributes, including local waterthrush territory density, and this dependence would manifest throughout a transect (Fig. 1). With this in mind, we applied the hierarchical Bayes multi-scale occupancy model that included a random intercept for variation among transects with respect to count-circle ψ and θ. We also assumed that sources of variation in detectability of a waterthrush would be accounted for when considering observer-specific attributes (Mattsson & Marshall 2009) and previous detection of a waterthrush. Detections of waterthrushes may not be independent during a 5-min count due to an expectation bias of observers and/or temporally clustered singing activity of a waterthrush (Riddle et al. 2010). We therefore included in the model fixed categorical covariates for observer (Obs) and detection during the previous minute (Prev). We also included as predictors of ψ, θ and P fixed effects for catchment area and for % EPT. We predicted that ψ and θ would increase with increasing catchment area and % EPT, as these are both expected to correspond directly with waterthrush food availability (Klemm et al. 2002; Mattsson et al. 2009). Catchment area and % EPT had weak, if any, correlation (–0·19). As such, we developed a series of logistic regression equations to estimate the level of spatial dependence and relationships between covariates and response parameters of the multi-scale occupancy model:

image
image
image

where i indexes sites, j indexes surveys within sites, k indexes subsamples within surveys, Obs indexes observers, and Prev is a binary variable for previous detection. Parameters α0a and β0a represent random intercepts that account for transect-level spatial dependence when modelling occupancy and use, respectively, and δ0 represents the fixed intercept for detectability. Parameters α1, β1, and σ1 represent slopes for the effect of catchment area, whereas α2, β2, and σ2 represent slopes for the effect of % EPT, on occupancy, use and detectability, respectively. For modelling detectability, σ3,Obs and σ4 represent the slopes for the effect of observer and previous detection, respectively.
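Based on the parameter definitions above, the three regression equations can be sketched as follows. The covariate labels Area and EPT and the subscript a(i) for the transect containing site i are assumed notation; the text does not state whether covariates were standardized before fitting, so no scaling is shown.

```latex
\begin{aligned}
\operatorname{logit}(\psi_i) &= \alpha_{0a(i)} + \alpha_1 \, \mathrm{Area}_i + \alpha_2 \, \mathrm{EPT}_i \\
\operatorname{logit}(\theta_{ij}) &= \beta_{0a(i)} + \beta_1 \, \mathrm{Area}_i + \beta_2 \, \mathrm{EPT}_i \\
\operatorname{logit}(P_{ijk}) &= \delta_0 + \sigma_1 \, \mathrm{Area}_i + \sigma_2 \, \mathrm{EPT}_i + \sigma_{3,\mathrm{Obs}} + \sigma_4 \, \mathrm{Prev}_{ijk}
\end{aligned}
```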

We fitted the model to the waterthrush data in WinBUGS version 1·4 (Spiegelhalter et al. 2003), which uses a Markov chain Monte Carlo (MCMC) algorithm. R and WinBUGS code for fitting the model is provided in Appendix S1 (Supporting information), and simplified code (i.e. without covariates) for running the hierarchical Bayes multi-scale occupancy model in WinBUGS is provided in Appendix S2 (Supporting information). We chose a relatively uninformative normal prior with a logit-scale mean of 0 (i.e. 0·5 on the probability scale) and standard deviation of 1·58 (i.e. 0·83 on the probability scale and a precision of 0·4 on the logit scale) for the mean of the random intercepts for ψ and θ. Unlike a uniform prior on the logit scale, this normal prior results in an approximately uniform distribution from 0 to 1 on the probability scale. We also used this uninformative normal prior for the remaining parameters, except that we used a uniform prior of 0·001–10 on the logit scale for the standard deviation of random intercepts (SDRs) for ψ and θ. For the latter, the lower and upper bounds represent no spatial dependence and strong spatial dependence, respectively. To improve mixing of the MCMC algorithm, we truncated the normal priors from −10 to 10, disallowed values below 1E−5 or above 1 − 1E−5 for probabilities, and disallowed values below 1E−3 for standard deviations. We exported model output from WinBUGS to programme R (R Development Core Team 2006) and assessed convergence using the default values for the Raftery–Lewis test implemented in the R package BOA, which is based on a single chain of MCMC iterations (Raftery & Lewis 1992a,b; Smith 2007). Inferences regarding effect sizes and direction were based on posterior means and 95% Bayesian Credible Intervals (BCIs; 2·5th–97·5th percentile of the distribution), and parameter estimates are reported as mean with BCI in square brackets. In particular, if the BCI surrounding a slope estimate did not include zero, then we interpreted this as a statistically significant (henceforth, significant) effect.
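The properties claimed for the normal prior can be checked numerically. The Python sketch below is only an illustration and assumes nothing beyond the stated prior (mean 0, standard deviation 1·58, truncated to −10 to 10 on the logit scale); the variable names, seed and number of draws are arbitrary choices. It converts the standard deviation to a precision, back-transforms one standard deviation to the probability scale, and tabulates back-transformed prior draws in ten equal-width bins.

```python
import math
import random


def inv_logit(x):
    """Back-transform a logit-scale value to the probability scale."""
    return 1.0 / (1.0 + math.exp(-x))


sd = 1.58
precision = 1.0 / sd ** 2        # precision is the reciprocal of the variance
upper_1sd = inv_logit(sd)        # one SD above the mean, on the probability scale

# Draw from the normal prior, keep draws inside the truncation bounds,
# and back-transform them to probabilities.
rng = random.Random(42)
draws = []
while len(draws) < 50_000:
    x = rng.gauss(0.0, sd)
    if -10.0 <= x <= 10.0:
        draws.append(inv_logit(x))

# Share of draws in each of ten equal-width probability bins;
# an exactly uniform prior would place 0.1 in every bin.
counts = [0] * 10
for d in draws:
    counts[min(int(d * 10), 9)] += 1
shares = [c / len(draws) for c in counts]
```

Under these assumptions, the shares stay close to 0·1 in the central bins and fall somewhat below it in the two outermost bins, which is consistent with describing the prior as approximately, rather than exactly, uniform on the probability scale.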

Results

Waterthrushes were detected in 70 of 140 point-count circles for a naïve occupancy estimate of 0·500, and waterthrushes were detected during 99 of 280 surveys in these 70 count circles for a naïve use estimate of 0·353. Catchment areas ranged from 1·08 to 74 km², benthic macroinvertebrate densities ranged from 436 to 14 469 individuals m⁻², and % EPT ranged from 0·7 to 90·0%. The hierarchical model, when fit to these data, converged after 2·5 million MCMC iterations following 100 000 discarded (i.e. burn-in) iterations. Based on estimates from this model, waterthrush occupancy of count circles increased on average from 59 to 100% across the range of catchment areas while holding % EPT at the mean value across sites (i.e. 70%), and this effect was significant (α1 = 1·775 [0·445, 3·745]; Fig. 3). Likewise, use increased on average from 35 to 66% across the range of % EPT while holding catchment area at the mean value across sites (i.e. 20 km²), and the credible interval was almost entirely above zero (β2 = 1·620 [−0·106, 3·527]; Fig. 3). Minute-specific detectability of waterthrushes increased with each 1-ha increase in catchment size, and the credible interval was almost entirely above zero (σ1 = 0·169 [−0·004, 0·330]; Fig. 4). Detectability more than doubled when a waterthrush was detected during the previous minute compared to when no waterthrushes were detected previously (Fig. 4). Catchment area had weak, if any, effect on waterthrush daily use, and % EPT had weak, if any, effects on waterthrush occupancy or detection. Based on posterior distributions for SDRs in the model, spatial dependence among count circles within transects was evident with respect to occupancy and use (Fig. 5). The mode of SDR for occupancy (0·2) was less than that for use (0·8).
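The interval-based rule for significance can be applied directly to the three slope estimates reported above. The Python sketch below is illustrative only: the helper names are hypothetical, and the percentile helper assumes that a list of posterior draws is available (none are reproduced here), using `statistics.quantiles` from the Python standard library to obtain the 2·5th and 97·5th percentiles.

```python
import statistics


def percentile_interval(draws):
    """Return the 2.5th and 97.5th percentiles of a list of posterior draws."""
    cuts = statistics.quantiles(draws, n=40)   # cut points at 2.5%, 5%, ..., 97.5%
    return cuts[0], cuts[-1]


def interval_excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


# 95% BCIs as reported in the text (lower, upper).
reported = {
    "alpha1 (catchment area on occupancy)": (0.445, 3.745),
    "beta2 (% EPT on use)": (-0.106, 3.527),
    "sigma1 (catchment area on detectability)": (-0.004, 0.330),
}

significant = {
    name: interval_excludes_zero(lower, upper)
    for name, (lower, upper) in reported.items()
}
```

Under this rule only the catchment-area effect on occupancy qualifies as significant; the other two intervals include zero, which matches their description as almost entirely above zero.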

Figure 4.

 Per-minute detectability with (a) and without (b) prior detection of the Louisiana waterthrush within 75-m radius count circles along streams in two National Parks of southern West Virginia during spring of 2009. Whiskers represent 95% BCIs.

Figure 5.

 Spatial dependence among count circles within transects with respect to count-circle occupancy and survey-specific use by the Louisiana waterthrush along streams of two National Parks in southern West Virginia during spring 2009. The histogram shows the prior probability density, and the curves show the posterior probability distributions for the logit-scale standard deviation of random intercepts (SDR) for transect. Vertical dotted line at 0 standard deviation indicates the expectation if there were no spatial dependence of occupancy or use.

Discussion

Questions about species distribution and resource use are central to many ecological studies and their application for management and conservation. Hierarchical Bayes multi-scale occupancy modelling is an extension of existing approaches that model occupancy at a single scale while accounting for detectability and/or spatial dependence (MacKenzie et al. 2002; Royle & Dorazio 2006; Mordecai 2007; Nichols et al. 2008). Specifically, hierarchical Bayes multi-scale occupancy modelling allows inference regarding species occurrence at two different spatial or temporal scales while enabling incorporation of random effects that account for nested sampling designs (e.g. count circles along transects). At two different temporal scales, investigators can model both the probability of species occurrence at a site during the study period and the frequency of species occurrence (i.e. use of or availability for detection) at that site while accounting for detectability and spatial dependence among nearby sites. Alternatively, at two different spatial scales, investigators can model both the probability of species occurrence in an area (e.g. management unit) and, if the species occurs in that area, the probability of species occurrence in smaller regions nested within that area (e.g. stands within the management unit) while accounting for spatial dependence among adjacent management units.

An alternative, common approach to studying species distribution at multiple scales is radiotelemetry (e.g. Michalski et al. 2006; Matson et al. 2007; Rittenhouse & Semlitsch 2007). Although radiotelemetry can provide data on both spatial and temporal patterns in use by individual animals, it is often logistically challenging and expensive to obtain a sufficient sample size to detect differences in use among habitats (Murray 2006). In contrast, detection–nondetection data for highly visible or audible species tend to be inexpensive and easy to collect; thus, conducting repeated surveys of unmarked animals may be more efficient than conducting telemetry for studying patterns of resource use by many species.

Based on our analysis and rather simple sampling design, waterthrushes were not only widely distributed (ψ > 0·6) but were also often using (θ > 0·4) riparian areas along forested mountain tributaries. While accounting for detection biases and spatial dependence among nearby sampling sites, waterthrushes became more common as catchment area increased. Community composition of instream prey varies with watershed size (Klemm et al. 2002; Roy et al. 2003; King et al. 2005), and waterthrushes may be attracted to assemblages of benthic macroinvertebrates found in larger watersheds. Daily use probability increased with increasing % EPT, and this result does not refute the hypothesis that waterthrushes are indicators of benthic macroinvertebrate community composition (Mattsson & Cooper 2006; Mulvihill, Newell & Latta 2008). Until additional macroinvertebrate metrics, water quality metrics (e.g. pH) or sampling methods are explored, catchment area appears to be a more important driver of waterthrush distribution, whereas % EPT is associated with waterthrush availability for detection.

With respect to patterns of detection, waterthrushes that were using an area during a count were easier to detect if they were also detected earlier in that count. While the specific reason for the increased detectability is unclear, observer knowledge of the previous location of individual waterthrushes may have increased the rate of redetection. Alternatively, waterthrushes may have multi-minute bouts of vocalization, yielding clumped minute-by-minute detections. Whatever the mechanism, increased detection probability after a prior detection may be common in bird surveys, and not accounting for this dependence can result in a downward bias in occupancy estimators (Riddle et al. 2010).

Multi-scale perspective of distribution estimates

In traditional occupancy modelling, use and detection probabilities are combined into a single parameter, P. There are many ecological problems, however, where separating use and detection would be particularly important. One common situation, as illustrated by the waterthrush analysis, involves an animal that is absent from large portions of its home range at any given time. In this case, an investigator can apply multi-scale occupancy models to investigate patterns of occupancy within patches of home ranges in addition to frequency of use for those patches. Furthermore, modelling ψ, θ and P in a hierarchical framework becomes particularly informative when, as indicated by the waterthrush example, P < 1 and variation exists among coarser-scale sampling units (i.e. streamside transects) with respect to both occupancy and use.
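The reason subsamples within surveys are needed to separate use from detection can be seen in a small calculation. The Python sketch below is a simplification under stated assumptions: θ and P are constant, and subsamples within a survey are treated as independent (so the previous-detection effect in the waterthrush model is ignored). The function names and parameter values are hypothetical.

```python
def prob_detected_in_survey(theta, p, n_subsamples):
    """Pr(at least one detection in a survey | site occupied).

    The site must be used (theta) and the species must then be detected
    in at least one of the subsamples, assumed independent.
    """
    return theta * (1.0 - (1.0 - p) ** n_subsamples)


def prob_detected_at_site(psi, theta, p, n_surveys, n_subsamples):
    """Pr(at least one detection at a site across all surveys)."""
    per_survey = prob_detected_in_survey(theta, p, n_subsamples)
    return psi * (1.0 - (1.0 - per_survey) ** n_surveys)


# Two hypothetical scenarios with the same product theta * p.
high_use = dict(theta=0.8, p=0.4)
low_use = dict(theta=0.4, p=0.8)

# With one subsample per survey, only the product theta * p matters,
# so the two scenarios cannot be told apart.
single_a = prob_detected_in_survey(n_subsamples=1, **high_use)
single_b = prob_detected_in_survey(n_subsamples=1, **low_use)

# With five subsamples per survey, the scenarios give different probabilities.
multi_a = prob_detected_in_survey(n_subsamples=5, **high_use)
multi_b = prob_detected_in_survey(n_subsamples=5, **low_use)

# Site-level probability for a design with four surveys of five subsamples.
site_level = prob_detected_at_site(0.6, 0.4, 0.5, n_surveys=4, n_subsamples=5)
```

With a single subsample per survey, the two scenarios are indistinguishable, which corresponds to use and detection being absorbed into one parameter; with repeated subsamples they diverge, which is what allows θ and P to be estimated separately.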

Modelling θ and P could also be useful for revealing contrasting patterns in use and detection in relation to a factor of interest. For example, many bird species occasionally use non-forested habitat but spend a majority of their time in forested habitat (Lent & Capen 1995; Annand & Thompson 1997; Mordecai, Cooper & Justicia 2009). When these facultative species occupy non-forested habitat, they may be easier to detect due to the reduced visual and auditory obstructions in an open area. However, the probability that these species use non-forest habitat at a given time is lower, because they spend less time in that habitat. While multi-scale and single-scale occupancy models both predict occupancy rates for each habitat, multi-scale occupancy models could also estimate the negative trend in use and positive trend in detectability associated with non-forested habitat. Therefore, multi-scale occupancy models distinguish between parameters that are typically of ecological interest, namely occupancy (i.e. ψ) and use (i.e. θ), and a parameter that is generally estimated only to account for how readily observers perceive the species (i.e. P; Johnson 2008).

Multi-scale occupancy models provide estimates of distribution at two scales (i.e. ψ and θ), and these estimates are subject to an important assumption about spatial independence. Using the state-space framework, spatial dependence may be addressed by adding to the process model a random effect that indexes coarser spatial units. As shown in the waterthrush analysis, species distributions may be clustered such that nested sampling designs (e.g. point-transects) warrant inclusion of this random effect to account for spatial dependence.
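The role of the transect-level random effect can be sketched as follows. This Python example is an illustration rather than the fitted model: it assumes that each transect receives an intercept drawn from a normal distribution on the logit scale, and the number of transects, mean intercept, seed and function names are hypothetical. The two standard deviations contrast a value at the lower bound of the SDR prior (0·001) with a larger value.

```python
import math
import random


def inv_logit(x):
    """Back-transform a logit-scale value to the probability scale."""
    return 1.0 / (1.0 + math.exp(-x))


def transect_occupancy(n_transects, mean_logit, sdr, seed=7):
    """Draw one logit-scale intercept per transect and return occupancy probabilities.

    All sites on a transect share that transect's intercept, which is what
    induces dependence among sites within the same transect.
    """
    rng = random.Random(seed)
    return [inv_logit(rng.gauss(mean_logit, sdr)) for _ in range(n_transects)]


def spread(values):
    """Range of a list of probabilities."""
    return max(values) - min(values)


# Near-zero SDR: transects are essentially interchangeable (no spatial dependence).
weak = transect_occupancy(10, mean_logit=0.0, sdr=0.001)

# Larger SDR: occupancy differs markedly among transects (clustered distribution).
strong = transect_occupancy(10, mean_logit=0.0, sdr=2.0)
```

When the standard deviation is close to zero, all transects share nearly the same occupancy probability; as it grows, sites on the same transect resemble each other more than sites on different transects, which is the clustering that the random effect is meant to absorb.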

Comparing multi-scale and dynamic occupancy models

Use may be estimated based on the robust design under two alternative modelling approaches. First, dynamic occupancy models explicitly account for transitions in patch occupancy between successive primary surveys such as months, seasons or years (MacKenzie et al. 2003; Rota et al. 2009). Second, multi-scale occupancy models account for the possibility that an occupied patch may be periodically unused by providing estimates of (i) patch occupancy across primary surveys, (ii) use of occupied patches (i.e. availability for detection of at least one individual) during each primary survey and (iii) detectability of species within used patches during secondary surveys (Mordecai 2007; Nichols et al. 2008). Patch occupancy during each primary survey under the dynamic-occupancy modelling approach is analogous to the use parameter in multi-scale occupancy models. Therefore, initial occupancy in a dynamic occupancy model is the probability of use during the first season. The two approaches, in fact, provide identical estimates for use of occupied patches under a random immigration model where immigration and emigration are equal.

In contrast with dynamic occupancy models that allow estimation of use for a single scale of sampling units, multi-scale occupancy modelling allows estimation of species distribution at two nested temporal and/or spatial scales. As such, dynamic occupancy models provide a parameter for ‘seasonal’ use (θ) but have no parameter for ‘cross-seasonal’ occupancy (ψ) as defined in multi-scale occupancy models. Researchers interested in examining immigration or emigration between consecutive surveys (e.g. monthly, seasonal or annual) may be better served by the dynamic occupancy model (MacKenzie et al. 2003), whereas investigators who are interested in examining patterns of species distribution at multiple nested scales would be better served by the multi-scale occupancy model described here and elsewhere (Mordecai 2007; Nichols et al. 2008).

Extending hierarchical multi-scale occupancy models

Hierarchical Bayes multi-scale occupancy models can be expanded in numerous ways. Current extensions to single-season occupancy models such as species interactions (MacKenzie, Bailey & Nichols 2004), community-level metrics (Dorazio & Royle 2005; Dorazio et al. 2006), dynamic occupancy models (MacKenzie et al. 2003), and false positives (Royle & Link 2006) could all be applied to multi-scale occupancy models. Additionally, probabilities of occupancy, use and detection at any scale could be estimated with double-observer sampling (Cook & Jacobson 1979), removal models (Moran 1951; Seber 1982) or distance sampling (Reynolds, Scott & Nussbaum 1980; Buckland, Burnham & Laake 1993). In conclusion, hierarchical Bayes multi-scale occupancy models have many potential applications and extensions for studying the distribution and resource use patterns of mobile or episodic species that exhibit spatial heterogeneity.

Implications for conservation planning

Conservation organizations are increasingly challenged by complex threats, such as climate change, which may affect species distributions at multiple scales (Elith & Leathwick 2009; Galatowitsch, Frelich & Phillips-Mao 2009). Evaluating conservation policies to address these threats will probably require analysis of clustered detection–nondetection data for elusive species across a wide range of spatial and temporal scales. The proposed hierarchical Bayes extension to multi-scale occupancy models will allow conservation organizations to evaluate alternative management options while accounting for challenges associated with clustered sampling designs for species that are highly mobile or episodic.
