An integrated model of habitat and species occurrence dynamics




1. Relationships between animal populations and their habitats are well known and commonly acknowledged to be important by animal ecologists, conservation biologists and wildlife managers. Such relationships are most commonly viewed as static, such that habitat at time t is viewed as a determinant of animals present at that same time, t, or sometimes as a determinant of animal population or occurrence dynamics (e.g. between t and t+1).

2. Here, we motivate interest in simultaneous dynamics of both habitat and occupancy state (e.g. species presence or absence) and develop models to estimate parameters that describe the dynamics of such systems.

3. The models permit inference about transition probabilities for both habitat and focal species occupancy, such that habitat transitions may influence focal species transitions and vice versa.

4. Example analyses using data from salamanders in the eastern United States are presented for (i) the special case in which habitat is characterized as either suitable or unsuitable and (ii) the more general case in which different habitat states are expected to influence occupancy dynamics in a less extreme manner (occupancy is possible in the various habitat states).

5. We believe that the integrated inference methods presented here will be useful for a variety of ecological and conservation investigations and attain special relevance in the face of habitat dynamics driven by such factors as active management, land use changes and climate change.


Many species exhibit preferences for certain habitats such that the probability that a site is occupied by a species is a function of the site’s habitat. Resulting spatial variation in the occurrence of a species has motivated the use of habitat as a covariate or predictor variable for models of species occurrence (e.g. Scott et al. 2002). However, often the habitat itself is not static but instead changes through time via such processes as vegetation succession, human activities and environmental variation (e.g. the availability and duration of water at vernal pools depends on seasonal rainfall). Such changes are expected to influence species occurrence and the vital rates responsible for occupancy dynamics (local extinction and colonization). When habitat is dynamic, it will frequently be useful from both scientific and conservation perspectives to partition species occupancy dynamics into components associated with habitat versus other factors. The theoretical literature on metapopulation dynamics contains multiple pleas to study dynamics of species occurrence and habitat simultaneously, focusing on habitat changes occurring as a result of ecological succession (Ellner & Fussmann 2003), natural disturbance frequency (Amarasekare & Possingham 2001) and human activities (Lande 1987, 1988; Thomas 1994).

In many situations, sampling for species occurrence is restricted to those sites within an overall system that contain suitable habitat, i.e., sites at which species occurrence is possible. In such situations, the state variable of species occurrence is likely to be inadequate as a sole descriptor of system state, as similar occurrence proportions may have very different meanings with respect to system well-being, depending on the number of sites with suitable habitat. For example, consider the common situation of shrinking suitable habitat for a species. Occurrence of the species in those suitable patches that remain may be high (most suitable sites occupied), yet because the number of suitable sites is decreasing, the overall well-being of the species is declining (e.g. Lande 1987). In this case, it would be more informative to characterize the system by two state variables, the proportion of sites at which habitat is suitable and the proportion of suitable sites at which the species is present. In general, separation of overall system dynamics into the fundamental components of habitat and species occurrence dynamics should yield a better understanding about the system and its well-being.

Here, models are developed that explicitly account for both habitat and species occurrence dynamics. Initially, the modelling approach is presented for a simple case where the habitat at a sampling unit may be 1 of 2 possible types (A and B), and the target species has a non-zero, and likely different, probability of presence at a unit of either habitat type. It is assumed that habitat type is known without error. We also begin by assuming a very general case that allows not only for an effect of habitat on species occurrence dynamics, but also for the habitat dynamics to depend upon the presence of the species. This latter possibility might occur through habitat modification by the species when present at a unit (e.g. through grazing of vegetation, McNaughton 1983; McNaughton, Banyikwa, & McNaughton 1997) or indicate that the species selects locations within a habitat that exhibit dynamics differing from locations not selected (e.g. vernal pools selected by breeding amphibians may be less likely to dry as quickly as those not selected). This work builds primarily upon methods outlined in MacKenzie et al. (2006) and assumes that a target species is likely to be imperfectly detected when sampling units are surveyed; however, the modelling is equally applicable for situations where the probability of detecting the species is believed to be 1 (e.g. Martin et al. 2010). Recently developed multi-state, multi-season occupancy models (MacKenzie et al. 2009) provide the estimation framework for the dynamic rates.

Basic sampling situation

We envisage a situation where a collection of sampling units is surveyed at systematic points in time, where at each time point, the habitat state of a unit is recorded, as is the presence or absence of the target species. When there is the possibility of a false absence, through imperfect detection of the species, then repeated detection/non-detection surveys are conducted within a relatively short time period at each unit such that detection probability can be estimated and appropriately accounted for (e.g. MacKenzie et al. 2003, 2006; Royle & Kery 2007). Following MacKenzie et al. (2006), we refer to this short period during which the repeat surveys are conducted as a sampling season. Within a sampling season, the habitat state and the presence of the species are assumed to be fixed. The sampling units themselves may be defined arbitrarily, such as quadrats or grid cells of a predetermined size, or defined naturally as discrete patches of habitat such as ponds or forest fragments.

Model development and estimation

In Table 1, we define parameters that can be used to denote the probability of being in each state in the first season, the probability of transitioning between any two states between successive seasons, and the probability of detecting the species. In the first season, one can only reliably determine current patterns of species occurrence and habitat across the area of interest, as the previous state of a unit is unknown (unless it is assumed the system is currently at equilibrium); hence, the first season needs to be modelled slightly differently than subsequent seasons. With these parameters, an expression can be developed for the probability of observing any particular sequence of detections/non-detections and habitat changes. This process is relatively straightforward, the key being to first develop a verbal description and then translate that into a probability statement by substituting each phrase with the appropriate parameters.

For example, suppose that during three sampling seasons, the habitat at a sampling unit is in state A in the first two seasons, and state B in season 3 (i.e. H = A A B). Over the same period, the unit is surveyed twice per season (because of imperfect detection of the focal species) resulting in the following species detection history, h = 10 00 11, which denotes that the species was detected in the first survey of season 1, never detected in season 2 and detected twice in season 3. A verbal description of the possible processes of habitat and species occurrence dynamics at this unit would be: unit is in habitat state A in the first season, with the species being present and detected in the first survey, but not in the second survey ($\pi^{[A]}\psi^{[A]}p_{1,1}^{[A]}(1-p_{1,2}^{[A]})$). Between seasons 1 and 2, the habitat remained in state A (i.e. habitat did not change), given the species was present in season 1 ($\eta_1^{[A,A,1]}$) and either: (i) the species did not go locally extinct (so it was still present in season 2) but was not detected in either survey, and then between seasons 2 and 3 the habitat changed from state A to B and the species did not go locally extinct ($(1-\varepsilon_1^{[A,A]})(1-p_{2,1}^{[A]})(1-p_{2,2}^{[A]})\eta_2^{[A,B,1]}(1-\varepsilon_2^{[A,B]})$); or (ii) the species went locally extinct (so was absent and could not be detected in season 2), then between seasons 2 and 3 the habitat changed from state A to B given the species was absent in season 2 and the species colonized the unit ($\varepsilon_1^{[A,A]}\eta_2^{[A,B,0]}\gamma_2^{[A,B]}$). Finally, given the unit was in habitat state B and occupied by the species in season 3, the species was detected in both surveys ($p_{3,1}^{[B]}p_{3,2}^{[B]}$). Therefore, the associated probability of observing this entire detection history can be expressed as:

$$
\Pr(H = AAB,\ h = 10\ 00\ 11) = \pi^{[A]}\psi^{[A]}p_{1,1}^{[A]}\left(1-p_{1,2}^{[A]}\right)\eta_1^{[A,A,1]}
\left[\left(1-\varepsilon_1^{[A,A]}\right)\left(1-p_{2,1}^{[A]}\right)\left(1-p_{2,2}^{[A]}\right)\eta_2^{[A,B,1]}\left(1-\varepsilon_2^{[A,B]}\right)
+ \varepsilon_1^{[A,A]}\eta_2^{[A,B,0]}\gamma_2^{[A,B]}\right]p_{3,1}^{[B]}p_{3,2}^{[B]}
$$

Table 1.   Definitions of parameters used to develop joint habitat-occupancy models

Parameter                          Definition
$\pi^{[H]}$                        Probability a sampling unit is of habitat state H in the first season.
$\psi^{[H]}$                       Probability the species is present at a unit of habitat state H in the first season.
$\eta_t^{[H_t,H_{t+1},m]}$         Probability the habitat changes from state $H_t$ in season t to state $H_{t+1}$ in season t + 1, given the species was either present (m = 1) or absent (m = 0) from the sampling unit in season t.
$\gamma_t^{[H_t,H_{t+1}]}$         Probability species colonizes a unit between seasons t and t + 1 (i.e. species absence to presence), given the habitat has transitioned from state $H_t$ in season t to state $H_{t+1}$ in season t + 1.
$\varepsilon_t^{[H_t,H_{t+1}]}$    Probability species goes locally extinct from a unit between seasons t and t + 1 (i.e. species presence to absence), given the habitat has transitioned from state $H_t$ in season t to state $H_{t+1}$ in season t + 1.
$p_{t,j}^{[H]}$                    Probability of detecting the species in survey j of season t, given the habitat state in season t is H.

Note that the ambiguity of whether or not the species was present in the second season is reflected in the probability statement by adding together the probabilities of the two possible sequences of events between seasons 1 and 3 (in the square brackets above).

When the species is assumed to be detected perfectly, detection histories and the associated probability statements simplify. For example, with the same habitat changes as above and only one (perfect) survey per season resulting in the detection history h = 1 0 1:

$$
\Pr(H = AAB,\ h = 1\ 0\ 1) = \pi^{[A]}\psi^{[A]}\eta_1^{[A,A,1]}\varepsilon_1^{[A,A]}\eta_2^{[A,B,0]}\gamma_2^{[A,B]}
$$

The ambiguity at season 2 is resolved, as perfect detection now permits the conclusion that the species was indeed absent (locally extinct).

Practically, however, when species are detected imperfectly, there may be ambiguity as to the true state of a unit in multiple successive sampling seasons, resulting in a potentially large number of possible outcomes that need to be accounted for. However, by recognising that in any season, a unit must be in one of a number of mutually exclusive states, the multi-state, multi-season occupancy model of MacKenzie et al. (2009) can be used to provide an estimation framework by defining appropriate probabilities for transitions between states and observations within seasons in terms of matrices and vectors. This allows the probability of any observed history to be expressed in a rather general and compact form.

During a sampling season, with only two habitat states and two levels of occupancy (presence/absence), a sampling unit can be in one of four mutually exclusive states; (i) habitat A and species absent; (ii) habitat A and species present; (iii) habitat B and species absent; and (iv) habitat B and species present (more generally, if there are m habitat states and n levels of occupancy, there would be m × n possible mutually exclusive states). Therefore, between seasons, there are 16 possible transitions to move from any of the four possible states in season t to any of the four states in season t+1.
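The state space described above can be enumerated mechanically. The following Python sketch is illustrative only; the function name `enumerate_states` and the habitat labels are assumptions made for this example and are not part of the original analysis.

```python
from itertools import product


def enumerate_states(habitats, occupancy_levels):
    """List every (habitat, occupancy) combination, habitat varying slowest."""
    return list(product(habitats, occupancy_levels))


# Two habitat states and presence/absence give the four states in the text:
# [A,0], [A,1], [B,0], [B,1]
states = enumerate_states(["A", "B"], [0, 1])

# Any state in season t may be followed by any state in season t + 1
n_transitions = len(states) ** 2
```

With m habitat states and n occupancy levels, the same call returns m × n states, and the number of between-season transitions is the square of that count.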

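In the first season, the probability of a sampling unit being in each of the four possible states can be conveniently expressed as the row vector (with $\pi^{[B]} = 1 - \pi^{[A]}$ when there are only two habitat states):

$$
\varphi_0 = \begin{bmatrix} \pi^{[A]}\left(1-\psi^{[A]}\right) & \pi^{[A]}\psi^{[A]} & \pi^{[B]}\left(1-\psi^{[B]}\right) & \pi^{[B]}\psi^{[B]} \end{bmatrix}
$$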

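where the ordering of the states follows that defined above. Between seasons, the probability of a unit transitioning from one state to another could be defined as a transition probability matrix (φt), where rows indicate the state of a unit at time t, and columns indicate the state at t + 1. For example, the probability of a unit going from habitat A and unoccupied at time t (row 1) to habitat A and occupied at t+1 (column 2) is $\eta_t^{[A,A,0]}\gamma_t^{[A,A]}$. Similarly, the probability of a unit transitioning from habitat B and occupied at time t (row 4) to habitat A and unoccupied at t+1 (column 1) is $\eta_t^{[B,A,1]}\varepsilon_t^{[B,A]}$. The entire transition matrix is presented below:

$$
\varphi_t = \begin{bmatrix}
\eta_t^{[A,A,0]}\left(1-\gamma_t^{[A,A]}\right) & \eta_t^{[A,A,0]}\gamma_t^{[A,A]} & \eta_t^{[A,B,0]}\left(1-\gamma_t^{[A,B]}\right) & \eta_t^{[A,B,0]}\gamma_t^{[A,B]} \\
\eta_t^{[A,A,1]}\varepsilon_t^{[A,A]} & \eta_t^{[A,A,1]}\left(1-\varepsilon_t^{[A,A]}\right) & \eta_t^{[A,B,1]}\varepsilon_t^{[A,B]} & \eta_t^{[A,B,1]}\left(1-\varepsilon_t^{[A,B]}\right) \\
\eta_t^{[B,A,0]}\left(1-\gamma_t^{[B,A]}\right) & \eta_t^{[B,A,0]}\gamma_t^{[B,A]} & \eta_t^{[B,B,0]}\left(1-\gamma_t^{[B,B]}\right) & \eta_t^{[B,B,0]}\gamma_t^{[B,B]} \\
\eta_t^{[B,A,1]}\varepsilon_t^{[B,A]} & \eta_t^{[B,A,1]}\left(1-\varepsilon_t^{[B,A]}\right) & \eta_t^{[B,B,1]}\varepsilon_t^{[B,B]} & \eta_t^{[B,B,1]}\left(1-\varepsilon_t^{[B,B]}\right)
\end{bmatrix}
$$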

Note that the transitions can also be considered a two-step process where first the habitat transitions to either state A or B (where the probability of the transition is potentially different depending on which of the four states the unit was in last season), then conditional upon the habitat transition, the species either colonizes or not, or goes locally extinct or not, at the unit.
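The two-step view (habitat transition first, then colonization or extinction conditional on it) translates directly into code. The sketch below is a minimal illustration in plain Python, not the authors' implementation; the dictionary layout, the function name `transition_matrix` and all numerical values are assumptions chosen for the example.

```python
HABITATS = ("A", "B")
# Same ordering as in the text: [A,0], [A,1], [B,0], [B,1]
STATES = [(h, occ) for h in HABITATS for occ in (0, 1)]


def transition_matrix(eta, gamma, eps):
    """Build the 4 x 4 transition matrix for one between-season interval.

    eta[(h0, h1, occ0)]: habitat moves h0 -> h1 given occupancy occ0 in season t
    gamma[(h0, h1)]:     colonization given the habitat transition h0 -> h1
    eps[(h0, h1)]:       local extinction given the habitat transition h0 -> h1
    """
    matrix = []
    for h0, occ0 in STATES:
        row = []
        for h1, occ1 in STATES:
            habitat_step = eta[(h0, h1, occ0)]          # step 1: habitat
            if occ0 == 0:                               # step 2: occupancy
                rate = gamma[(h0, h1)]
                occupancy_step = rate if occ1 == 1 else 1.0 - rate
            else:
                rate = eps[(h0, h1)]
                occupancy_step = rate if occ1 == 0 else 1.0 - rate
            row.append(habitat_step * occupancy_step)
        matrix.append(row)
    return matrix


# Hypothetical parameter values, for illustration only
eta = {("A", "A", 0): 0.7, ("A", "B", 0): 0.3,
       ("A", "A", 1): 0.8, ("A", "B", 1): 0.2,
       ("B", "A", 0): 0.4, ("B", "B", 0): 0.6,
       ("B", "A", 1): 0.1, ("B", "B", 1): 0.9}
gamma = {("A", "A"): 0.2, ("A", "B"): 0.5, ("B", "A"): 0.1, ("B", "B"): 0.4}
eps = {("A", "A"): 0.3, ("A", "B"): 0.1, ("B", "A"): 0.6, ("B", "B"): 0.2}

phi = transition_matrix(eta, gamma, eps)
row_sums = [sum(row) for row in phi]  # each row is a probability distribution
```

Because the habitat transition probabilities out of each state sum to one and each occupancy step is a Bernoulli outcome, every row of the resulting matrix sums to one, which is a convenient check on any parameterization.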

In terms of the detection probability matrix used by MacKenzie et al. (2009) for the multi-state, multi-season occupancy model, the probability of observing a particular outcome from a single survey within a season given the true state of the unit can be described as:

              Observed state
True state    1 [A,0]              2 [A,1]            3 [B,0]              4 [B,1]
1 [A,0]       1                    0                  0                    0
2 [A,1]       $1-p_{t,j}^{[A]}$    $p_{t,j}^{[A]}$    0                    0
3 [B,0]       0                    0                  1                    0
4 [B,1]       0                    0                  $1-p_{t,j}^{[B]}$    $p_{t,j}^{[B]}$

where the habitat state and the presence/absence of the species are indicated within the square brackets. Possible ambiguity in the true state through imperfect detection of the species is incorporated by having more than one entry in columns where the species was observed as absent (more correctly, non-detected). The zero entries indicate (assumed) impossible observations, given the true state of a unit. For example, in the lower left corner of the matrix, zero entries indicate that if a unit is truly of habitat state B, it could not be observed as habitat state A.

A state-dependent, conditional detection probability vector ($\mathbf{p}_{h,t}$; note that we now use h to denote a detection history that includes information on both habitat and species detection) can then be defined for any observed detection history within a season. This vector simply has four elements corresponding to each of the four possible states, where each element is an expression for the probability of observing the particular detection history at a unit conditional upon the unit being in the associated state. For example, suppose in season t, the habitat was state A and the detection history 011 was observed, then,

$$
\mathbf{p}_{h,t} = \begin{bmatrix} 0 & \left(1-p_{t,1}^{[A]}\right)p_{t,2}^{[A]}p_{t,3}^{[A]} & 0 & 0 \end{bmatrix}'
$$

The unit was known to be state A and occupied, so the latter two elements must be zero (these elements are associated with the two occupancy states for habitat B), and the probability of getting at least one detection if the unit was habitat A but unoccupied by the species (the first element) must also be zero. There is only one non-zero element because when the species is detected and habitat state is known, there is only one possible true state for the unit in that season. However, when the species is never detected in a season, there are two possible states resulting in two non-zero elements. For example, suppose a unit is habitat state B at time t, but the species was never detected. The conditional detection vector would be:

$$
\mathbf{p}_{h,t} = \begin{bmatrix} 0 & 0 & 1 & \prod_{j=1}^{3}\left(1-p_{t,j}^{[B]}\right) \end{bmatrix}'
$$

If the site was unoccupied by the species (element 3), the probability of never seeing the species is 1, but if the species had been present (element 4), then it was not detected in any of the three surveys. The probability of observing a particular detection history at a unit for all T seasons can then be obtained using matrix multiplication, which automatically accounts for the ambiguity caused by imperfect detection of the species (MacKenzie et al. 2003, 2006, 2009):

$$
\Pr(\mathbf{h} \mid \theta) = \varphi_0\left[\prod_{t=1}^{T-1} D\left(\mathbf{p}_{h,t}\right)\varphi_t\right]\mathbf{p}_{h,T}
$$

where θ is a vector of all the parameters to be estimated, and $D(\mathbf{p}_{h,t})$ is a diagonal matrix with the elements of $\mathbf{p}_{h,t}$ on the main diagonal (top left to bottom right) and zero elsewhere. The first and last terms are outside of the product statement to ensure the matrix multiplication yields a single value. Assuming the detection histories are independent for each unit, the joint probability for the data (and the model likelihood) is,

$$
L\left(\theta \mid \mathbf{h}_1, \ldots, \mathbf{h}_s\right) = \prod_{i=1}^{s}\Pr\left(\mathbf{h}_i \mid \theta\right)
$$

where s is the total number of units surveyed during the study.
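The matrix formulation can be checked numerically against the verbal derivation given earlier for H = A A B and h = 10 00 11. The sketch below is a simplified illustration in plain Python rather than the estimation code used by the authors: it assumes time-constant parameters and a detection probability that depends only on habitat, and all function names and parameter values are hypothetical.

```python
import math

HABITATS = ("A", "B")
STATES = [(h, occ) for h in HABITATS for occ in (0, 1)]  # [A,0], [A,1], [B,0], [B,1]


def initial_vector(pi, psi):
    """First-season state probabilities (row vector)."""
    return [pi[h] * (psi[h] if occ else 1.0 - psi[h]) for h, occ in STATES]


def transition_matrix(eta, gamma, eps):
    """Rows: state in season t; columns: state in season t + 1."""
    matrix = []
    for h0, occ0 in STATES:
        row = []
        for h1, occ1 in STATES:
            if occ0 == 0:
                step = gamma[(h0, h1)] if occ1 else 1.0 - gamma[(h0, h1)]
            else:
                step = 1.0 - eps[(h0, h1)] if occ1 else eps[(h0, h1)]
            row.append(eta[(h0, h1, occ0)] * step)
        matrix.append(row)
    return matrix


def detection_vector(habitat, surveys, p):
    """Probability of the within-season history given each true state."""
    detected = any(surveys)
    vec = []
    for h, occ in STATES:
        if h != habitat or (detected and occ == 0):
            vec.append(0.0)          # impossible given the observations
        elif occ == 0:
            vec.append(1.0)          # absent, so never detected
        else:
            vec.append(math.prod(p[h] if y else 1.0 - p[h] for y in surveys))
    return vec


def history_probability(habitats, surveys, pi, psi, eta, gamma, eps, p):
    """phi_0 [prod_t D(p_h,t) phi_t] p_h,T, evaluated left to right."""
    phi = transition_matrix(eta, gamma, eps)   # assumed constant over time
    alpha = initial_vector(pi, psi)
    last = len(habitats) - 1
    for t, (habitat, season) in enumerate(zip(habitats, surveys)):
        d = detection_vector(habitat, season, p)
        alpha = [a * x for a, x in zip(alpha, d)]           # times D(p_h,t)
        if t < last:
            alpha = [sum(alpha[i] * phi[i][j] for i in range(4)) for j in range(4)]
    return sum(alpha)


def log_likelihood(units, **params):
    """Sum of log-probabilities over independent sampling units."""
    return sum(math.log(history_probability(h, y, **params)) for h, y in units)


# Hypothetical parameter values, for illustration only
pi = {"A": 0.6, "B": 0.4}
psi = {"A": 0.5, "B": 0.7}
p = {"A": 0.6, "B": 0.8}
eta = {("A", "A", 0): 0.7, ("A", "B", 0): 0.3, ("A", "A", 1): 0.8, ("A", "B", 1): 0.2,
       ("B", "A", 0): 0.4, ("B", "B", 0): 0.6, ("B", "A", 1): 0.1, ("B", "B", 1): 0.9}
gamma = {("A", "A"): 0.2, ("A", "B"): 0.5, ("B", "A"): 0.1, ("B", "B"): 0.4}
eps = {("A", "A"): 0.3, ("A", "B"): 0.1, ("B", "A"): 0.6, ("B", "B"): 0.2}

history = ("AAB", [(1, 0), (0, 0), (1, 1)])
prob = history_probability(*history, pi, psi, eta, gamma, eps, p)

# The same probability written out from the verbal description
by_hand = (pi["A"] * psi["A"] * p["A"] * (1 - p["A"]) * eta[("A", "A", 1)]
           * ((1 - eps[("A", "A")]) * (1 - p["A"]) ** 2
              * eta[("A", "B", 1)] * (1 - eps[("A", "B")])
              + eps[("A", "A")] * eta[("A", "B", 0)] * gamma[("A", "B")])
           * p["B"] ** 2)
```

The two quantities `prob` and `by_hand` agree, which illustrates how the matrix product sums over the unobserved occupancy state in season 2 without the paths having to be listed explicitly. In practice the log-likelihood would be passed to a numerical optimizer or an MCMC sampler.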

The model likelihood could then be used in a maximum likelihood framework to produce maximum likelihood estimates of parameters, or used in a Bayesian framework to obtain posterior distributions of parameters. Note the same basic model structure can be recast into a data augmentation approach (also known as a state-space, latent-variable or hierarchical model approach), where the unknown state of a unit (caused by imperfect detection) is imputed during the estimation procedure (e.g. Dorazio et al. 2006; Royle & Kery 2007; MacKenzie et al. 2009; Schofield, Barker, & MacKenzie 2009; Chilvers, Wilkinson, & MacKenzie 2010).

The above modelling was motivated by situations in which the dynamics of habitat and species occurrence are linked, to some degree. This linkage or dependence requires the detailed modelling of the type described to properly capture system dynamics. Nevertheless, there may still be interest in summary statistics that can be viewed as condensed versions of estimated model parameters. For example, assuming that the entire system is comprised of two habitats, A and B, summaries may include the probability of a randomly selected unit being in one of the four possible states in each season, or a system-wide estimate of the probability of occupancy (the probability that a randomly selected sample unit is occupied regardless of habitat type).

If we define the probability of being in each state in season 1 as ϕ1 = φ0, then the probability of being in each state in season t+1 can be derived using the following expression: ϕt+1 = ϕtφt. Note the similarities between this expression and that typically used in abundance based population modelling (e.g. Williams, Nichols, & Conroy 2002) to project population sizes forward in time. System-wide occupancy in season t can then be determined by summing the second and fourth elements of ϕt, which correspond to the two states in which the species is present ([A,1] and [B,1]).
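The projection can be sketched in a few lines of plain Python. This is an illustration under assumed values, not output from the fitted models; the function names and numbers are hypothetical. With the state ordering [A,0], [A,1], [B,0], [B,1], the occupied states are [A,1] and [B,1].

```python
def project_states(phi0, transition_matrices):
    """Return the state-probability vector for each season, starting at season 1."""
    distributions = [list(phi0)]
    for phi in transition_matrices:
        current = distributions[-1]
        n = len(current)
        # Row vector times matrix: next[j] = sum_i current[i] * phi[i][j]
        distributions.append(
            [sum(current[i] * phi[i][j] for i in range(n)) for j in range(n)]
        )
    return distributions


def system_occupancy(state_probabilities):
    """Probability that a randomly selected unit is occupied, regardless of habitat."""
    # Indices 1 and 3 are the occupied states [A,1] and [B,1]
    return state_probabilities[1] + state_probabilities[3]


# Hypothetical first-season probabilities and a time-constant transition matrix
phi0 = [0.30, 0.20, 0.25, 0.25]
phi = [[0.56, 0.14, 0.15, 0.15],
       [0.24, 0.56, 0.02, 0.18],
       [0.36, 0.04, 0.36, 0.24],
       [0.06, 0.04, 0.18, 0.72]]

seasons = project_states(phi0, [phi, phi])
occupancy = [system_occupancy(v) for v in seasons]
```

The proportion of units in habitat A in each season can be obtained in the same way by summing the elements for [A,0] and [A,1].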

Model extensions

The general framework provided here permits many extensions. Here, we note such extensions in summary fashion in an attempt to stimulate additional work with these models. As with our previous modelling efforts in this general field, it is possible to account for missing observations and unequal sampling effort. When habitat state is known each season, then accounting for missing survey outcomes proceeds exactly as for the standard occupancy models where the detection probabilities for the respective surveys and sampling units are effectively set to zero (MacKenzie et al. 2006). When habitat state is unknown, then essentially one must integrate over the possible habitat states in each season. This is relatively easy to achieve by adjusting the conditional detection probability vector. For example, suppose that three surveys were conducted resulting in the history 010, but the habitat state is unknown (denoted by superscript H), then,

$$
\mathbf{p}_{h,t} = \begin{bmatrix} 0 & \left(1-p_{t,1}^{[A]}\right)p_{t,2}^{[A]}\left(1-p_{t,3}^{[A]}\right) & 0 & \left(1-p_{t,1}^{[B]}\right)p_{t,2}^{[B]}\left(1-p_{t,3}^{[B]}\right) \end{bmatrix}'
$$

Or, if both the habitat and detection history are unknown (i.e. the unit was not surveyed that season), then

$$
\mathbf{p}_{h,t} = \begin{bmatrix} 1 & 1 & 1 & 1 \end{bmatrix}'
$$
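Both kinds of missing information can be handled by one small function that builds the conditional detection vector. The sketch below is a hedged illustration in plain Python: encoding missing values as `None`, the function name and the detection probabilities (assumed constant within a season and dependent only on habitat) are all choices made for this example.

```python
STATES = [("A", 0), ("A", 1), ("B", 0), ("B", 1)]


def detection_vector(habitat, surveys, p):
    """Conditional detection vector for one season.

    habitat: "A", "B", or None when the habitat state was not recorded
    surveys: list of 1 (detected), 0 (not detected) or None (survey missing)
    p:       detection probability for each habitat state
    """
    observed = [y for y in surveys if y is not None]   # skip missing surveys
    detected = any(observed)
    vec = []
    for h, occ in STATES:
        if habitat is not None and h != habitat:
            vec.append(0.0)                 # known habitat rules this state out
        elif occ == 0:
            vec.append(0.0 if detected else 1.0)
        else:
            prob = 1.0
            for y in observed:
                prob *= p[h] if y == 1 else 1.0 - p[h]
            vec.append(prob)
    return vec


p = {"A": 0.6, "B": 0.8}   # hypothetical values

unknown_habitat = detection_vector(None, [0, 1, 0], p)
not_surveyed = detection_vector(None, [None, None, None], p)
one_missing = detection_vector("B", [0, None, 0], p)
```

A season without any survey contributes a vector of ones, so it carries no information about the state and the likelihood simply propagates the transition probabilities through that season.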


Covariate information or predictor variables can be easily included in these models by applying an appropriate link function (e.g. logit link) to the respective probabilities. Program PRESENCE allows such modelling to be conducted within the maximum likelihood framework.
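As a brief illustration of the logit link, the Python sketch below maps a linear combination of covariates onto a probability. It is not a description of how Program PRESENCE is implemented; the function names, coefficients and the standardized covariate are hypothetical.

```python
import math


def inv_logit(x):
    """Inverse of the logit link: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linked_probability(intercept, slopes, covariates):
    """Probability implied by a linear predictor on the logit scale."""
    linear_predictor = intercept + sum(b * x for b, x in zip(slopes, covariates))
    return inv_logit(linear_predictor)


# Hypothetical example: a colonization probability that increases with a
# standardized site covariate (coefficients chosen only for illustration)
gamma_low = linked_probability(-1.0, [0.8], [-1.0])
gamma_high = linked_probability(-1.0, [0.8], [1.0])
```

Estimation then proceeds on the coefficients rather than on the probabilities themselves, which keeps every modelled probability between 0 and 1.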

In many situations, there are likely to be more than two habitat states of interest (e.g. successional growth at a unit, or water levels in a pond). Extension of the above modelling to >2 habitat states simply requires additional possible combinations of habitat and occupancy states in the respective state variable vector. For example, if there were four possible habitat states, there would now be eight possible habitat-occupancy combinations.

Rather than just the presence/absence of the species, there may be multiple states of occupancy, e.g., species present with/without breeding or with few/some/many individuals. To account for such situations, the above modelling could be extended in a similar manner as for when there are >2 habitats. Parameterizations suggested by Royle & Link (2005), Nichols et al. (2007) or MacKenzie et al. (2009) could be used with respect to the multiple occupancy state component.

Extending the above modelling to the presence/absence of multiple species would also be relatively simple. Now, the number of possible states is $H \times 2^K$, where H is the number of habitat states, and K is the number of species. Transition probabilities would then be defined in terms of both habitat changes and colonization and extinction probabilities for each species, where these probabilities may depend upon the presence or absence of other species (MacKenzie et al. 2006). Clearly, however, as the number of possible states increases (as in the previous extension), so will the sample sizes required to provide reasonable estimates of the additional parameters.

Our model was based on the assumption that habitat state can be judged without error by visiting a sample unit. This situation holds true for the field sampling situations that motivated our interest in these joint models. However, we can also envisage cases in which habitat assessment is subject to misclassification. Possible situations might involve the presence or absence of some critical resource (e.g. food plant, host species) or perhaps pathogen (disease agent; see McClintock et al. 2010) at a sample unit. In such a situation, detection of the critical resource (or pathogen) would permit unambiguous classification of the habitat state of the sample unit. However, failure to detect the resource or pathogen could result from either true absence, or resource/pathogen presence coupled with non-detection. This situation corresponds to that discussed above for the general multi-state models, in which observed states are ordered with respect to information content (see MacKenzie et al. 2009 for full discussion). In such cases, inference about habitat state is possible, and models can be readily developed to deal with this additional classification uncertainty. Where habitat classification does not involve such ordered states with at least one unambiguous state, additional information may be required to ensure identifiability of parameters for joint habitat-occupancy models. For example, such models would include misclassification parameters of the habitat state, Pr(true habitat state A | field classification as habitat state B). Such parameters could be informed by field experiments or intensive surveys of a subset of sample units. These ancillary data could be included as an additional component in likelihoods such as those shown above, with most of the information about misclassification probability parameters coming from this component.

Most of the extensions listed above involve increased generality, but we note that the models presented here can be constrained for special situations as well. For example, several theoretical investigations of joint habitat-occupancy models (e.g. Lande 1987, 1988) have defined habitat simply as either suitable or unsuitable for the species. If the habitat is suitable at a unit, that implies there is a non-zero probability of the unit being occupied by the focal species, while unsuitable habitat means the probability of occupancy is zero at that unit. This simply reduces the number of possible states to three: (i) unsuitable habitat and species absence; (ii) suitable habitat and species absence; and (iii) suitable habitat and species presence. By definition, some of the rate parameters underlying occupancy dynamics must also be constrained: $\gamma_t^{[U,U]} = \gamma_t^{[S,U]} = 0$ and $\varepsilon_t^{[S,U]} = 1$, where S and U denote suitable and unsuitable habitat, respectively.
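The three-state special case can be written as a reduced transition matrix. The Python sketch below is illustrative only: it encodes the requirement that a unit whose habitat becomes unsuitable must be unoccupied, and the state labels, function name and parameter values are assumptions made for this example.

```python
# State ordering: unsuitable; suitable and unoccupied; suitable and occupied
STATES = ["U", "S0", "S1"]


def constrained_transition_matrix(eta, gamma, eps):
    """3 x 3 transition matrix for the suitable/unsuitable habitat model.

    eta[(h0, h1, occ0)]: habitat transition probability (h in {"U", "S"})
    gamma[(h0, "S")]:    colonization when the habitat is suitable in t + 1
    eps[("S", "S")]:     local extinction when the habitat stays suitable
    Transitions into unsuitable habitat always end in the unoccupied state,
    so no colonization or extinction term appears in the first column.
    """
    return [
        [eta[("U", "U", 0)],
         eta[("U", "S", 0)] * (1.0 - gamma[("U", "S")]),
         eta[("U", "S", 0)] * gamma[("U", "S")]],
        [eta[("S", "U", 0)],
         eta[("S", "S", 0)] * (1.0 - gamma[("S", "S")]),
         eta[("S", "S", 0)] * gamma[("S", "S")]],
        [eta[("S", "U", 1)],
         eta[("S", "S", 1)] * eps[("S", "S")],
         eta[("S", "S", 1)] * (1.0 - eps[("S", "S")])],
    ]


# Hypothetical values, for illustration only
eta = {("U", "U", 0): 0.6, ("U", "S", 0): 0.4,
       ("S", "U", 0): 0.3, ("S", "S", 0): 0.7,
       ("S", "U", 1): 0.2, ("S", "S", 1): 0.8}
gamma = {("U", "S"): 0.25, ("S", "S"): 0.5}
eps = {("S", "S"): 0.1}

phi = constrained_transition_matrix(eta, gamma, eps)
row_sums = [sum(row) for row in phi]
```

Each row still sums to one, so the reduced matrix can be used in the same matrix product as the general four-state version.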

Example applications

Example 1: Spotted salamander breeding surveys

We apply the above model to seasonal pools that serve as breeding habitat for spotted salamanders (Ambystoma maculatum) in Canaan Valley National Wildlife Refuge (NWR), West Virginia, USA. Pools were located using an adaptive cluster sample based on an initial random sample of points (Van Meter, Bailey, & Grant 2008). Sixty-three located pools have been sampled since 2005, and most pools are visited 1–3 times during the breeding period each year (generally the 4 weeks in April). Egg mass surveys are usually conducted by two independent observers, yielding 2–6 sample occasions per pool per season (year). Each observer searches the pool and records the number of spotted salamander egg masses seen. Maximum length and width measurements of the pools are also recorded for each visit.

Here, we condense egg mass counts to simple detection/non-detection data and use the pool measurements to define two habitat states, one of which is assigned to each pool each year (2005–2008). Examination of the pool measurements revealed that the median elliptical surface area among sampled pools was ≤25 m² each year; thus, a pool’s habitat was designated as ‘small’ (Sm) in seasons when surface area measurements did not exceed 25 m², and ‘large’ (Lg) otherwise. Combined with the salamander occupancy information, we define the following four mutually exclusive pool states: (i) small with no egg masses (unoccupied); (ii) small with egg masses; (iii) large and unoccupied; and (iv) large and occupied by egg masses.

In this example, we used a set of 32 models to explore a subset of ecologically relevant questions. We modelled habitat transitions as a function of either time (year), inline image, or salamander occupancy, inline image, and incorporated both structures into our candidate model set. For simplicity, we explored only two detection probability structures: first, we assumed egg mass detection probability did not differ among surveys within years, but may have differed among years, p(t). Next, we allowed detection probability to vary among years and habitat states, using an additive structure p(t+H).

In the first year, we expected large pools to have higher salamander occupancy probabilities than small pools (i.e. ψ[Lg] > ψ[Sm]). We also suspected that salamander vital rates would depend on the change in habitat state, but otherwise would be relatively constant among years. Specifically, we hypothesized that a small, occupied pool at time t is more likely to go locally extinct if the pool remains small than if the pool is large during the next breeding season (ε[Sm,Sm] > ε[Sm,Lg]). Likewise, we suspected that an unoccupied, small pool at time t would have a higher colonization probability if the pool became large the following year (γ[Sm,Lg] > γ[Sm,Sm]). Under the same hypotheses, we expect ε[Lg,Sm] > ε[Lg,Lg] and γ[Lg,Lg] > γ[Lg,Sm]. To formally test our hypotheses, we fit models where vital rates: (i) depended on the habitat state in both years, t and t+1 (denoted inline image and inline image), (ii) depended on the habitat state only in year t (i.e. ε[Sm,Sm] = ε[Sm,Lg], ε[Lg,Sm] = ε[Lg,Lg], γ[Sm,Sm] = γ[Sm,Lg], γ[Lg,Lg] = γ[Lg,Sm], denoted inline image and inline image), (iii) depended on the habitat type in the subsequent year t+1 only (denoted inline image and inline image) or (iv) did not depend on habitat and were constant among years (denoted inline image and inline image).

We fit our models to the data from 63 pools sampled from 2005 to 2008 using program PRESENCE (Table 2). Models were ranked using Akaike’s Information Criterion (AIC) with the top two models accounting for 58% of the model weight. Collectively, the top models suggested that habitat dynamics were more influenced by time (years) than occupancy state, and occupancy dynamics were better explained by the habitat state in the following year (Ht+1), than in the current year (Ht). Model-averaged estimates have been calculated to account for model selection uncertainty (MacKenzie et al. 2006).

Table 2.   Model selection summary for the top 10 habitat-occupancy models fit to spotted salamander data from 63 pools in Canaan Valley National Wildlife Refuge (NWR), 2005–2008

Model           K     ΔAIC    w       −2l
inline image    17    0·00    0·34    910·54
inline image    18    0·65    0·24    909·19
inline image    15    2·89    0·08    917·43
inline image    16    3·65    0·05    916·19
inline image    17    4·06    0·04    914·60
inline image    15    4·34    0·04    918·88
inline image    16    4·50    0·04    917·04
inline image    16    4·66    0·03    917·20
inline image    13    5·16    0·03    923·70
inline image    14    5·57    0·02    922·11

K is the number of estimated parameters in the model, ΔAIC is the difference in Akaike’s Information Criterion (AIC) relative to the top-ranked model, w is the AIC model weight and −2l is twice the negative log-likelihood. Pool habitat state, H, during each season is categorized as small ‘Sm’ (≤25 m²) or large ‘Lg’ (>25 m²). Habitat transition probabilities (η) were allowed to be different according to the habitat type at both time t and t + 1 and were either season specific only, inline image, or were different only according to whether salamanders were present at time t, inline image. Extinction (ε) and colonization (γ) probabilities are modelled as a function of the habitat state in seasons t, t+1, or both (e.g. inline image) or as constant among habitat states and seasons (denoted inline image and inline image). For all models, the distribution of habitat types among pools in 2005 is allowed to be unequal, inline image. We have excluded this parameter from our model notation to simplify.
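The ΔAIC column of Table 2 can be reproduced from the K and −2l columns using the standard definition AIC = −2l + 2K. The short Python sketch below does this for the ten tabulated models; the variable names are illustrative, and the model weights are not recomputed because they depend on all 32 candidate models, of which only the top 10 are listed.

```python
# (K, -2 log-likelihood) for the ten models in Table 2, in ranked order
table2 = [(17, 910.54), (18, 909.19), (15, 917.43), (16, 916.19), (17, 914.60),
          (15, 918.88), (16, 917.04), (16, 917.20), (13, 923.70), (14, 922.11)]

# AIC = -2l + 2K; delta AIC is measured from the smallest AIC in the set
aic = [neg2l + 2 * k for k, neg2l in table2]
best = min(aic)
delta_aic = [round(a - best, 2) for a in aic]
```

The recomputed differences match the tabulated ΔAIC values, which confirms the column order K, ΔAIC, w, −2l.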

In 2005, pools had approximately the same probability of being large or small with inline image (inline image). Consistent with our a priori expectations, habitat state influenced initial salamander occupancy probabilities (cumulative AIC weight for inline image models = 0·90), and larger pools had a higher occupancy probability than smaller pools with inline image (inline image) and inline image (inline image).

Salamander extinction and colonization probabilities were most influenced by habitat conditions at the end of the interval over which dynamics were estimated. Thus, the pool’s habitat state just prior to the following breeding period (at time t+1) was more important in determining transitions than the current habitat state. Cumulative AIC model weights for the four occupancy dynamic structures were: inline image, inline image, inline image and inline image. As expected, an unoccupied pool in year t had a higher probability of being colonized if it was large the following year, inline image (inline image), compared with a small pool in year t+1, inline image (inline image). Likewise, occupied pools were less prone to local extinction if the pool remained, or became, large the following year, inline image (inline image) compared with pools that were small in year t+1, inline image (inline image).

Most model selection uncertainty involved structures for habitat dynamics and detection probabilities. Habitat dynamics seemed more influenced by time (year) than occupancy state (cumulative model weights = 0·80 and 0·20, for inline image and inline image, respectively). In general, small pools remained small between years, but larger pools had ≈0·10–0·50 chance of being small during the next breeding season, depending on the weather conditions for that year (Fig. 1). Finally, overall detection probabilities were high (>0·50), with some evidence that p varied among habitats (cumulative weight, inline image = 0·44), with a slightly higher probability (≈0·03) of detecting salamander eggs in large pools (Fig. 2).

Figure 1. Model-averaged estimates of the probability of a pool being small in year t+1 for sampled pools at Canaan Valley National Wildlife Refuge. Pools were classified as small (Sm, ≤25 m²) or large (Lg, >25 m²) during each of four breeding periods, from 2005 to 2008. Error bars indicate ±1 SE.

Figure 2. Model-averaged estimates of detection probabilities for spotted salamander egg masses in occupied pools at Canaan Valley National Wildlife Refuge. Pools were classified as small (≤25 m²) or large (>25 m²) during each breeding period. Error bars indicate ±1 SE.

Example 2: Suitable or unsuitable habitat

Simply monitoring amphibian breeding activity may not differentiate ‘source’ and ‘sink’ habitats, as some pools may support breeding activity while eggs or larvae rarely survive to metamorphosis (i.e. these pools routinely experience complete reproductive failure; Taylor, Scott, & Gibbons 2006). Here, we explored the prevalence of reproductive failure and its potential influence on the occurrence of spotted salamanders at 56 pools sampled from 2006 to 2008 at Patuxent Research Refuge (PRR) in Maryland, USA. Pools were located in the same manner as described for Canaan Valley NWR, and most pools were visited twice during the breeding period (late March to mid-April) by two independent observers. Occasionally, a second breeding visit was not conducted if egg masses were detected during the first visit. At PRR, an additional dip-net survey was conducted by independent observers in early June to detect late-stage larval salamanders prior to metamorphosis (Worthington 1968; Petranka 1998).

Because two independent observers were employed during pool visits, we considered each visit a ‘season’ with two sampling occasions per ‘season’. Thus, we are able to estimate changes in habitat and/or occupancy within a year: between the beginning and end of the spring breeding period, and between the end of the breeding period and metamorphosis (early summer). We also estimate habitat and/or occupancy changes between years, from metamorphosis (early summer) to the next breeding opportunity (following spring). Habitat during each visit/season was considered ‘suitable’ if there was standing water in the pool, or ‘unsuitable’ if the pool was dry. In the model notation, we denote suitable and unsuitable habitat states with S and U, respectively. Evidence of salamander reproduction (occupancy) was conditional on suitable habitat, reducing the number of states to three: (i) the pool is unsuitable (and must therefore be unoccupied); (ii) the pool is suitable and unoccupied; and (iii) the pool is suitable and occupied. Thus, the constraints noted in the final paragraph of EXTENSIONS were applied. Using this modelling framework, we can formally test whether the salamander occupancy state is static over visits within a year, given that the habitat remains suitable. Asynchronous migration of adult salamanders to breeding pools (colonization), heavy predation of egg masses or larvae, or a disease outbreak (local extinction) are processes that would lead to occupancy dynamics within the annual study period, corresponding to the salamanders’ aquatic life-history phase. A priori we expected some low-level colonization between the first and second seasons each year, but we expected low or no local extinction if the pool remained suitable.
Stated differently, for those pools occupied by egg masses at the beginning of the breeding period, we expect spotted salamander occupancy probabilities to remain high if the habitat remains suitable; however, a high proportion of pools may become unsuitable before metamorphosis (leading to complete reproductive failure). Finally, we explored whether habitat and/or occupancy state in June (larvae/no larvae) influenced the probability that salamanders breed (egg masses present) at the beginning of the following breeding period. Although spotted salamander metamorphs will not return to breed for 2–5 years (Petranka 1998), we suspect that pools with suitable habitat and successful metamorphosis (occupancy) in June likely had successful reproduction in the past and thus a relatively large population of breeding adults.
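The three-state structure described above can be written as a single-step transition matrix. The following is a minimal sketch under stated assumptions, not output from or code used with Program PRESENCE: the function name, argument names and the way habitat (η), colonization (γ) and extinction (ɛ) probabilities are composed into rows are one reading of the parameter definitions in this example, and the numerical values are placeholders rather than estimates.

```python
def transition_matrix(eta_U, eta_S0, eta_S1, gamma_US, gamma_SS, eps_SS):
    """Rows and columns are ordered (U, S-unoccupied, S-occupied).

    eta_*    : P(habitat suitable at t+1 | state at t)
    gamma_US : P(colonized | unsuitable at t, suitable at t+1)
    gamma_SS : P(colonized | suitable and unoccupied at t, suitable at t+1)
    eps_SS   : P(local extinction | occupied at t, suitable at t+1)
    An unsuitable pool is unoccupied by definition, so the first column
    only needs the probability that the habitat becomes unsuitable.
    """
    return [
        [1 - eta_U, eta_U * (1 - gamma_US), eta_U * gamma_US],
        [1 - eta_S0, eta_S0 * (1 - gamma_SS), eta_S0 * gamma_SS],
        [1 - eta_S1, eta_S1 * eps_SS, eta_S1 * (1 - eps_SS)],
    ]


# Placeholder values chosen for illustration only (not estimates from the study).
phi = transition_matrix(0.3, 0.9, 0.95, 0.5, 0.4, 0.1)
for row in phi:
    print([round(p, 3) for p in row])

# Within-year closure corresponds to fixing colonization and extinction at zero.
closed = transition_matrix(0.3, 0.9, 0.95, 0.0, 0.0, 0.0)
```

Under this layout, the hypothesis that occupancy is static within a year amounts to the off-diagonal occupancy entries of the suitable rows being zero, as in `closed`.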

As stated in the previous example, we assume that the habitat state (suitable/unsuitable) is observed without error during each season and that salamander egg masses and larvae are only present at pools with suitable habitat. We did not expect salamander detection probability to vary among years, but we did expect detection probabilities to be higher for egg masses than larvae. Additionally, one highly experienced observer participated in nearly all pool visits, so we modelled detection probabilities as a function of observer (observer covariate: 1 = highly experienced, 0 = all other observers). For simplicity, we used this detection probability structure in all models, p(stg × obs).

We fit seven models to data from 2006 to 2008. Pools were visited up to three times per year, yielding T = 9 total ‘seasons’, each with two independent observers (surveys). The probability that a pool contains suitable habitat at time t+1 depends on its current habitat (Ht, either suitable or unsuitable) and occupancy state (Xt, either 0 or 1), inline image. We modelled inline image in two ways. First, we expected the relative difference between inline image parameters to be consistent among visits. Specifically, we expected the probability that a pool was suitable at time t+1 would be lowest for unoccupied, unsuitable pools at time t and highest for occupied, suitable pools (inline image); thus, we employed this relationship across all seasons (denoted inline image). Alternatively, we allowed inline image to vary independently among occupancy and habitat states for those seasons within years, but assumed the relationship was consistent among years (denoted η(X, t′, Ht, S)). Conditional on a pool having suitable habitat, we fit models where: (i) colonization and extinction probabilities within years were zero (notice this represents the classic closure assumption and is denoted inline image), (ii) colonization was allowed between the first and second seasons each year, but extinction probability was constrained to zero (i.e. this model suggests that all pools with egg masses will produce metamorphs if the habitat remains suitable, denoted inline image) and (iii) colonization was allowed between the first and second seasons each year, and extinction probabilities were estimated between all seasons (i.e. both colonization and extinction allowed within and between years, ɛ(tSS)γ(yrHtS)). We combined these dynamic habitat and occupancy structures into six models and added one additional model to verify that both habitat and occupancy state in June influenced the probability that a pool supports breeding the following year.
For this final model, we re-ran the most general model with time-specific habitat and occupancy transitions, but constrained inline image for visits t = 3 and t = 6, representing between-year transitions. If this model were supported by the data, it would suggest that occupancy probabilities at the first season of each year do not depend on the pool’s occupancy state at the end of the previous year; rather, the only requirement is that the pool has suitable habitat.

We fit our seven models to the data using Program PRESENCE with models ranked using AIC (Table 3). The top model, inline image, accounted for 99% of the model weight. The model suggested that habitat dynamics varied among all seasons, and that occupancy state was not static within years. Detection probabilities were high, with inline image for egg masses and inline image for larval salamanders, regardless of observer experience.

Table 3.   Model selection summary for models fit to spotted salamander data from 56 pools at Patuxent Research Refuge, 2006–2008
  1. K is the number of estimated parameters in the model, ΔAIC is the relative difference in AIC values, w is the Akaike Information Criterion (AIC) model weight and −2l is twice the negative log-likelihood. For all models, detection probability is a function of salamander life stage (egg mass or larvae) and observer, p(stg × obs). Detection parameters are excluded from model notation for simplicity. The probability that a pool contains suitable habitat in season t+1 is either an additive function of time (season) and habitat and occupancy states in the previous season inline image, or else varies independently for the different habitat-occupancy states in season t and among seasons within years, inline image (see text). Conditional on suitable habitat, local extinction and colonization probabilities were constrained to represent no change in occupancy within years inline image, inline image, colonization only between the first and second visits within years, inline image, inline image, and both colonization and extinction within years inline image, inline image. The model where occupancy probabilities between years depended only on habitat availability in March is denoted with †.

Model          K     ΔAIC       w      −2l
inline image   34    0·0        0·99   907·23
inline image   30    9·97       0·01   925·20
inline image   28    22·30      0·00   939·53
inline image   33    32·72      0·00   941·95
inline image   27    74·10      0·00   995·33
inline image   22    1143·32    0·00   2074·55
inline image   21    1192·88    0·00   2126·11
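The relationship between the columns of Table 3 can be checked with a short calculation. The sketch below assumes the standard definitions AIC = −2l + 2K and w proportional to exp(−ΔAIC/2); the variable and function names are illustrative, and the (K, −2l) pairs are copied from the table in the order listed. With these inputs, the recomputed ΔAIC values match the tabulated ones except for the third model, for which the calculation gives 20·30 rather than 22·30; the source of that difference is not clear from the table, and the ranking and weights are unaffected at two decimal places.

```python
import math

# (K, -2 log-likelihood) for the seven models, in the order listed in Table 3.
models = [(34, 907.23), (30, 925.20), (28, 939.53), (33, 941.95),
          (27, 995.33), (22, 2074.55), (21, 2126.11)]


def akaike_weights(models):
    """Return (delta AIC, Akaike weight) lists for (K, -2l) pairs."""
    aic = [neg2ll + 2 * k for k, neg2ll in models]
    best = min(aic)
    delta = [a - best for a in aic]
    # Relative likelihood of each model given the data and the model set.
    rel = [math.exp(-0.5 * d) for d in delta]
    total = sum(rel)
    return delta, [r / total for r in rel]


delta_aic, weights = akaike_weights(models)
for d, w in zip(delta_aic, weights):
    print(f"dAIC = {d:8.2f}   w = {w:.2f}")
```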

In March 2006, the probability a pool contained suitable habitat was inline image (inline image), and the probability that spotted salamanders laid eggs in these pools was inline image (inline image). Suitable habitat changed very little during the spotted salamander breeding period (March-April) or between years (Table 4). Pools that were suitable (had water) in June were always suitable the following March, and these same pools usually remained wet through April each year. Pools that were dry in June had ∼80% chance of being suitable at the beginning of the following breeding season, and dry pools in March sometimes became suitable later in the breeding period, provided rainfall events occurred (e.g. in 2006 and 2007, 25–30% of pools that were unsuitable in March became suitable in April, Table 4). There was tremendous yearly variation in the probability that a suitable pool in April retained water until June, and the probability was higher for occupied pools (Table 4). For example, in 2007, only 24% of occupied pools in April were still suitable in June, allowing for possible metamorphosis, while in 2008, nearly all occupied pools remained suitable (Table 4).

Table 4.   Estimates of habitat dynamic transition probabilities and associated standard errors (in parentheses) for seasonal pools at Patuxent Research Refuge

Parameter      t+1 = 2       t+1 = 3       t+1 = 4       t+1 = 5       t+1 = 6       t+1 = 7       t+1 = 8       t+1 = 9
inline image   0·30 (0·10)   0 (–)         0·83 (0·06)   0·25 (0·18)   0 (–)         0·81 (0·06)   0·01 (0·01)   0·03 (0·03)
inline image   0·99 (0·01)   0·16 (0·08)   1·0 (–)       0·99 (0·02)   0·02 (0·02)   1·0 (–)       0·66 (0·10)   0·86 (0·15)
inline image   1·0 (–)       0·70 (0·09)   1·0 (–)       1·0 (–)       0·24 (0·12)   1·0 (–)       0·96 (0·03)   0·99 (0·02)

  1. The probability that a pool’s habitat is suitable at time t+1, inline image, depends on habitat (H) and occupancy (X) state at time t. Pools that lack water are considered unsuitable (U). Suitable (S) pools are either unoccupied (X = 0) or occupied (X = 1) by spotted salamander eggs or larvae.

Local colonization occurred both within and between breeding periods (Table 5). Spotted salamanders appeared to delay reproduction in 2006: the probability a suitable pool contained egg masses was only inline image (inline image) in March, but unoccupied pools that became, or remained, suitable until April had colonization probabilities of inline image (inline image) and inline image (inline image), respectively. In 2007 and 2008, most salamanders had bred by the first visit, and there was little colonization between the first and second seasons and only at pools that were suitable in March (Table 5). Pools that were dry in June had approximately 40–47% chance of being occupied by breeding adults at the beginning of the following breeding season, while colonization probabilities for unproductive (unoccupied by larvae) but suitable pools, γ[S,S], varied widely between years (Table 5).

Table 5.   Local colonization probability estimates and associated standard errors (in parentheses) for seasonal pools at Patuxent Research Refuge. Colonization probabilities are reported for previously unoccupied pools that were either dry (unsuitable, inline image) or wet (suitable, inline image) at time t

Parameter      t+1 = 2       t+1 = 4       t+1 = 5       t+1 = 7       t+1 = 8
inline image   0·62 (0·17)   0·47 (0·09)   0 (–)         0·40 (0·07)   na*
inline image   0·47 (0·11)   0·91 (0·08)   0·05 (0·05)   0 (–)         0·19 (0·10)

  1. *Estimates were not possible because very few unsuitable pools became suitable.

While habitat change did account for much of the reproductive failure observed at sampled pools (between 5 and 75% of occupied pools became unsuitable by June, Table 4), salamander offspring did go locally extinct at pools that remained suitable throughout the year. Extinction probabilities were low for egg masses in two of the three years (Fig. 3), but extinction probability estimates at pools that remained suitable between April and June visits varied considerably among years (inline image range: 0·00–0·61, Fig. 3). Pools that did produce metamorphs had high probabilities of supporting salamander reproduction the following spring: inline image and 1·0, respectively, for pools with metamorphs in 2006 and 2007.

Figure 3. Spotted salamander local extinction probability estimates for suitable habitats in Patuxent Research Refuge. Error bars indicate approximately 95% confidence intervals; confidence intervals are not given for parameters that are near 0.


In the above model development, changes in both habitat state and species occurrence have been assumed to be stochastic. In some situations, it may be reasonable to consider some transitions to be deterministic, which can be easily accomplished within this framework by setting the probability of the deterministic transition to 1·0 and the other transitions associated with this state to 0·0. Such a constraint may be applied to all sampling units or only to a selection of sampling units as required (e.g. the probability of reverting from ‘mature forest’ to ‘clear cut’ could be set equal to 1·0 if the timber at a sampling unit is harvested between seasons). Similar reasoning could be used for occupancy state, as discussed above in the context of suitable and unsuitable habitats. If a particular habitat type is truly unsuitable for the target species, then a sample unit in that habitat will have zero probability of being occupied by the species.
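As an illustration of the forest harvest example, the sketch below fixes one row of a habitat transition matrix for selected sampling units only. Everything in it is hypothetical: the intermediate ‘young forest’ state, the unit labels, the probabilities and the helper name `apply_deterministic` are introduced purely for illustration and do not come from the analyses reported here.

```python
def apply_deterministic(matrix, states, from_state, to_state):
    """Return a copy of `matrix` in which `from_state` moves to `to_state`
    with probability 1.0 and to every other state with probability 0.0."""
    i, j = states.index(from_state), states.index(to_state)
    constrained = [row[:] for row in matrix]
    constrained[i] = [1.0 if col == j else 0.0 for col in range(len(states))]
    return constrained


states = ["clear cut", "young forest", "mature forest"]

# Hypothetical stochastic habitat transition probabilities (each row sums to 1).
stochastic = [
    [0.70, 0.30, 0.00],
    [0.05, 0.75, 0.20],
    [0.02, 0.00, 0.98],
]

units = ["unit_1", "unit_2", "unit_3"]
harvested = {"unit_3"}  # hypothetical units logged between seasons

# Harvested units get the deterministic row; all others keep the stochastic one.
per_unit = {
    u: apply_deterministic(stochastic, states, "mature forest", "clear cut")
    if u in harvested else stochastic
    for u in units
}
print(per_unit["unit_3"][2])
```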

We have expressed both habitat and species occurrence transitions as first-order Markov processes in which the probability of being in a particular state at time t+1 depends solely on state at time t. In some cases, vegetation succession following disturbance occurs in such a way that state transition probabilities are best described as depending on the number of years that the vegetation has been in the current state or the number of years since disturbance (e.g. Beckwith 1954; Odum 1960). To deal with such situations, it would be possible to develop models explicitly for higher-order Markov processes, although such models are likely to be relatively data hungry. Another possibility is to use a first-order process (our model above), but redefine habitat state to include both vegetation class and time (e.g. years) the patch has spent in this class. Such models might be especially useful for species such as the Florida scrub jay, Aphelocoma coerulescens, that show the best demographic performance in transient habitats that are characterized by shrub height categories that vary as a function of time since last burning (Breininger & Carter 2003).
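The state-redefinition idea can be sketched as follows. This is a toy construction under stated assumptions rather than a model fitted to any data: the shrub-height class names, the pooling of ages above `max_age`, and the rule that the probability of advancing to the next class increases with time in class are all hypothetical. The point is only that a process depending on time in state becomes first-order Markov once ‘time in class’ is part of the state.

```python
from itertools import product

classes = ["short", "optimal", "tall"]  # hypothetical shrub-height classes
max_age = 3                             # years in class; older ages are pooled


def p_advance(cls, age):
    """Hypothetical probability of moving to the next class; grows with age."""
    if cls == "tall":  # terminal class in this toy example
        return 0.0
    return min(1.0, 0.2 * age)


# Expanded state = (vegetation class, years spent in that class).
states = list(product(classes, range(1, max_age + 1)))
index = {s: i for i, s in enumerate(states)}


def expanded_matrix():
    """First-order transition matrix over the expanded (class, age) states."""
    n = len(states)
    mat = [[0.0] * n for _ in range(n)]
    for (cls, age), i in index.items():
        p = p_advance(cls, age)
        if p > 0.0:
            nxt = classes[classes.index(cls) + 1]
            mat[i][index[(nxt, 1)]] += p  # enter the next class at age 1
        # Otherwise remain in the class and grow one year older (capped).
        mat[i][index[(cls, min(age + 1, max_age))]] += 1.0 - p
    return mat


matrix = expanded_matrix()
print(len(states), "expanded states")
```

A disturbance such as burning could be represented in the same way, by adding a transition from every expanded state back to the first class at age 1.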

Experiences (our own and those of others) with similar types of analytic methods indicate that it would not be uncommon to find multiple local maxima on the likelihood surface when using maximum likelihood techniques. That is, the computer algorithms converge to a local maximum on the likelihood surface, but not to the global maximum. Practitioners should be aware of this point and investigate the stability of their estimates by rerunning analyses using alternative starting values for the algorithms, or consider using more general optimization routines such as simulated annealing, to verify that the global maximum has been determined. In a Bayesian framework, similar behaviour may translate to multi-modality of posterior distributions and may cause a lack of convergence when using Markov chain Monte Carlo procedures, as the chains may periodically ‘jump’ between different ranges of values that appear reasonable solutions.
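The advice to rerun analyses from alternative starting values can be illustrated with a toy problem. The sketch below does not use an occupancy likelihood: `neg_log_lik` is a hypothetical one-parameter function with two minima that stands in for a negative log-likelihood, and `local_minimize` is a plain gradient-descent routine standing in for whatever local optimizer a given software package uses.

```python
def neg_log_lik(x):
    """Toy objective with two minima; stands in for a negative log-likelihood."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def local_minimize(f, x0, step=0.01, iters=5000, h=1e-6):
    """Plain gradient descent using a central-difference derivative."""
    x = x0
    for _ in range(iters):
        grad = (f(x + h) - f(x - h)) / (2.0 * h)
        x -= step * grad
    return x, f(x)


# Several starting values; different starts can end at different optima.
starts = [-2.0, -0.5, 0.5, 2.0]
fits = [local_minimize(neg_log_lik, x0) for x0 in starts]
for x0, (x_hat, value) in zip(starts, fits):
    print(f"start {x0:5.1f} -> x = {x_hat:7.4f}, objective = {value:7.4f}")

# Keep the fit with the smallest objective as the candidate global optimum.
best_x, best_value = min(fits, key=lambda fit: fit[1])
```

In this toy case, the starts on the positive side settle at a local minimum with a clearly larger objective value than the starts on the negative side, which is the pattern that should prompt further checking in a real analysis.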

MacKenzie et al. (2010) noted the similarities between the multi-state, multi-year occupancy framework used here and analytic methods developed for multi-state capture–recapture applications with state uncertainty (e.g. Kendall, Hines, & Nichols 2003; Nichols et al. 2004; Pradel 2005; Conn & Cooch 2009). Actually, all occupancy models and all capture–recapture models for open populations can be viewed as hidden Markov models with state uncertainty. Recognition and exploration of this fact may aid practitioners to understand these new methods and allow tools developed for one area to be applied to the other with little or no modification. These kinds of similarities among classes of models also extend to hidden Markov models that have been successfully applied in other disciplines.
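The hidden Markov view can be made concrete with the forward algorithm applied to a single site's history in the three-state (unsuitable; suitable and unoccupied; suitable and occupied) setting. This is a simplified sketch under stated assumptions, not the likelihood implemented in any particular software: habitat is taken to be observed without error, detection probability `p` and the transition matrix are held constant over time, and all numerical values and function names are hypothetical.

```python
def obs_vector(habitat, detections, p):
    """P(data observed in one season | true state), states ordered (U, S0, S1)."""
    if habitat == "U":
        # A dry pool can only be in the unsuitable state.
        return [1.0, 0.0, 0.0]
    p_occupied = 1.0
    for d in detections:
        p_occupied *= p if d else (1.0 - p)
    # An unoccupied pool can only yield an all-zero detection history.
    p_unoccupied = 0.0 if any(detections) else 1.0
    return [0.0, p_unoccupied, p_occupied]


def site_likelihood(initial, transition, history, p):
    """Forward algorithm: probability of one site's full observation history."""
    alpha = [pi * o for pi, o in zip(initial, obs_vector(*history[0], p))]
    for habitat, detections in history[1:]:
        obs = obs_vector(habitat, detections, p)
        alpha = [
            obs[j] * sum(alpha[i] * transition[i][j] for i in range(3))
            for j in range(3)
        ]
    return sum(alpha)


# Hypothetical parameter values, for illustration only.
initial = [0.2, 0.4, 0.4]
transition = [[0.7, 0.15, 0.15],
              [0.1, 0.60, 0.30],
              [0.1, 0.10, 0.80]]
# Each season: (observed habitat, detections from the two surveys).
history = [("S", [0, 1]), ("S", [0, 0]), ("U", [])]
print(site_likelihood(initial, transition, history, p=0.8))
```

Multiplying such site-level probabilities across sites gives the full likelihood, which is structurally the same calculation used for multi-state capture–recapture models with state uncertainty.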

The model and parameterization developed above are very general in nature and can be used to address many ecologically relevant hypotheses. For example, the ability to allow habitat transition probabilities to differ depending on whether the species is present or absent from the unit in season t allows one to consider how a species may impact a landscape. Herbivore grazing, for instance, can promote plant growth and nutrient cycling in ways that increase habitat quality (McNaughton 1983; McNaughton, Banyikwa, & McNaughton 1997), whereas overgrazing and seed dispersal of noxious plants by animals may promote habitat deterioration. The nature of the processes underlying the dynamics of either habitat or species occurrence can also be investigated. By applying the appropriate constraints, one can consider models where the system dynamics are static, random or Markovian in nature (MacKenzie et al. 2006).

It is important that habitat dynamics be considered during the design phase of the study or monitoring programme. The most obvious design issue is to collect data on both focal species occupancy and habitat state. Selection of sample units relative to habitat state should also be considered. For example, when a region consists of suitable and unsuitable habitats for a species, selecting sampling units only from areas that are currently considered to be suitable will not necessarily result in a design that provides reliable information over time, as areas that are currently unsuitable may become suitable, and possibly occupied, in the future. Such changes are important for accurately describing system state and dynamics; hence, the sample units should be selected in a way that permits inference about such changes (e.g. by allocating sample effort to areas of both suitable and unsuitable habitat states). Another important design consideration is the definition of a sampling unit. Often, habitat patches may appear to be discrete, leading to natural definitions of sample units (e.g. pools within a wetland complex); however, because of system dynamics, these patches may merge or separate over time, thus creating confusion about site definition. For example, three discrete pools in one year might naturally be regarded as sampling units, whereas in a wet year, the pools may merge and be defined as a single unit. In such situations, it may be appropriate to instead define sampling units in terms of grid cells to ensure consistency through time, with an assessment of the habitat state in a cell each sampling season (e.g. wet or dry).

The integrated habitat and species occurrence dynamics model described above provides a useful framework for addressing a wide range of ecological questions of scientific and management interest. Importantly, this modelling approach accounts for imperfect detection: sampling methods often fail to detect a species that is present, producing false absences. Application of this technique should yield more reliable inferences about habitat and species occurrence dynamics.


We thank all the biologists and technicians at Canaan Valley National Wildlife Refuge, especially Ken Sturm and Marquet Crockett. We also thank Evan Grant and Sandra Mattfeldt, researchers associated with US Geological Survey Northeast Amphibian Research and Monitoring Initiative, for helpful discussion and data from Patuxent Research Refuge. This is contribution number 375 of the Amphibian Research and Monitoring Initiative (ARMI) of the US Geological Survey. We thank Evan Cooch and Mark Lindberg for constructive comments on the initial manuscript.