A hierarchical Bayesian approach to multi-state mark–recapture: simulations and applications


  • Anna M. Calvert,

    Corresponding author
    1. Department of Biology, Dalhousie University, Halifax NS, Canada B3H 4J1;
    2. Atlantic Cooperative Wildlife Ecology Research Network & Biology Department, Acadia University, Wolfville NS, Canada B4P 2R6;
  • Simon J. Bonner,

    1. Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby BC, Canada V5A 1S6;
    • These three authors made equal contributions to this work.

  • Ian D. Jonsen,

    1. Bedford Institute of Oceanography, Fisheries and Oceans Canada, Dartmouth NS, Canada B2Y 4A2; and
    • These three authors made equal contributions to this work.

  • Joanna Mills Flemming,

    1. Department of Mathematics and Statistics, Dalhousie University, Halifax NS, Canada B3H 3J5
    • These three authors made equal contributions to this work.

  • Sandra J. Walde,

    1. Department of Biology, Dalhousie University, Halifax NS, Canada B3H 4J1;
  • Philip D. Taylor

    1. Atlantic Cooperative Wildlife Ecology Research Network & Biology Department, Acadia University, Wolfville NS, Canada B4P 2R6;

*Correspondence: E-mail: anna.calvert@dal.ca


  • 1. Mark–recapture models are valuable for assessing diverse demographic and behavioural parameters, yet the precision of traditional estimates is often constrained by sparse empirical data. Bayesian inference explicitly recognizes estimation uncertainty, and hierarchical Bayes has proven particularly useful for dealing with sparseness by combining information across data sets.
  • 2. We developed a general hierarchical Bayesian multi-state mark–recapture model, tested its performance on simulated data sets and applied it to real ecological data on stopovers by migratory birds.
  • 3. Our hierarchical model performed well in terms of both precision and accuracy of parameters when tested with simulated data of varying quality (sample size, capture and survivorship probabilities). It also provided more precise and accurate parameter estimates than a non-hierarchical model when data were sparse.
  • 4. A specific version of the model, designed for estimation of daily transience and departure of migratory birds at a mid-route stopover, was applied to 11 years of autumn migration data from Atlantic Canada. Hierarchical estimates of departure and transience were more precise than those derived from parallel non-hierarchical and frequentist methods, and indicated that inter-annual variability in parameters suggested by these other methods was largely due to sampling error.
  • 5. Synthesis and applications. Estimates of demographic parameters, often derived from mark–recapture studies, provide the basis for evaluating the status of species at risk, for developing conservation and management strategies and for evaluating the results of current protocols. The hierarchical Bayesian multi-state mark–recapture model presented here permits partitioning of complex parameter variation across space or time, and the simultaneous analysis of multiple data sets results in a marked increase in the precision of estimates derived from sparse capture data. Its structural flexibility should make it a valuable tool for conservation ecologists and wildlife managers.


Demographic trait estimation has been greatly facilitated by the monitoring of individually marked animals over time, via live recaptures/observations and dead recoveries (hereafter all termed ‘recapture’), providing quantitative information for wildlife conservation and management. Mathematical mark–recapture models based upon these encounters were originally devised for the study of abundance and survivorship (Cormack 1964; Jolly 1965; Seber 1965) and subsequently modified to address a broader variety of ecological questions (e.g. Pollock et al. 1990; Lebreton et al. 1992). The multi-state model structure (Arnason 1972, 1973; Schwarz et al. 1993), in particular, permits the estimation of additional parameters including recruitment (Pradel 1996), breeding probability (Schwarz & Arnason 2000), harvest mortality (Calvert & Gauthier 2005), and population growth rate (Caswell & Fujiwara 2004). The leading software package for implementation of these models is the program MARK (White & Burnham 1999), and related methods continue to be developed (e.g. Williams, Nichols & Conroy 2002; Bonner & Schwarz 2006).

Traditional frequentist methods of estimating parameters from mark–recapture data are sensitive to low survival and capture probabilities, small sample sizes, and the number of sampling intervals, such that sparse data greatly limit the precision of parameter estimates (Pollock et al. 1990; O’Brien, Robert & Tiandry 2005; Morris et al. 2006). However, greater analytical power is available through a Bayesian approach (Harwood & Stokes 2003; Gelman et al. 2004), where (i) parameters are considered random variables rather than fixed unknown values, (ii) prior knowledge about parameter distributions can be directly incorporated into estimation, and (iii) data are used to estimate the probability of a given hypothesis. Bayesian methods permit greater precision of parameter estimates due to incorporation of prior information (McCarthy & Masters 2005), explicit recognition of uncertainty (Harwood & Stokes 2003), and enhanced evaluation of complex variation from sparse data (Clark et al. 2005; Clark & Gelfand 2006). These advantages, in combination with computational advances, have driven a recent expansion of Bayesian methods in ecology (Ellison 2004; Clark 2005), including applications to mark–recapture modelling (e.g. Poole 2002; Gimenez et al. 2007; Dupuis & Schwarz 2007).

Nevertheless, practical constraints in traditional mark–recapture modelling frameworks may limit the analysis of complex data structures. For instance, the ‘robust design’ family of models (Pollock 1982; Kendall, Pollock & Brownie 1995) stratify time intervals into primary and secondary sampling periods (e.g. year and day, respectively), but require that the population be closed to immigration and emigration within primary sampling periods. Recent extensions allow estimation of short-term survivorship (e.g. across days within a year; Schwarz & Stobo 1997), but the multi-state robust design model does not permit state-transitions within primary sampling periods. Modelling of complex multi-state mark–recapture data could therefore benefit from additional structural flexibility, such as that available through a hierarchical Bayesian framework (Link et al. 2002; Clark & Gelfand 2006; Jonsen, Myers & James 2006).

Hierarchical Bayes accommodates stochasticity at multiple levels in a manner similar to frequentist random-effects models (Clark 2005; Zheng et al. 2007), and has been applied to studies of animal movement (Jonsen, Myers & Flemming 2003), species richness (Kéry & Royle 2008), and metapopulation dynamics (Royle & Kéry 2007). It is especially advantageous over non-hierarchical models when parameter variation requires partitioning across several related spatial or temporal replicates (Clark et al. 2005; Link & Barker 2005; Royle & Dorazio 2008).

Our need for such a flexible multi-scale approach arose during a study based upon daily monitoring of songbirds at an autumn migration stopover site over consecutive years. In a previous study (Calvert, Taylor & Walde 2009), we used a multi-state mark–recapture model formulation derived from Schaub et al. (2004) to estimate annual values of two stopover parameters: departure (analogous to mortality), and transience (analogous to movement). However, small within-year sample sizes limited the precision of the estimates. We were interested in using a hierarchical Bayesian analysis to reduce the impact of such sparseness by ‘borrowing’ information across different years, thus providing a balance between an approach that ignores variance among years and one where years are analysed independently. Additionally, we sought a model that could evaluate stopover decisions across two temporal scales (daily and annual variation), with the potential to further partition variance across other scales (e.g. among stopover sites, between taxonomic groups). Beyond this specific application, we perceived a demand for an accessible multi-state mark–recapture framework allowing flexible hierarchical structuring and able to deal with sparse ecological data.

The objectives of this study were therefore to: (i) build a hierarchical Bayesian multi-state mark–recapture model framework flexible enough for application to any system with a nested sampling scheme (e.g. daily monitoring within a season, conducted across years) or other hierarchical structuring; (ii) assess the accuracy and precision of estimates obtained with this model at varying levels of data quality and sample size; (iii) verify that the hierarchical structuring in this model permits improved estimation from poor-quality data relative to non-hierarchical methods; and (iv) apply this model to the assessment of stopover decisions, using 11 years of daily migration monitoring data.

Model structure and notation

multi-state mark–recapture models

Multi-state mark–recapture analysis was first suggested by Arnason (1972, 1973), and extended by Hestbeck, Nichols & Malecki (1991), Brownie et al. (1993), and Schwarz, Schweigert & Arnason (1993). Based upon the Cormack–Jolly–Seber model that estimates survivorship (ϕ) while accounting for probability of encounter (p), the multi-state formulation additionally permits individuals to move stochastically among ‘states’ with probability ψk,j (movement from state k to j), where each state may be characterized by different ϕ or p probabilities. If we also allow for temporal variation in parameters, animals alive at time t in state k will be captured with probability pk,t, survive to time t + 1 with probability ϕk,t and will shift to state j by time t + 1 with the probability ψk,j,t. The movement parameter ψ was originally devised to distinguish mortality from emigration out of the study area (i.e. where states are geographical locations), but it can equally be used to represent other state transitions (Lebreton & Pradel 2002; White et al. 2006). Similarly, the definition of the ϕ parameter is not restricted to true demographic survivorship, and hence, the combination of non-standard ϕ and ψ interpretations allows for the modelling of diverse ecological scenarios beyond estimation of survival and emigration (e.g. Schaub, Liechti & Jenni 2004). As with single-state models, multi-state mark–recapture models rely upon some critical assumptions, such as: (i) marks are not lost, and all marks and states are correctly identified; (ii) the marking of individuals does not affect their probabilities of capture, survival or movement; (iii) every individual in state k present in the population at time t is subject to the same capture probability pk,t, the same survival probability ϕk,t, and the same vector of movement probabilities ψk,j,t; and (iv) the fate of each individual is independent of the fates of others. Pollock et al. (1990) and Williams et al. (2002) further discuss model assumptions and likelihoods, and Clark et al. (2005) consider how some assumptions may be relaxed under hierarchical Bayes; our model did not relax these assumptions.

hierarchical bayesian approach

Multi-state mark–recapture estimation using Bayesian methods has been described by Dupuis (1995), King & Brooks (2002), Clark et al. (2005), and Gimenez et al. (2007). Our approach builds on their work. We describe below the components of our hierarchical multi-state mark–recapture model as implemented within the software packages WinBUGS 1.4 (Lunn et al. 2000) and OpenBUGS (Thomas et al. 2006), hereafter collectively referred to as ‘BUGS’ (i.e. the same model notation is used in both platforms).


  • Y = total number of ‘years’ y sampled

  • Ty = total number of ‘days’ t sampled in year y

  • K = total number of possible states k (or j)

  • Ny = total number of individuals i marked in year y

  • cy,i = first capture (i.e. marking) date of individual i in year y

We define ‘years (y)’ and ‘days (t)’ as the primary and secondary sampling periods, respectively, where Y and Ty represent the total number of sampling intervals at each level, but these could be generalized to any other data structure as long as t is nested within y. The number of individuals marked annually can vary across years (Ny), and the first encounter date of each individual within each year is also specified (cy,i).

Note that unlike the robust design models (Pollock 1982; Kendall et al. 1995), our model considers each year's data to be conditionally independent (i.e. with shared hyperprior distributions). No assumption is made about the behaviour across years for individuals that may be sampled in multiple data sets.

state and observation vectors

  • xy,i,t = observed state of individual i at time t within year y

  • zy,i,t = true state of individual i at time t within year y

  • wy,i,t = observation indicator for individual i at time t within year y

The two state vectors defined in this model are x, the time-series of observed states (1: K) for each individual (i.e. the encounter histories); and z, a latent variable representing the time series of true states for each individual [irrespective of whether it was observed, but known (i.e. equal to x) when w = 1]. Both x and z consist of values ranging between 1 and K + 1; we used a non-0 value (K + 1) for the unobserved state in x and the dead state in z, for notational convenience. The observation vector w is a binary indicator of whether an individual was observed [irrespective of state, i.e. w = 1 if x ∈ (1, ... , K); w = 0 otherwise].
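The coding of x, z and w can be made concrete with a short sketch. The Python below is illustrative only (the authors' implementation is the BUGS code in Appendix S1); the function names `observation_indicator` and `known_true_states` are hypothetical, and `None` is used here simply to flag entries of z that remain latent.

```python
from typing import List, Optional


def observation_indicator(x: List[int], K: int) -> List[int]:
    """w[t] = 1 if the individual was observed in one of the K live states, else 0."""
    return [1 if 1 <= state <= K else 0 for state in x]


def known_true_states(x: List[int], K: int) -> List[Optional[int]]:
    """z[t] equals x[t] whenever the individual was observed; otherwise it is latent."""
    return [state if 1 <= state <= K else None for state in x]


K = 2
x = [1, 3, 2, 3, 3]  # encounter history; K + 1 = 3 codes 'not observed' in x
w = observation_indicator(x, K)
z_known = known_true_states(x, K)
```

In this toy history the individual is seen in state 1 on the first day and in state 2 on the third day; its true state on the remaining days must be inferred by the model.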


  • ϕy,k = probability, within year y, that an individual that is alive and in state k survives a 1-day interval, independent of whether it changes state in that interval (i.e. assumed equal across all 1-day intervals)

  • py,k = probability, within year y, that an individual that is alive and in state k will be observed on a given day

  • ψy,k,j = probability, within year y, that an individual that is alive and in state k will move to state j over a 1-day interval (i.e. assumed equal across all one-day intervals)

Parameter definitions apply when k ∈ (1, ... , K); see below for transitions to and from K + 1. Extensions of the model could accommodate parameter variation among secondary sampling periods (t), but are not shown.



The general form of the likelihood function for this hierarchical multi-state mark–recapture model is:

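$$P(x \mid \phi, p, \psi) = \prod_{y=1}^{Y} \prod_{i=1}^{N_y} \prod_{t=c_{y,i}+1}^{T_y} P(z_{y,i,t} = j \mid z_{y,i,t-1} = k)\; P(w_{y,i,t} \mid z_{y,i,t} = j) \qquad \text{(eqn 1)}$$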

where k = zy,i,t−1 and j = zy,i,t.

Note that (i) the model conditions on time of first capture (i.e. starting at t = cy,i + 1), and it is assumed that the true state is known at the time of first capture (i.e. zy,i,c = xy,i,c); (ii) the probability of encounter history data x is equal to the joint probability of observation vector w and state vector z; and (iii) the likelihood separates process from observations, in parallel to state-space models (e.g. Jonsen et al. 2003, Gimenez et al. 2007).
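As a hedged illustration of how the likelihood factorizes, the sketch below evaluates the log-likelihood contribution of one individual in one year, conditional on a given latent state sequence z (which an MCMC sampler would impute rather than observe). It uses the process and observation components defined in the following subsections. The function name `complete_data_loglik` and the example values are assumptions made for illustration, not part of the authors' BUGS code.

```python
import math
from typing import List


def complete_data_loglik(z: List[int], w: List[int], c: int,
                         phi: List[float], p: List[float],
                         psi: List[List[float]]) -> float:
    """Sum of log P(z_t | z_{t-1}) + log P(w_t | z_t) from day c + 1 onward.

    z: true states per day (1..K alive, K + 1 dead); position 0 is day 1
    w: 0/1 observation indicators per day
    c: 1-based day of first capture (the likelihood conditions on it)
    """
    K = len(phi)
    dead = K + 1
    total = 0.0
    for t in range(c, len(z)):  # 0-based index c is day c + 1
        k, j = z[t - 1], z[t]
        # Process component: survive-and-move, die, or stay dead
        if k == dead:
            pr_z = 1.0 if j == dead else 0.0
        elif j == dead:
            pr_z = 1.0 - phi[k - 1]
        else:
            pr_z = phi[k - 1] * psi[k - 1][j - 1]
        # Observation component: dead individuals are never observed
        p_j = 0.0 if j == dead else p[j - 1]
        pr_w = p_j if w[t] == 1 else 1.0 - p_j
        if pr_z == 0.0 or pr_w == 0.0:
            return -math.inf
        total += math.log(pr_z) + math.log(pr_w)
    return total


# Illustrative values only (K = 2)
phi = [0.70, 0.90]
p = [0.80, 0.80]
psi = [[0.85, 0.15], [0.35, 0.65]]
ll = complete_data_loglik([1, 1, 2, 3], [1, 0, 1, 0], 1, phi, p, psi)
```

Because the process and observation terms enter as separate factors, either component can be changed (for example, adding a time index to p) without touching the other.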

Process model

The first likelihood component represents the process, i.e. the true state z in one time period conditional upon the true state in the previous time period (zy,i,t|zy,i,t−1). This is defined by a categorical distribution with probabilities given by:

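$$P(z_{y,i,t} = j \mid z_{y,i,t-1} = k) = \begin{cases} \phi_{y,k}\,\psi_{y,k,j} & k, j \in \{1, \ldots, K\} \\ 1 - \phi_{y,k} & k \in \{1, \ldots, K\},\ j = K+1 \\ 0 & k = K+1,\ j \in \{1, \ldots, K\} \\ 1 & k = K+1,\ j = K+1 \end{cases} \qquad \text{(eqn 2)}$$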

Thus, the true state z of an individual i in year y at time t depends on its previous state, as well as the combined probability that it survived in that state (ϕy,k, where k = zy,i,t−1) and moved from that state to the current state (ψy,k,j, where j = zy,i,t). The second category in equation 2 reflects individuals that do not survive from t − 1 to t, while the final two categories simply state that dead animals (i.e. those no longer in the study population) must remain dead.
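The four categories of the process model can be collected into a (K + 1) × (K + 1) transition matrix for a single year. The following Python sketch is one way to write this, assuming state-specific ϕ and ψ that are constant across days; the function name `transition_matrix` is hypothetical, and the numerical values are borrowed from the ‘good’ simulation settings described later purely for illustration.

```python
from typing import List


def transition_matrix(phi: List[float], psi: List[List[float]]) -> List[List[float]]:
    """Rows index the previous state k, columns the current state j.

    The last row and column correspond to the absorbing 'dead' state K + 1.
    """
    K = len(phi)
    rows = []
    for k in range(K):
        # Alive in k -> alive in j: survive the interval, then move k -> j
        row = [phi[k] * psi[k][j] for j in range(K)]
        # Alive in k -> dead
        row.append(1.0 - phi[k])
        rows.append(row)
    # Dead individuals cannot return to a live state
    rows.append([0.0] * K + [1.0])
    return rows


phi = [0.70, 0.90]                    # daily survivorship in states 1 and 2
psi = [[0.85, 0.15], [0.35, 0.65]]    # each row of movement probabilities sums to 1
P = transition_matrix(phi, psi)
```

Each row of `P` sums to 1 provided each row of `psi` does, which is a convenient check when fixing or constraining parameters.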

Observation model

The second likelihood component represents the observation model, that is the probability of observation indicator time series wy,i,t conditional upon the latent process time series zy,i,t (i.e. wy,i,t|zy,i,t). This probability was defined by a Bernoulli distribution, given by:

$$w_{y,i,t} \mid (z_{y,i,t} = k) \sim \text{Bernoulli}(p_{y,k}), \quad k = 1, \ldots, K; \quad \text{note that } p_{y,K+1} = 0 \qquad \text{(eqn 3)}$$
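To show how the process and observation components combine to generate data, the sketch below simulates one individual's true states z and encounter history x within a year. It is a minimal illustration under the assumptions of constant daily ϕ, p and ψ; the function name `simulate_history` and the parameter values are hypothetical and are not taken from the authors' simulation code.

```python
import random
from typing import List, Tuple


def simulate_history(first_state: int, T: int, phi: List[float], p: List[float],
                     psi: List[List[float]], rng: random.Random
                     ) -> Tuple[List[int], List[int]]:
    """Return (z, x) for T days, starting from a known state at first capture."""
    K = len(phi)
    dead = K + 1
    z = [first_state]
    x = [first_state]  # the true state is assumed known at first capture
    for _ in range(1, T):
        k = z[-1]
        if k == dead or rng.random() >= phi[k - 1]:
            z.append(dead)  # dead stays dead; otherwise fails to survive
        else:
            # Categorical draw of the destination state from row k of psi
            u, cumulative, dest = rng.random(), 0.0, K
            for cand in range(1, K + 1):
                cumulative += psi[k - 1][cand - 1]
                if u < cumulative:
                    dest = cand
                    break
            z.append(dest)
        j = z[-1]
        seen = j != dead and rng.random() < p[j - 1]  # Bernoulli observation
        x.append(j if seen else dead)  # K + 1 also codes 'not observed' in x
    return z, x


rng = random.Random(2024)
z, x = simulate_history(first_state=1, T=10, phi=[0.70, 0.90], p=[0.80, 0.80],
                        psi=[[0.85, 0.15], [0.35, 0.65]], rng=rng)
```

Note that an unobserved day in x does not distinguish between a live but undetected individual and a dead one; resolving that ambiguity is exactly the role of the latent vector z.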


Bayesian inference requires the definition of prior distributions for all parameters to be estimated (in this case ϕ, p and ψ). When possible, prior distributions should describe the researchers’ previous knowledge about parameter values, but uninformative (or weakly informative) prior distributions may be used if little is initially known about the parameters, or to reduce the influence of priors on estimation. Daily sampling within years naturally leads to a hierarchical model with two levels, where prior distributions are specified in the form of ‘hyperpriors’ (Carlin & Louis 1996). At the first level, we assume that the parameters in each primary period (e.g. daily capture probabilities in each year) form a random sample from a distribution with unknown mean and variance (the hyperparameters). At the second level, we model these means and variances as random samples from a further distribution with unknown mean and variance (the hyperpriors); these values reflect the prior parameter knowledge (if any).

Prior distributions for all parameters were defined on the logit scale so that the parameters would be naturally bounded between 0 and 1. Across all years, parameters of the same type were assigned normal priors on the logit scale with equal, but unknown, mean and variance; for example, the capture probability of individuals in state k in year y was assigned the prior distribution: logit(py,k,t)~N(µpk, 1/τpk), with mean and variance state-dependent but equal in all years (precision τ, and not variance 1/τ, is specified in BUGS notation). These parameters (the hyperparameters: µpk, µϕk, µψk, τpk, τϕk and τψk) were then assigned further distributions (hyperpriors) dependent on state alone. We used the conjugate distributions µk~N(0,1000) and τk~Gamma(0·001,0·001). As all probabilities of movement must sum to 1 for each starting state, priors were specified for ψ1,1 and ψ2,2, with ψ1,2 = 1 − ψ1,1 and ψ2,1 = 1 − ψ2,2; in cases with more than two states, ψ would represent a multinomial (instead of binomial) process and priors would have to be accordingly defined.
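The first level of this hierarchy can be sketched numerically: given a hyperparameter mean µ and precision τ, annual parameters are drawn on the logit scale and back-transformed to probabilities. The Python below is an illustration, not the authors' BUGS code; `draw_annual_probabilities` is a hypothetical helper, and the values of µ and τ are chosen only for demonstration.

```python
import math
import random
from typing import List


def logit(q: float) -> float:
    return math.log(q / (1.0 - q))


def inv_logit(v: float) -> float:
    # Numerically stable inverse logit
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def draw_annual_probabilities(mu: float, tau: float, n_years: int,
                              rng: random.Random) -> List[float]:
    """Draw logit(theta_y) ~ N(mu, 1/tau) for each year and back-transform."""
    sd = 1.0 / math.sqrt(tau)  # BUGS specifies precision tau = 1 / variance
    return [inv_logit(rng.gauss(mu, sd)) for _ in range(n_years)]


rng = random.Random(42)
# Illustrative hyperparameters: mean capture probability 0.8, logit-scale variance 0.01
annual_p = draw_annual_probabilities(mu=logit(0.80), tau=100.0, n_years=10, rng=rng)
```

Working on the logit scale guarantees that every back-transformed value lies strictly between 0 and 1, whatever the values of µ and τ.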

alternative and non-hierarchical model structures

The BUGS code for the hierarchical multi-state mark–recapture model described above, where parameters vary with k and y but not with t (giving the equivalent of an ‘annual average’ estimate for each daily parameter) is outlined in Supporting Information, Appendix S1. In order to test specific biological hypotheses, alternative versions of this general model can be constructed to reflect differences in the structure of parameter variation. For instance, the model could be specified to reflect daily changes in parameters by simply adding another index to the parameters (e.g. ϕy,k,t). Similarly, reducing the sources of variation can be accomplished by removing an index from a particular parameter (e.g. no state or daily variation on survivorship: ϕy), or fixing parameters at a known value (see case study below). Effects of external covariates can also be modelled: an effect of daily wind speed st upon daily variation in survivorship, for example, could be defined as ϕt = B0 + B1*st (where B0 is the intercept and B1 is the slope, each defined by a probability distribution). Finally, a non-hierarchical equivalent of the model can be described for analysis of a single year's data, using prior distribution parameters µ = 0 and τ = 0·667 instead of the hyperparameters defined in Supporting Information, Appendix S1, as follows:

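$$\text{logit}(\phi_{k}) \sim N(0,\ 1/0.667), \quad \text{logit}(p_{k}) \sim N(0,\ 1/0.667), \quad \text{logit}(\psi_{k,k}) \sim N(0,\ 1/0.667) \qquad \text{(eqn 4)}$$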
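The prior implied by µ = 0 and τ = 0·667 can be checked numerically. The sketch below converts the precision to a logit-scale standard deviation and back-transforms a central 95% interval to the probability scale; the function name `prior_interval` is hypothetical, and the 1·96 multiplier is the usual normal approximation for a 95% interval.

```python
import math
from typing import Tuple


def inv_logit(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def prior_interval(mu: float, tau: float, z_value: float = 1.96) -> Tuple[float, float]:
    """Central interval of a logit-normal prior, on the probability scale."""
    sd = 1.0 / math.sqrt(tau)  # precision -> standard deviation
    return inv_logit(mu - z_value * sd), inv_logit(mu + z_value * sd)


lower, upper = prior_interval(mu=0.0, tau=0.667)
```

With these settings the logit-scale variance is about 1·5 and the central 95% prior interval runs from roughly 0·08 to 0·92, i.e. the prior is centred on 0·5 and spreads over most of the unit interval.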

Assessment of model performance

simulated data

A main objective of this hierarchical model was to reduce the problems of parameter bias and imprecision posed by limited data (e.g. O’Brien et al. 2005), and specifically by small sample size, low capture probability, or low survivorship. We therefore designed a three-step simulation study to assess model performance with varying data quality, based upon a generalized model where K = 2, Y = 10, T = 10, and all cy,i = 1, with model structure [ϕ(y,k) p(y) ψ(y,k,j)]. All simulations were built around a hypothetical situation where state k = 1 represented a ‘sink’ state (i.e. lower survivorship and more immigration than emigration) and k = 2, a ‘source’ state (ϕ1 < ϕ2 and ψ2,1 > ψ1,2). True parameter values (see Supporting Information, Appendix S2) for each of the 10 years were independently drawn from logit-normal distributions with known means (described below) and variance (between 0·005 and 0·01), so that true and estimated values had the same hierarchical structuring.

First, to verify that model performance was satisfactory when capture probability, survivorship and sample size were high, we created 25 sets of simulated hierarchical data based on ‘good’ parameter values. Each data set was characterized by: Ny = 200 per state at t = 1; mean annual survivorship probabilities ϕy,1 = 0·70 and ϕy,2 = 0·90; mean annual capture probabilities py = 0·80; and mean annual movement probabilities ψy,1,2 = 0·15 and ψy,2,1 = 0·35.

Secondly, to test model performance with ‘poor’ data quality, we created 25 sets of simulated hierarchical data for each of 12 different scenarios representing combinations of low (mean py = 0·20) or very low (py = 0·05) capture probability, high (mean ϕy,1 = 0·70, ϕy,2 = 0·90), moderate (ϕy,1 = 0·40, ϕy,2 = 0·60) or low (ϕy,1 = 0·10, ϕy,2 = 0·30) survivorship, and small (Ny = 50 individuals marked in each state at t = 1) or large (Ny = 200 per state at t = 1) sample size. The 12 scenarios resulted from all combinations of each of these three factors (3ϕ × 2p × 2N), where all true mean movement probabilities were ψy,1,2 = 0·15 and ψy,2,1 = 0·35.
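The fully crossed design can be enumerated directly, which is a useful check that all 12 combinations are covered. The sketch below uses the mean parameter values stated above; the dictionary keys and variable names are illustrative choices, not taken from the authors' simulation code.

```python
from itertools import product

# Mean survivorship (phi_1, phi_2), mean capture probability, and N per state
survivorship = {"high": (0.70, 0.90), "moderate": (0.40, 0.60), "low": (0.10, 0.30)}
capture = {"low": 0.20, "very low": 0.05}
sample_size = {"small": 50, "large": 200}

scenarios = [
    {
        "label": (s, c, n),
        "phi": survivorship[s],
        "p": capture[c],
        "N": sample_size[n],
        "psi_12": 0.15,  # mean movement probabilities are the same in every scenario
        "psi_21": 0.35,
    }
    for s, c, n in product(survivorship, capture, sample_size)
]
```

Each of the 12 entries would then be used to generate 25 replicate hierarchical data sets.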

Thirdly, to determine whether the hierarchical model structure would improve estimation in years of sparse data, we created 25 sets of simulated hierarchical data for a scenario with ‘mixed’ data quality (as measured by capture probability) among years. Capture probability was taken from the low (mean py = 0·20) distribution in seven years (y = 1–4, 8–10), and from the very low (mean py = 0·05) distribution in three years (y = 5–7); survivorship was moderate (means ϕy,1 = 0·40, ϕy,2 = 0·60), sample sizes small (Ny = 50/state at t = 1), and mean movement probabilities were ψy,1,2 = 0·15 and ψy,2,1 = 0·35. We compared parameter estimates for these data from both the hierarchical model described above and the non-hierarchical equivalent (eqn 4).

convergence and model fit

Model convergence occurred most slowly with the poorest data quality, and therefore, we based our convergence tests on our weakest data sets: very low capture probability (mean py = 0·05), low survivorship (means ϕy,1 = 0·10, ϕy,2 = 0·30), and small sample size (Ny = 50 per state). We monitored the scale reduction factor (the factor by which posterior intervals are expected to shrink if simulation continued; Gelman & Rubin 1992) based on 3 chains (each with different initial values), and found that it closely approached the desired value of 1 (< 1·05 for all parameters) within the first 10 000 iterations. Consequently, for all simulated data sets we set a burn-in run of 10 000 iterations, and estimated parameters based on 50 000 subsequent iterations.
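For readers unfamiliar with the scale reduction factor, the sketch below computes a basic version of the Gelman & Rubin (1992) statistic for one parameter from several chains of equal length. It omits the degrees-of-freedom correction, and the diagnostic reported by BUGS may differ in detail, so this should be read as an illustration of the idea rather than a reproduction of the authors' convergence checks. The function name `scale_reduction_factor` is hypothetical.

```python
import math
import random
from statistics import mean, variance
from typing import List


def scale_reduction_factor(chains: List[List[float]]) -> float:
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled posterior variance estimate
    return math.sqrt(pooled / within)


rng = random.Random(0)
# Three chains sampling the same distribution: the factor should be close to 1
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three chains stuck at different values: the factor should be well above 1
stuck = [[rng.gauss(m, 1.0) for _ in range(2000)] for m in (0.0, 5.0, 10.0)]
r_mixed = scale_reduction_factor(mixed)
r_stuck = scale_reduction_factor(stuck)
```

Values near 1 indicate that between-chain variation is negligible relative to within-chain variation, which is the sense in which the threshold of 1·05 was applied above.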

We do not address the issue of model fit and selection in this study. General methods for assessing model fit and for determining the ‘best’ model have been developed in detail (e.g. Burnham & Anderson 2002), and specific Bayesian model selection tools are available (e.g. Spiegelhalter et al. 2002). In particular, Reversible Jump MCMC methods involve the estimation of posterior model probabilities in addition to posterior parameter distributions (King & Brooks 2002; Gimenez et al. 2009), and represent a natural Bayesian approach to model selection and uncertainty; they would be a useful next step with this model.

estimation accuracy and precision

The accuracy and precision of parameter estimation based on the good simulated data (Ny = 200 per state, mean py = 0·80, mean ϕy,1 = 0·70 and ϕy,2 = 0·90) are illustrated in Fig. 1. For all three parameters, across both states, the posterior medians of annual estimates and of the hierarchical means (Hm) were very close to the true mean values used to simulate the data. Additionally, the 95% credible intervals always included the true mean values and were narrow around the posterior median, though slightly wider for capture probability p than for the other parameters. Finally, the estimates appeared to capture inter-annual variation in parameter values very well, with little ‘shrinkage’ of the annual medians toward the hierarchical mean.

Figure 1.

Posterior medians (symbols) and 95% credible intervals (bars) of movement (ψ), survivorship (ϕ) and capture probability (p), for 10 years and hierarchical mean Hm, based on 25 replicate sets of ‘good’ simulated data (Ny = 200 per state at t = 1, mean p1 = p2 = 0·80, mean ϕ1 = 0·70 and ϕ2 = 0·90; mean ψy,1,2 = 0·15, ψy,2,1 = 0·35); black symbols are for p1, ϕ1, ψ12, grey for p2, ϕ2, ψ21. True parameter values used to simulate data (x) and the mean of the hierarchical distribution used to simulate them (dotted lines) are shown.

For the 12 scenarios using poor quality data, Fig. 2 illustrates the estimation bias (posterior median − true value) for each parameter resulting from each of the 25 replicate data sets across the varying levels of survivorship, capture probability and sample size. Bias was greatest with the most sparse data (small N, very low p and low ϕ) and decreased with improving data quality; mean bias was < 0·10 in 32/36 parameter-scenario combinations, with the following four exceptions: estimates of movement (ψ) when N = 50, mean py = 0·05 and mean ϕy = 0·10 and 0·30; estimates of capture probability (p) when N = 50, mean py = 0·05 or 0·20, and mean ϕy = 0·10 and 0·30; and estimates of p also when N = 200, mean p = 0·05 and mean ϕy = 0·10 and 0·30. Additionally, shrinkage of annual estimates toward the hierarchical mean was stronger with sparse data and declined with increasing data quality (Supporting Information, Appendix S2). Bias of the estimates of hierarchical mean (Hm − mean of true annual values), and the width of their 95% credible intervals, are shown in Supporting Information, Appendix S3. As with the annual parameters, Hm estimates were close to true values (bias < 0·10) for most of the parameter-scenario combinations (29/36) and bias decreased as data quality improved (higher N, p, ϕ), though again bias was greatest for capture probability p. Precision of Hm estimates also improved with data quality (Supporting Information, Appendix S3).
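The bias measure used here (posterior median minus true value, averaged over replicate data sets) is simple to compute from MCMC output. The sketch below shows one way to do so; the function names and the toy posterior draws are invented for illustration and are not results from this study.

```python
from statistics import mean, median
from typing import List, Sequence


def estimation_bias(posterior_draws: Sequence[float], true_value: float) -> float:
    """Bias of a single estimate: posterior median minus the true parameter value."""
    return median(posterior_draws) - true_value


def mean_bias(replicate_draws: List[Sequence[float]], true_values: List[float]) -> float:
    """Average bias across replicate simulated data sets."""
    return mean([estimation_bias(d, v) for d, v in zip(replicate_draws, true_values)])


# Toy example: two replicates, three posterior draws each (made-up numbers)
draws = [[0.10, 0.20, 0.30], [0.40, 0.50, 0.60]]
truths = [0.15, 0.45]
bias = mean_bias(draws, truths)
```

A positive value indicates overestimation on the probability scale, and a negative value underestimation.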

Figure 2.

Estimation bias (posterior median − true value) of ψ (top), ϕ (middle) and p (bottom) for 25 replicate data sets across each level of the cross-design ‘poor data’ simulation study; horizontal grey lines indicate accurate estimation, i.e. bias = 0. The design involved three levels of mean survivorship [high (ϕy,1 = 0·70, ϕy,2 = 0·90), med (ϕy,1 = 0·40, ϕy,2 = 0·60) or low (ϕy,1 = 0·10, ϕy,2 = 0·30)], two levels of mean capture probability [low (py = 0·20) or very low (py = 0·05)], and two levels of sample size [small (Ny = 50 individuals marked in each state at t = 1) or large (Ny = 200 per state at t = 1)]; all mean movement values were ψy,1,2 = 0·15, ψy,2,1 = 0·35.

Finally, the hierarchical and non-hierarchical estimates based on the data of mixed quality (mean py = 0·20 for most years but 0·05 for years 5–7) are illustrated in Fig. 3 (estimates shown for ϕy,2 and ψy,1,2; see Supporting Information, Appendix S2 for all parameter estimates). Relative to non-hierarchical estimates, the hierarchical estimates were much less variable among years, closer to true values, and had narrower credible intervals. For the three years with very low p, estimates from the non-hierarchical model were less accurate, and with larger 95% credible intervals, than in years with higher p; however, the hierarchical estimates from these three years were also further shrunk toward the hierarchical mean. Posterior medians, 95% credible interval size and true parameter values for all three components of the simulation study are presented in full detail in Supporting Information, Appendix S2.

Figure 3.

Survivorship (ϕ2, top) and movement (ψ12, bottom) probabilities (median and 95% credible intervals) estimated from hierarchical (black, with hierarchical mean Hm also shown) and non-hierarchical (grey) Bayesian multi-state mark–recapture models, relative to true parameter values (x), for ‘mixed’ data. Estimates are means from 25 replicate sets of simulated data with N = 50 individuals marked/state at day 1, moderate true survivorship (mean ϕy,1 = 0·40, ϕy,2 = 0·60), and mean true movement probabilities of ψy,1,2 = 0·15, ψy,2,1 = 0·35. Capture probabilities were drawn from distributions with mean p = 0·05 for years 5, 6 and 7 (large symbols), but mean p = 0·20 for years 1–4 and 8–10 (small symbols), to compare estimates between years of variable data quality.

Specific ecological application

migratory stopover decisions

We applied this model to an analysis of migratory stopover decisions by songbirds as described in Calvert et al. (2009). The data derive from autumn migration monitoring of temperate-breeding songbirds at an island stopover site in Atlantic Canada (Bon Portage Island, NS, 43°28′N, 65°44′W), where daily mark–recapture data on stopped individuals were collected from 1996 to 2006. Each year, between 52 and 562 (median 207) hatch-year yellow-rumped ‘Myrtle’ warblers (Dendroica coronata, hereafter MYWA) were leg-ringed during peak autumn migration (23 September–31 October) as part of the constant-effort mist-netting programme of the Canadian Migration Monitoring Network, and all subsequent days’ recaptures were noted.

Derived from Schaub et al. (2004), mark–recapture analysis of stopover decisions involves a three-state model, with an ‘initial’ state (first capture only), a ‘non-transient’ state (potential for subsequent recapture), and a ‘transient’ state (departure within 24 h preventing recapture). By considering stopover and departure analogous to ‘survival’ and ‘mortality’ respectively, we constrain survivorship in the initial state to be ϕ1 = 1 and transient survivorship to be ϕ3 = 0; we thus only estimate survivorship (i.e. 1-departure) for non-transient individuals, ϕ2. Similarly, we only estimate capture probability for non-transient birds (p2); capture probability is inestimable for birds in the initial state, and is 0 by definition for transient birds, and therefore, both p1 and p3 were set to 0 in the model notation. Finally, movement probabilities are also restricted: initial individuals cannot remain as initials (ψ1,1 = 0), and instead move to the non-transient (ψ1,2) or transient (ψ1,3, ‘transience’) states. The move to be transient or not is a permanent one, such that once in state 2 or 3, individuals remain in that state (ψ2,2 = 1 and ψ3,3 = 1); see Schaub et al. (2004) for parameter vectors and transition matrices.
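These constraints can be written out as a transition matrix over the three live states plus a ‘departed’ state (playing the role of the dead state in the general model). The Python sketch below is an illustration under the constraints just described; the function name `stopover_transition_matrix` is hypothetical, and the values ϕ2 = 0·8 and ψ1,3 = 0·3 are arbitrary examples rather than estimates from this study. See Schaub et al. (2004) for the authoritative parameter vectors and matrices.

```python
from typing import List


def stopover_transition_matrix(phi_2: float, psi_13: float) -> List[List[float]]:
    """Daily transitions among initial (1), non-transient (2), transient (3), departed (4)."""
    phi = [1.0, phi_2, 0.0]              # phi_1 = 1 and phi_3 = 0 are fixed
    psi = [
        [0.0, 1.0 - psi_13, psi_13],     # initials cannot remain initial
        [0.0, 1.0, 0.0],                 # non-transients stay non-transient
        [0.0, 0.0, 1.0],                 # transients stay transient
    ]
    rows = []
    for k in range(3):
        row = [phi[k] * psi[k][j] for j in range(3)]  # remain at the site, then move
        row.append(1.0 - phi[k])                      # depart
        rows.append(row)
    rows.append([0.0, 0.0, 0.0, 1.0])                 # departed birds do not return
    return rows


P = stopover_transition_matrix(phi_2=0.8, psi_13=0.3)
```

Row 1 shows that a newly marked bird becomes non-transient or transient with probabilities 1 − ψ1,3 and ψ1,3; row 3 shows that a transient bird has departed with certainty by the next day.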

For the study of stopover decisions, therefore, only three parameters are estimated: stopover (i.e. 1 − departure probability) for non-transient birds, ϕ2; capture probability of stopping birds, p2; and probability of transience upon arrival, ψ1,3 (or of non-transience, ψ1,2 = 1 − ψ1,3). In BUGS, specific constraints are reflected within the model code (see Supporting Information, Appendix S1): all constrained parameters are fixed to 0 or 1 as described above and not given prior distributions. As we were interested in inter-annual variability in transience and departure probabilities of migratory passerines (Calvert et al. 2009), we built a model constraining daily ψ and ϕ to be equal across days t within each year y. We were also interested in the influence of individuals’ fuel load upon transience and departure, as migrants’ decisions may be affected by the quantity of fuel (in the form of fat) they are carrying, particularly prior to flight across barriers such as open ocean (Alerstam & Lindstrom 1990). We divided the ringed sample between birds that had a ‘high’ fat-load (furculum at least half-full of fat, some with additional fat on breast, abdomen or below wings) or ‘low’ fat-load (furculum less than half-full of fat or empty) at the time of marking (see Calvert et al. 2009).

model structure

These sources of variability were incorporated into a general MYWA stopover model with structure [ϕ(y*k) p(y*k) ψ(y*k)] (see Supporting Information, Appendix S1). Structural features of the data set included Y = 11, K = 3, Ty ranging from 28 to 39 (in some years ringing efforts were halted early, but parameters were only estimated for the duration of the ringing season), and Ny ranging from 15 to 272 for high-fat birds and from 34 to 290 for low-fat birds. For each fat-group, we ran a 20 000-iteration burn-in period, and then obtained parameter estimates based upon a further 50 000 iterations, with both a hierarchical (Supporting Information, Appendix S1) and a non-hierarchical version of this same model (as outlined above). For comparison with BUGS estimates, we also estimated the parameters with an equivalent frequentist multi-state model in MARK ([ϕ(f*k) p(f*k) ψ(f*k)], i.e. fat-load- (‘high’ vs. ‘low’) and state-specific parameters, with each year run separately; estimates from Calvert et al. 2009).

parameter estimates

Figures 4 and 5 illustrate the posterior medians and 95% credible intervals for annual MYWA departure (1 − ϕ) and transience (ψ) probabilities for each of the fat groups from both the hierarchical and non-hierarchical BUGS models, and the hierarchical mean of each parameter across the 11 years of data (Hm). Estimates of ϕ and ψ from MARK models are also presented, although boundary estimates (0 or 1, typically in years of small sample size) are not shown. Median departure probabilities were generally higher for high-fat than low-fat birds, as previously seen (Calvert et al. 2009), but the fat-load difference was more evident in the hierarchical model estimates (where all years’ values were derived from a common distribution) than in the non-hierarchical estimates (where each year's value was estimated independently). Wide credible intervals, however, provided little evidence for a fat effect on transience. Estimates of capture probability, p, were low and overlapping between the two fat groups, with hierarchical means of 0·040 (posterior median; 95% credible interval 0·021–0·092) and 0·053 (0·033–0·085) for high- and low-fat, respectively.

Figure 4.

Annual departure probabilities (1 − ϕy,2; median and 95% credible intervals) for high-fat (top) and low-fat (bottom) Myrtle warblers (Dendroica coronata) monitored at Bon Portage Island, Canada, from 1996 to 2006, estimated from hierarchical (black) and non-hierarchical (grey) Bayesian models; ‘Hm’ indicates the hierarchical mean. Also shown are the annual number of individuals ringed (at bottom of plot) and mean estimates (± SE) from frequentist MARK models, when estimable (boundary estimates, i.e. 0 or 1, are not shown; from Calvert et al. 2009).

Figure 5.

Annual transience probabilities (ψy,1,3; median and 95% credible intervals) for high-fat (top) and low-fat (bottom) Myrtle warblers (Dendroica coronata) monitored at Bon Portage Island, Canada, from 1996 to 2006, estimated from hierarchical (black) and non-hierarchical (grey) Bayesian models; ‘Hm’ indicates the hierarchical mean. Also shown are the annual number of individuals ringed (at bottom of plot) and mean estimates (± SE) from frequentist MARK models, when estimable (boundary estimates, i.e. 0 or 1, are not shown; from Calvert et al. 2009).

Credible intervals on Bayesian parameter estimates and confidence limits on MARK estimates all showed a high degree of overlap for a given parameter within a particular year. Relative to non-hierarchical and MARK estimates, median hierarchical parameter estimates were ‘shrunk’ toward the hierarchical mean Hm, suggesting that much of the apparent inter-annual variability in estimates from the other methods could be due to sampling error (see Discussion). The 95% credible intervals around non-hierarchical annual estimates were generally larger than those from the hierarchical model, especially for departure and capture probabilities, indicating that the hierarchical modelling of several years’ data produced more precise parameter estimates than year-specific models; this difference was particularly evident in years where few individual birds were marked. Annual capture probability estimates (not shown) from the hierarchical model similarly showed narrower credible intervals and more shrinkage toward Hm than estimates from the non-hierarchical model, further suggesting important sampling error among years (see Discussion).


accuracy, precision, and hierarchical modelling

Our hierarchical Bayes multi-state mark–recapture model expanded upon previously developed models (e.g. Dupuis 1995; King & Brooks 2002; Clark et al. 2005). Using simulated data, we first tested our model under an idealized scenario where: (i) many individuals are marked (large N), (ii) individuals remain in the population a long time (high ϕ), and (iii) they are likely to be observed if present (high p). Under these conditions, the hierarchical model performed well, providing annual estimates and hierarchical means that were very close to true values, had narrow credible intervals, and accurately detected inter-annual variation. Estimates were particularly precise for the two parameters of usual biological significance (survivorship ϕ and movement probability ψ), and showed no bias for any of the parameters. Supplementary tests with alternate hyperprior distributions on parameter means suggested little sensitivity of posterior estimates to choice of priors (see Supporting Information, Appendix S4). We concluded that the model structure was properly defined, and that both accurate and precise estimation is possible with this approach if sufficient data are available.

The second component of the simulation study was designed to test how well our hierarchical model would perform under more realistic biological conditions. For instance, in previous work (Calvert et al. 2009), even the most abundant migratory passerines at Bon Portage Island were captured rarely in some years, resulting in annual sample sizes ranging between 8 and 579 individuals ringed during peak migration. Additionally, daily capture probabilities of migrating songbirds are typically low, often on the order of 2·3–10·0% (Bachler & Schaub 2007; Salewski, Thoma & Schaub 2007; Calvert et al. 2009). The poor-quality simulated data (with N = 50 or 200 per state, variable survivorship, and capture probabilities of p = 0·05 or 0·20) were thus chosen to better represent the sparse data common in real multi-state mark–recapture studies.
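To make the notion of sparse data concrete, capture histories of this kind can be generated with a few lines of Python. This is a minimal sketch under stated assumptions, not the simulation code used in the study: the function `simulate_histories` is hypothetical, all animals are released at the first occasion, parameters are constant over time, and the order of events between occasions (survive, then move, then capture) is assumed.

```python
import numpy as np

def simulate_histories(n_per_state, phi, p, psi, n_occasions, seed=0):
    """Simulate multi-state capture histories.

    Returns an integer array of shape (N, n_occasions) in which 0 means
    'not captured' and k means 'captured in state k' (states numbered from 1).
    All animals are released at occasion 0 (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    p = np.asarray(p, dtype=float)
    psi = np.asarray(psi, dtype=float)
    n_states = len(phi)
    state = np.repeat(np.arange(n_states), n_per_state)  # state at release
    alive = np.ones(state.size, dtype=bool)
    hist = np.zeros((state.size, n_occasions), dtype=int)
    hist[:, 0] = state + 1                                # release counts as a capture
    for t in range(1, n_occasions):
        # Survival depends on the state occupied at the previous occasion.
        alive &= rng.random(state.size) < phi[state]
        # Survivors move according to the row of psi for their current state.
        for i in np.flatnonzero(alive):
            state[i] = rng.choice(n_states, p=psi[state[i]])
        # Animals still present are captured with state-specific probability.
        seen = alive & (rng.random(state.size) < p[state])
        hist[seen, t] = state[seen] + 1
    return hist

# Illustrative 'poor-quality' setting: N = 50 per state and p = 0.05.
hist = simulate_histories(50, phi=[0.6, 0.6], p=[0.05, 0.05],
                          psi=[[0.8, 0.2], [0.3, 0.7]], n_occasions=10)
print("animals recaptured at least once:", int((hist[:, 1:] > 0).any(axis=1).sum()))
```

With a capture probability this low, most simulated histories contain no recapture after release, which is the kind of sparseness the poor-quality scenarios were meant to represent.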

As expected, poor data reduced the precision of parameter estimates. There was relatively little bias in estimates of hierarchical means, however, except for a tendency for a positive bias when true values were very close to zero (i.e. for p and ψ). Annual estimates tended to underestimate inter-year variation, but correctly captured the trends (i.e. the direction of deviation away from the hierarchical mean). Bias and precision of capture probability estimates appeared particularly sensitive to survivorship, especially when low survivorship was combined with either small sample size or very low capture probability; the little information in the data resulted in a posterior distribution of p that was highly influenced by the prior.

An important benefit of a hierarchical approach, as seen in our simulations and as explained elsewhere (e.g. Kéry & Royle 2008; Royle & Dorazio 2008), is that estimates are less biased and more precise. This can come at a slight cost of underestimating inter-annual variation when data are sparse, because hierarchical models interpret deviation of individual parameter estimates that have low precision largely as sampling error (Efron & Morris 1975); deviation of more precise estimates is interpreted as process stochasticity. A non-hierarchical approach, however, can overestimate inter-annual variation by not accounting for sampling error. In the analysis of the Myrtle warbler data, the direction of inter-annual variation in hierarchical estimates was consistent with those from the non-hierarchical Bayes model and the frequentist MARK estimates but the median values were closer to Hm, reflecting the interpretation of small sample sizes and poor capture rate as parameter uncertainty rather than process variance (Efron & Morris 1975; Clark 2005). However, due to smaller credible intervals on annual estimates, the hierarchical model still provided stronger evidence that high-fat birds depart this coastal stopover sooner than low-fat birds.
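The shrinkage behaviour described above can be illustrated with the textbook normal approximation, in which each annual estimate is pulled toward the overall mean by a weight that depends on its precision. The sketch below is a simplification for illustration and is not the model fitted in BUGS: the function `shrink`, the known hierarchical mean `mu` and the known process standard deviation `tau` are all assumptions, and the input values are invented.

```python
import numpy as np

def shrink(estimates, std_errors, mu, tau):
    """Precision-weighted shrinkage of annual estimates toward a common mean.

    weight = tau^2 / (tau^2 + se^2) is the share of each estimate's deviation
    from mu that is attributed to real (process) variation; the remainder is
    treated as sampling error and removed.
    """
    estimates = np.asarray(estimates, dtype=float)
    se2 = np.asarray(std_errors, dtype=float) ** 2
    weight = tau**2 / (tau**2 + se2)
    return mu + weight * (estimates - mu), weight

# Two years with the same raw estimate but very different precision
# (illustrative numbers only).
shrunk, weight = shrink(estimates=[0.30, 0.30], std_errors=[0.02, 0.20],
                        mu=0.50, tau=0.10)
for s, w in zip(shrunk, weight):
    print(f"weight on data = {w:.2f}, shrunk estimate = {s:.2f}")
```

The imprecise estimate moves most of the way to the mean while the precise one barely changes, which is why years with few marked birds show the strongest shrinkage and why inter-annual variation can be underestimated when all years are sparse.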

model flexibility and applications to complex ecological questions

Our hierarchical Bayes model offers several advantages to the analysis of ecological mark–recapture data. Most obviously, the borrowing of information across hierarchical levels (in this case, years; Clark 2005; Jonsen et al. 2006) enables analysis of data sets too sparse to be analysed individually (e.g. Askey et al. 2007). Thus, while other studies have analysed passerine stopover behaviour separately for each year (e.g. Salewski & Schaub 2007), hierarchical Bayes permits the estimation of both a hierarchical mean (i.e. an average across years) and annual parameter values that might otherwise be inestimable or imprecise (Morris et al. 2006; Salewski et al. 2007). More broadly, this modelling approach may improve estimates of demographic parameters for endangered or elusive animals where sparse data (few individuals marked and low capture probabilities) are inevitable. Predictive probability distributions of future values, representing total variability in each parameter, can also be derived from hierarchical Bayes estimates.
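As a rough illustration of how such a predictive distribution might be obtained, one value for a future year can be drawn for each posterior draw of the hyperparameters. The sketch below assumes, purely for illustration, that annual values are normally distributed on the logit scale around a hierarchical mean; the function `predictive_draws` is hypothetical, and the hyperparameter draws are synthetic stand-ins for MCMC output rather than results from this study.

```python
import numpy as np

def predictive_draws(mu_draws, sigma_draws, seed=0):
    """Draw one future-year parameter value per posterior draw.

    mu_draws and sigma_draws are posterior draws of the hierarchical mean and
    standard deviation on the logit scale (an assumed parameterisation).
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu_draws, dtype=float)
    sigma = np.asarray(sigma_draws, dtype=float)
    logit_new = rng.normal(mu, sigma)          # adds among-year variation
    return 1.0 / (1.0 + np.exp(-logit_new))    # back-transform to a probability

# Synthetic placeholders for posterior draws of the hyperparameters.
rng = np.random.default_rng(1)
mu_draws = rng.normal(1.0, 0.2, size=4000)
sigma_draws = np.abs(rng.normal(0.5, 0.1, size=4000))

future = predictive_draws(mu_draws, sigma_draws)
print(np.percentile(future, [2.5, 50, 97.5]))
```

Because each draw combines uncertainty in the hierarchical mean with among-year variation, the resulting interval is wider than the credible interval for the hierarchical mean alone.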

The development of this hierarchical Bayesian model was driven by our need for modelling flexibility in multi-state analyses of songbird stopover behaviours, but its generality makes it applicable to many other problems. The hierarchical time-structure (sampling days across study years) could be replaced with an alternate time-structure (e.g. weeks within seasons), or a spatial-structure where data are analysed hierarchically among study regions or populations. As with other multi-state models (reviewed in White et al. 2006), group structure could allow age or sex variation in parameters, states could represent geographical locations (e.g. Hestbeck et al. 1991) or reproductive stages (e.g. Nichols et al. 1994), and parameters could vary with temporal or individual covariates. Future analyses of stopover behaviour, for instance, may include additional hierarchical structuring among species or between adjacent stopover sites, or even alternative state definitions (e.g. fat-loads as states; Schaub 2006). Finally, although it was designed for a complex situation with multiple states, this model could easily be adapted to simpler study systems, such as single-state models or other mark–recapture variations, offering a valuable tool for diverse demographic studies.


Hierarchical Bayes accommodates complex variation and covariates (e.g. Zheng et al. 2007; Kéry & Royle 2008), and can improve parameter estimation relative to frequentist non-hierarchical methods (Poole 2002; Dupuis & Schwarz 2007). As a result, its value is increasing for both mark–recapture studies and other types of movement analyses (e.g. state-space modelling: Jonsen et al. 2003, 2006; Gimenez et al. 2007; Royle & Kéry 2007). The Bayesian multi-state mark–recapture model described here (in both its hierarchical and non-hierarchical variations) performed well across a range of data quality, yet estimation bias was high when low survivorship was combined with low sample size or low capture probability. Consequently, if survivorship is unknown (it is often a parameter of interest to be estimated) or expected to be low, effort should be made to increase both the number of individuals marked and the probability of capture (Salewski et al. 2007). At stopover, for instance, where capture probabilities are typically very low (Morris et al. 2006; Calvert et al. 2009), this might be accomplished by improving the placement or number of nets within the available habitat (e.g. Bachler & Schaub 2007) or by employing additional markers such as colour rings to increase the probability of re-observation.

Our application of hierarchical Bayes to multi-state mark–recapture studies is structurally flexible, amenable to the analyses of complex and unusual data at various spatial scales, and deals with sparseness by borrowing strength across all available data sets. Its structuring of parameter variation across time or space will therefore complement existing tools for the modelling of real ecological mark–recapture data that are critical to demographic quantifications in wildlife conservation and management.


Funding was provided by the Atlantic Cooperative Wildlife Ecology Research Network (ACWERN), Bird Studies Canada (Baillie Fund and Canadian Migration Monitoring Network), the Government of Canada's Climate Change Impacts and Adaptation Program, Bon Portage Island Field Station (Acadia University), Natural Sciences and Engineering Research Council (NSERC) Discovery Grants to PDT and JMF, and both a Canada Graduate Scholarship (NSERC) and a Killam Trusts Scholarship to AMC. Thanks to Chris Field, Charles Francis, Dylan Fraser, Coilín Minto, Greg Robertson, and Carl Schwarz for discussions, statistical advice, and comments on previous drafts.