## Introduction

Individual variation is a key driver of evolution and an important consideration in modelling the demographics of many populations. However, individual heterogeneity presents a challenge in the analysis of mark–recapture data – particularly when the goal is to estimate abundance. In practice, differences in the behaviour of individuals in a population may be modelled as functions of individual covariates or random effects. In either case, the likelihood function will include integrals to account for all possible values of the unobserved effects. These integrals may be difficult to compute if multiple covariates/random effects are included or if a single individual covariate/random effect changes over time, which makes evaluating the true likelihood for the entire population problematic.

Intractable likelihoods pose a general problem in statistics, and several solutions have been proposed within the Bayesian framework. We explore Monte Carlo integration within Markov chain Monte Carlo sampling (MCWM) to obtain inference from mark–recapture models with individual heterogeneity. While we focus on modelling the effects of individual covariates, the same methods can be applied to models including random effects or a combination of the two.
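
To illustrate the kind of integral involved, the following minimal sketch approximates by plain Monte Carlo integration the probability that an individual is never captured. It is not the MCWM algorithm itself. The logistic link, the normal covariate distribution and all names (`capture_prob`, `prob_never_captured`, `beta0`, `beta1`, `mu`, `sigma`) are assumptions made for illustration only.

```python
import math
import random


def capture_prob(x, beta0, beta1):
    """Hypothetical logistic link between covariate x and capture probability."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def prob_never_captured(beta0, beta1, mu, sigma, n_occasions,
                        n_draws=10_000, rng=None):
    """Monte Carlo estimate of the integral of (1 - p(x))^K over x ~ N(mu, sigma^2).

    Assumes (for illustration) the same capture probability on each of the
    K = n_occasions capture occasions.
    """
    rng = rng or random.Random(1)
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(mu, sigma)            # draw a covariate value
        p = capture_prob(x, beta0, beta1)   # capture probability given x
        total += (1.0 - p) ** n_occasions   # P(missed on all K occasions | x)
    return total / n_draws


# Example call with made-up parameter values.
estimate = prob_never_captured(beta0=-1.0, beta1=0.5, mu=0.0, sigma=1.0,
                               n_occasions=5)
```

The estimate is noisy, and its precision depends on `n_draws`. Embedding such an estimate inside an MCMC sampler therefore means trading computing time against Monte Carlo error.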

One way to avoid the problem with intractable likelihoods is to estimate abundance with a conditional likelihood approach. Huggins (1989) and Alho (1990) presented methods for estimating the size of a closed population when the capture probability depends on an individual covariate. Likelihoods which condition on at least one capture are fit to the data from the marked individuals and used to estimate capture probability as a function of the covariate. Abundance is then estimated using a Horvitz–Thompson estimator. These methods were later extended to open population models by McDonald & Amstrup (2001). However, these models are restrictive and can only be used if the covariate is completely observed for the marked individuals (i.e. the covariate is constant or changes deterministically like age).
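
The two steps of the conditional approach can be sketched as follows, under simplifying assumptions that are ours rather than those of the cited papers: a logistic link, a capture probability that is constant across occasions, and data summarized by the number of captures per marked individual. The function names are hypothetical, and the maximization of the conditional likelihood is omitted.

```python
import math


def capture_prob(x, beta0, beta1):
    """Hypothetical logistic link between covariate x and capture probability."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def conditional_loglik(beta0, beta1, covariates, n_captures, n_occasions):
    """Log-likelihood of the capture counts, conditional on at least one capture."""
    ll = 0.0
    for x, y in zip(covariates, n_captures):
        p = capture_prob(x, beta0, beta1)
        p_seen = 1.0 - (1.0 - p) ** n_occasions  # P(captured at least once | x)
        ll += (math.log(math.comb(n_occasions, y))
               + y * math.log(p)
               + (n_occasions - y) * math.log(1.0 - p)
               - math.log(p_seen))               # zero-truncation term
    return ll


def horvitz_thompson(beta0, beta1, covariates, n_occasions):
    """Abundance estimate: each marked individual counts 1 / P(captured at least once)."""
    total = 0.0
    for x in covariates:
        p = capture_prob(x, beta0, beta1)
        total += 1.0 / (1.0 - (1.0 - p) ** n_occasions)
    return total
```

In practice, `beta0` and `beta1` would first be estimated by maximizing `conditional_loglik` over the marked individuals, and the estimates then passed to `horvitz_thompson`. Both functions require the covariate of every marked individual, which is the restriction noted above.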

Alternatively, Bayesian inference via Markov chain Monte Carlo (MCMC) has been applied to fit models allowing for the effects of time-varying, individual covariates or other covariates that are only partially observed for the marked individuals. Dupuis (1995) applied Bayesian methods to model the effects of discrete covariates on survival of individuals in an open population (i.e. the multistate model). Following this, Pollock (2002) suggested that a Bayesian approach could be applied for the particular case of continuous, time-varying, individual covariates and noted that: ‘Bayesian methods automatically integrate out unobserved random variables using numerical integration or Markov chain Monte Carlo sampling methods’ (Pollock 2002, p. 97). Bonner & Schwarz (2006) applied Bayesian inference via MCMC to model the effects of time-dependent covariates on individual capture and survival probabilities in the Cormack–Jolly–Seber model. King *et al*. (2006) described a similar approach and provided methods of variable selection, while Gimenez *et al*. (2006) incorporated semiparametric regression to allow for nonlinear effects of the covariate. Royle, Dorazio & Link (2007) and Royle (2009) later developed MCMC-based methods to make inference about the size of a closed population when capture probabilities vary among individuals. Their method is based on augmenting the observed data with a large number of zero capture histories representing a pool of individuals that may have been alive but were never captured, and it has become known as the data augmentation (DA) approach. This method is appealing because it provides a conceptually simple framework that can be applied to many models and is easily implemented in the BUGS language. More recently, Schofield & Barker (2011) and Royle & Dorazio (2012) have shown how the same methods may be applied to model open populations with individual heterogeneity. Bayesian inference regarding the size of an open or closed population with individual heterogeneity may also be implemented with the reversible jump MCMC (RJMCMC) algorithm as described by King & Brooks (2008).
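
The augmentation step of DA can be sketched in a few lines. The function name `augment` and the argument `pool_size` are hypothetical, and only the construction of the augmented data set is shown, not the sampler that is subsequently run on it.

```python
def augment(histories, pool_size):
    """Pad observed capture histories with all-zero rows up to pool_size individuals."""
    n_observed = len(histories)
    if pool_size < n_observed:
        raise ValueError("pool_size must be at least the number of observed individuals")
    n_occasions = len(histories[0])
    # One all-zero history for each potential, never-captured individual.
    zeros = [[0] * n_occasions for _ in range(pool_size - n_observed)]
    return [list(h) for h in histories] + zeros


# Two observed individuals over three occasions, padded to a pool of five.
observed = [[1, 0, 1], [0, 1, 0]]
augmented = augment(observed, pool_size=5)
```

Every row of the augmented data set, including each all-zero row, carries unobserved quantities that the sampler must update at each iteration. The cost of an iteration therefore grows with the size of the pool rather than with the number of marked individuals.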

Our current work is motivated by our experiences applying DA and RJMCMC to a variety of mark–recapture data sets. Both DA and RJMCMC avoid the need for explicit integration by working with complete data likelihoods (CDL) in place of the observed data likelihood. These CDL are constructed by adding extra, unobserved random variables to the data that would simplify computation of the likelihood, if observed (Dempster, Laird & Rubin 1977; Gelman *et al*. 2003, section 7.2).

We have found that the chains constructed by these algorithms may be computationally inefficient in that they mix poorly and take a long time to generate a representative sample from the posterior distribution. This seems especially true when the models include time-dependent, individual covariates or other multidimensional covariates that make the likelihood difficult to evaluate numerically. All MCMC methods work by constructing a Markov chain that has the posterior distribution as its unique stationary distribution. Samples from the posterior distribution are generated by simulating sufficiently long realizations of the Markov chain, and these samples are used to estimate posterior summary statistics. The challenge with DA and RJMCMC is that a lot of time may be spent updating the extra variables added to the CDL when a small fraction of the population is captured and marked. Moreover, we have found that the chains can have high autocorrelation, meaning that large samples are needed to estimate posterior summary statistics accurately.
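
The cost of autocorrelation can be made concrete with a rough calculation. The sketch below computes the lag-1 sample autocorrelation of a chain and an effective sample size under the simplifying assumption that the chain behaves like a first-order autoregressive process. This approximation and the function names are ours and are used only for illustration.

```python
def lag1_autocorr(chain):
    """Lag-1 sample autocorrelation of a sequence of MCMC draws."""
    n = len(chain)
    mean = sum(chain) / n
    dev = [v - mean for v in chain]
    denom = sum(d * d for d in dev)
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / denom


def ess_ar1(n_samples, rho):
    """Effective sample size assuming AR(1) dependence with lag-1 autocorrelation rho."""
    return n_samples * (1.0 - rho) / (1.0 + rho)
```

Under this approximation, a chain of 1000 draws with lag-1 autocorrelation 0.9 carries roughly the information of 53 independent draws.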

We explore the use of MCWM as an alternative to these algorithms for fitting mark–recapture models with individual covariates. We focus on a simple, closed population model with one individual covariate as an example of the method and provide results of a simulation study comparing MCWM, DA and RJMCMC. We also apply our method to data on meadow voles (*Microtus pennsylvanicus*) collected at the Patuxent Wildlife Research Center in 1981 and 1982 (Nichols *et al*. 1992) and compare the results with DA and RJMCMC. Although these data were collected using a robust design, we only consider the information from the final primary period and model capture probability as a function of a vole's average observed body mass. Previous analysis of these data has shown a significant, positive relationship between capture probability and body mass (Schofield & Barker 2011), and abundance estimates that ignore this heterogeneity would be biased.