Using visual encounter data to improve capture–recapture abundance estimates

Capture-recapture studies are widely used in ecology to estimate population sizes and demographic rates. In some capture-recapture studies, individuals may be visually encountered but not identified. For example, if individual identification is only possible upon capture and individuals escape capture, visual encounters can result in failed captures where individual identities are unknown. In such cases, the data consist of capture histories with known individual identities, and counts of failed captures for individuals with unknown identities. These failed captures are ignored in traditional capture-recapture analyses that require known individual identities. Here we show that if animals can be encountered at most once per sampling occasion, failed captures provide lower bounds on population size that can increase the precision of abundance estimates. Analytical results and simulations indicate that visual encounter data improve abundance estimates when capture probabilities are low, and when there are few repeat surveys. We present a hierarchical Bayesian approach for integrating failed captures and auxiliary encounter data in statistical capture-recapture models. This approach can be integrated with existing capture-recapture models, and may prove particularly useful for hard-to-capture species in data-limited settings.


INTRODUCTION
Capture-recapture studies are widely used for estimating abundance and demographic rates, using information about the identities of captured individuals (Jolly 1965). This paper examines the case where individual identification requires capture, and identities of animals that are visually encountered but not captured are unknown. In such cases, "capture" is synonymous with "identification." This often applies in capture-recapture studies of amphibians, where individual identification is only possible with the animal in hand. We also assume that if an individual escapes capture, it is not encountered again in the sampling occasion because it is hiding or otherwise inaccessible (Nichols 2010, Joseph and Knapp 2018). Under these conditions, encounters leading to failed captures provide information about abundance, but this information is not readily used in traditional capture-recapture models.
When individuals can be encountered at most once on a survey, encounter and capture data both provide lower bounds on the total number of individuals in the population. Total abundance must be greater than or equal to the number of animals encountered in a survey. Similarly, total abundance must be greater than or equal to the number of unique individuals identified in the capture data. Capture data differ, however, in that information accumulates over multiple surveys (Pollock 1982). For example, if two surveys occur on consecutive days in a closed population, then the total number of unique individuals captured across both surveys provides a lower bound on abundance.
Here, we show how visual encounter data can improve abundance estimates in capture-recapture studies for study designs where (1) individual identification requires capture, (2) a subset of encountered individuals are captured, and (3) individuals can be encountered at most once per sampling occasion. We develop a modified capture-recapture observation model and investigate conditions under which encounter data improve abundance estimates. The methods presented here accommodate both failed captures and counts of animals collected separately from capture-recapture surveys and can be integrated with existing capture-recapture models.

Model description
We adopt a hierarchical Bayesian approach in which an observation model depends on a state model, and both depend on some parameters (Berliner 1996). The state model describes the presence or absence of individuals in a population, and the observation model describes the visual encounter and capture process. The parameter model represents prior distributions for all remaining unknowns.

State model
Abundance can be estimated from capture-recapture data in a Bayesian framework using parameter-expanded data augmentation (Royle and Dorazio 2012). Here, $N^*$ unique individuals are observed, but $M > N^*$ individuals are modeled, augmenting the observed data with $M - N^*$ additional capture histories of animals that were never captured. The assumption is that the true abundance $N$ is less than $M$, but greater than the number of observed individuals $N^*$.
Individuals $i = 1, \ldots, M$ are either "in the population" ($z_i = 1$) or not ($z_i = 0$), where the parameters $z_1, \ldots, z_M$ are state parameters to be estimated, and abundance is $N = \sum_i z_i$. These states can be modeled as conditionally independent Bernoulli random variables with probability parameter ω, where ω is the probability of an individual being in the population: $z_i \sim \text{Bernoulli}(\omega)$.

Observation model
On surveys $k = 1, \ldots, K$, an observer searches for individuals, encountering each individual with probability η. We assume individuals can only be encountered once at most. If an animal is encountered, it is captured with probability κ. Because individuals must be captured to be identified, encounter data are observed for captured individuals, but not for failed captures. Thus, encounter histories are partly observed. We assume the number of failed captures on each survey is observed.
Let $y^*_{i,k}$ represent the categorical outcome for individual $i$ on survey $k$. There are three possibilities (Fig. 1):
1. The individual was not encountered ($y^*_{i,k} = 1$), with probability $z_i (1 - \eta) + 1 - z_i$.
2. The individual was encountered but not captured ($y^*_{i,k} = 2$), with probability $z_i \eta (1 - \kappa)$.
3. The individual was captured ($y^*_{i,k} = 3$), with probability $z_i \eta \kappa$.
The first two outcomes are not observed. We observe a binary record of whether individual $i$ was captured on survey $k$, denoted $y_{i,k}$, so that $y_{i,k} = I(y^*_{i,k} = 3)$, where $I$ is an indicator function that is equal to one if the condition inside the parentheses is satisfied and zero otherwise. In other words, $y^*_{i,k}$ is observed only if $y^*_{i,k} = 3$. Additionally, the observed number of failed captures $f_k$ corresponds to the sum $f_k = \sum_i I(y^*_{i,k} = 2)$. The observation model consists of two parts, one for the capture data:
$$[y_{i,k} \mid y^*_{i,k}] = I\left(y_{i,k} = I(y^*_{i,k} = 3)\right),$$
and another for the failed capture counts:
$$[f_k \mid y^*_{1,k}, \ldots, y^*_{M,k}] = I\left(f_k = \sum_i I(y^*_{i,k} = 2)\right),$$
where square brackets denote a probability function. Alternatively, a "soft" constraint can be imposed as an approximation or to account for uncertainty in the number of failed captures:
$$f_k \mid y^*_{1,k}, \ldots, y^*_{M,k} \sim \text{Normal}\left(\sum_i I(y^*_{i,k} = 2), \sigma\right),$$
with σ set to some small fixed value.
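As an illustration of the observation process described above, the following Python sketch simulates the latent outcomes and derives the two observed quantities (capture indicators and failed-capture counts). It is not the authors' implementation, which relies on JAGS; the function name `simulate_survey_data` and all parameter values are assumptions chosen for illustration.

```python
import random

def simulate_survey_data(M, omega, eta, kappa, K, seed=0):
    """Simulate latent outcomes y*, observed captures y, and failed-capture counts f.

    Hypothetical helper for illustration only; parameter values are assumptions.
    """
    rng = random.Random(seed)
    # State model: z_i = 1 if augmented individual i is in the population.
    z = [1 if rng.random() < omega else 0 for _ in range(M)]
    # Latent outcome y*_ik: 1 = not encountered, 2 = failed capture, 3 = captured.
    y_star = [[1] * K for _ in range(M)]
    for i in range(M):
        for k in range(K):
            # Each individual is encountered at most once per survey.
            if z[i] == 1 and rng.random() < eta:
                y_star[i][k] = 3 if rng.random() < kappa else 2
    # Observed capture indicator: y_ik = I(y*_ik = 3).
    y = [[int(v == 3) for v in row] for row in y_star]
    # Observed failed-capture count: f_k = sum_i I(y*_ik = 2).
    f = [sum(y_star[i][k] == 2 for i in range(M)) for k in range(K)]
    return z, y_star, y, f

z, y_star, y, f = simulate_survey_data(M=200, omega=0.5, eta=0.4, kappa=0.3, K=3)
N = sum(z)  # true abundance
# Total encounters per survey (captures plus failed captures) never exceed N.
encounters = [f[k] + sum(row[k] for row in y) for k in range(3)]
print(N, f, encounters)
```

Because each individual contributes at most one encounter per survey, every entry of `encounters` is bounded above by `N`, which is the property that lets failed captures inform abundance.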

Parameter model
The hierarchical model specification is completed by specifying prior distributions for remaining unknowns. Here, we use independent uniform(0, 1) priors for all probabilities. This prior over the inclusion probability ω implies a discrete uniform prior for the true abundance from 0 to M.
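The implied prior on abundance can be checked directly. Integrating the binomial probability of N = n over a Uniform(0, 1) prior on ω gives a Beta function, and the result is 1/(M + 1) for every n. The short Python sketch below verifies this with exact rational arithmetic; the function name and the choice M = 20 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial

def prior_abundance_pmf(M):
    """Exact Pr(N = n), n = 0..M, when z_i ~ Bernoulli(omega) and omega ~ Uniform(0, 1)."""
    pmf = []
    for n in range(M + 1):
        # The integral of omega^n (1 - omega)^(M - n) over (0, 1) is the Beta function
        # B(n + 1, M - n + 1) = n! (M - n)! / (M + 1)!.
        beta = Fraction(factorial(n) * factorial(M - n), factorial(M + 1))
        pmf.append(comb(M, n) * beta)
    return pmf

pmf = prior_abundance_pmf(M=20)  # M = 20 is an arbitrary illustrative value
print(pmf[0], pmf[10], pmf[20])
```

Every entry of `pmf` equals 1/21 here, so the induced prior on abundance is flat over 0, ..., M.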

Auxiliary encounter data
In some cases, auxiliary encounter data are collected, such as during visual encounter surveys where individuals are counted on surveys, but no captures are attempted (Crump and Scott 1994). Let $a_k$ represent the number of unique individuals that were encountered on survey $k$. If encounters of individuals are conditionally independent, a binomial observation model provides a reasonable choice, where the number of trials is the population abundance $N = \sum_i z_i$ and the probability of success is the encounter probability η:
$$a_k \sim \text{Binomial}(N, \eta),$$
for surveys $k = 1, \ldots, K$. This is the observation model used in N-mixture models (Royle 2004). In addition to potentially providing a higher lower bound on abundance, auxiliary encounters also provide additional information about encounter probabilities, because the encounter data are conditionally independent from encounters leading to captures. When combined with the capture-recapture model outlined above, the joint model of captures, failed captures, and auxiliary encounters constitutes an integrated model (Besbeas et al. 2002, Abadi et al. 2010). The posterior distribution is proportional to
$$\prod_{k=1}^{K} \left( [a_k \mid z, \eta] \, [f_k \mid y^*_{1,k}, \ldots, y^*_{M,k}] \prod_{i=1}^{M} [y_{i,k} \mid y^*_{i,k}] \, [y^*_{i,k} \mid z_i, \eta, \kappa] \right) \prod_{i=1}^{M} [z_i \mid \omega] \; [\omega] \, [\eta] \, [\kappa].$$
Fig. 1. Individuals that are not in the population are never encountered. Those that are in the population may be encountered on occasion $k$ or not, and encountered individuals may or may not be captured and identified (we assume that identification requires capture, so that capture and identification are synonymous). Each path leads to a value of the partly observed quantity $y^*_{i,k}$, where $y^*_{i,k} = 1$ when animals are not encountered, $y^*_{i,k} = 2$ when animals are encountered but not identified, and $y^*_{i,k} = 3$ when animals are encountered and identified.

Abundance lower bounds from encounter and capture data
The total population size is bounded from below by the number of animals encountered on any one survey, assuming each animal can be encountered once at most (i.e., individuals are not double-counted). If $n_k$ is the number of unique animals encountered on survey $k$, the lower bound on abundance from encounter data, $n_{\min} = \max(n_1, \ldots, n_K)$, is the maximum of $K$ independent binomial random variables with sample size $N$ and probability η. The probability mass function of this lower bound is thus given by
$$\Pr(n_{\min} = n) = F(n)^K - F(n - 1)^K,$$
where $F(n)$ is the cumulative distribution function of a binomial random variable.
Population size must also be greater than or equal to the number of unique captured individuals. A probability mass function for the lower bound on abundance from capture data ($c_{\min}$: the number of unique captured individuals) can be derived with a binomial distribution. The binomial sample size is the true population size ($N$), and the probability of success is the probability of being captured one or more times: $1 - (1 - \eta\kappa)^K$. The probability mass function for $c_{\min}$, the abundance lower bound derived from the capture data, is as follows:
$$\Pr(c_{\min} = c) = \binom{N}{c} \left[1 - (1 - \eta\kappa)^K\right]^{c} \left[(1 - \eta\kappa)^K\right]^{N - c}.$$
When the expected lower bound from encounter data exceeds the expected lower bound from capture data ($E(n_{\min}) > E(c_{\min})$), encounter data are expected to increase the precision of abundance estimates.
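The comparison of expected lower bounds can be computed numerically from the two distributions described above. The Python sketch below is a minimal illustration under the stated assumptions (encounter counts per survey are independent Binomial(N, η), and the number of unique captured individuals is binomial with success probability 1 - (1 - ηκ)^K). Function names and parameter values are hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability mass at x for n trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """Binomial cumulative distribution function, with F(x) = 0 for x < 0."""
    if x < 0:
        return 0.0
    return sum(binom_pmf(j, n, p) for j in range(min(x, n) + 1))

def expected_n_min(N, eta, K):
    """E(n_min): expected maximum of K independent Binomial(N, eta) encounter counts."""
    return sum(n * (binom_cdf(n, N, eta)**K - binom_cdf(n - 1, N, eta)**K)
               for n in range(N + 1))

def expected_c_min(N, eta, kappa, K):
    """E(c_min): mean of a Binomial(N, 1 - (1 - eta * kappa)^K) random variable."""
    return N * (1 - (1 - eta * kappa)**K)

# Illustrative values only: compare a low and a high capture probability.
N, K = 50, 3
for kappa in (0.1, 0.9):
    e_n = expected_n_min(N, eta=0.5, K=K)
    e_c = expected_c_min(N, eta=0.5, kappa=kappa, K=K)
    print(kappa, round(e_n, 2), round(e_c, 2), e_n > e_c)
```

With these illustrative values, the encounter bound exceeds the capture bound when κ = 0.1 but not when κ = 0.9, consistent with the expectation that encounter data matter most when animals are hard to capture.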

Simulations
We empirically verified our theoretical results about the expected lower bounds on abundance provided by encounter and capture data using Monte Carlo simulation. We generated five replicate encounter-capture-recapture datasets for each parameter combination of N = 10, 50, 100, K = 3, 6, 9, and η and κ ranging from 0.01 to 0.99 in increments of 0.01, resulting in 441,045 unique datasets. For each parameter combination, we computed the empirical mean lower bounds from encounter and capture data, averaging over the five replicate iterations, and compared the results to the theoretical expectations generated from the probability mass functions for n min and c min .
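A Monte Carlo check of this kind can be sketched in a few lines of Python. The example below is not the code from the research compendium; it uses a single illustrative parameter combination and many more replicates than the five used in the study, so that the empirical mean is stable enough to compare against the closed-form expectation for $c_{\min}$. The function name, seed, and parameter values are assumptions.

```python
import random

def simulate_bounds(N, eta, kappa, K, rng):
    """Return (n_min, c_min) for one simulated encounter-capture-recapture dataset."""
    ever_captured = [False] * N
    n_per_survey = []
    for _ in range(K):
        n_k = 0
        for i in range(N):
            if rng.random() < eta:            # individual i is encountered (at most once)
                n_k += 1
                if rng.random() < kappa:      # capture, conditional on the encounter
                    ever_captured[i] = True
        n_per_survey.append(n_k)
    # n_min: largest single-survey encounter count; c_min: unique individuals captured.
    return max(n_per_survey), sum(ever_captured)

rng = random.Random(1)
N, eta, kappa, K, reps = 50, 0.3, 0.2, 3, 2000  # illustrative values
draws = [simulate_bounds(N, eta, kappa, K, rng) for _ in range(reps)]
mean_n_min = sum(d[0] for d in draws) / reps
mean_c_min = sum(d[1] for d in draws) / reps
theory_c_min = N * (1 - (1 - eta * kappa) ** K)
print(round(mean_n_min, 2), round(mean_c_min, 2), round(theory_c_min, 2))
```

The empirical mean of $c_{\min}$ should fall close to the theoretical value, and the empirical mean of $n_{\min}$ should exceed Nη because it is a maximum over K surveys.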
To understand the implications of bounding abundance for other parameters, we used a simulation study with known parameters across a range of repeat surveys (K = 3, K = 6, and K = 9). We visualized the joint posterior distribution of abundance and the probability of being encountered and captured, and compared these results to a simpler model, $M_0$: a capture-recapture model of a closed population with identical detection probabilities $p_1 = \cdots = p_K = p$, which ignores encounters that do not lead to captures (Royle and Dorazio 2008). This model can only estimate the marginal probability of capture p = ηκ. The observation model is $y_{i,k} \sim \text{Bernoulli}(z_i p)$ for individuals $i = 1, \ldots, M$ on survey $k = 1, \ldots, K$. The state model for $z$ is unchanged, and uniform priors over (0, 1) were assigned to ω and p.
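For intuition about the baseline model, the likelihood of the augmented capture histories can be written with each state $z_i$ summed out: an individual's history has probability ω times the Bernoulli product if it is in the population, plus (1 - ω) if the history is all zeros. The Python sketch below evaluates this marginal log-likelihood. It is an assumption-laden illustration rather than the fitted model, which samples the latent states in JAGS; the function name and the toy data are hypothetical.

```python
from math import log

def m0_log_likelihood(histories, omega, p):
    """Log-likelihood of augmented capture histories under M0, with each z_i summed out."""
    total = 0.0
    for y in histories:
        K = len(y)
        captures = sum(y)
        # Probability of this history if the individual is in the population (z_i = 1).
        in_pop = omega * p**captures * (1 - p)**(K - captures)
        # If z_i = 0, the only possible history is all zeros.
        not_in_pop = (1 - omega) if captures == 0 else 0.0
        total += log(in_pop + not_in_pop)
    return total

# Toy data: three observed individuals plus two all-zero augmented histories (M = 5, K = 3).
histories = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 0], [0, 0, 0]]
print(m0_log_likelihood(histories, omega=0.6, p=0.5))
```

Only the product p = ηκ enters this likelihood, which is why the baseline model cannot separate encounter probability from capture probability.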
Draws from the posterior distributions of all models were simulated using JAGS, with six parallel Markov chain Monte Carlo chains, and 400,000 iterations per chain with an adaptation period of 200,000, a burn-in period of 40,000, and posterior thinning by 400 to reduce memory usage (Plummer et al. 2003). Convergence was assessed using visual inspection of traceplots and the potential scale reduction factor ($\hat{R}$) statistic (Gelman and Rubin 1992). All code to replicate the analyses is available in a research compendium at https://github.com/mbjoseph/secmr.
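The potential scale reduction factor compares between-chain and within-chain variance for a scalar parameter. The Python sketch below implements the classic form of the statistic for equal-length chains. It is a simplified illustration and may differ in detail from the diagnostics reported by MCMC software, which can apply additional corrections. The function name and the synthetic chains are assumptions.

```python
import random
from statistics import mean, variance

def potential_scale_reduction(chains):
    """Classic Gelman-Rubin statistic for equal-length chains of one scalar parameter."""
    if len(chains) < 2:
        raise ValueError("at least two chains are required")
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    W = mean([variance(c) for c in chains])   # mean within-chain variance
    B = n * variance(chain_means)             # between-chain variance
    var_plus = (n - 1) / n * W + B / n        # pooled estimate of posterior variance
    return (var_plus / W) ** 0.5

# Synthetic example with six chains: one well-mixed set and one set stuck in two modes.
rng = random.Random(0)
mixed = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(6)]
stuck = [[rng.gauss(shift, 1) for _ in range(1000)] for shift in (0, 0, 0, 3, 3, 3)]
rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
print(round(rhat_mixed, 3), round(rhat_stuck, 3))
```

Values near one indicate that the chains agree; values well above one, as for the `stuck` chains here, indicate that the chains have not converged to a common distribution.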

RESULTS
Across a range of abundances, encounter data are expected to provide a higher lower bound on abundance when capture probabilities are low, and when there are few repeat surveys (Fig. 2). The boundary in the bivariate encounter-capture parameter space delineating the region where $E(c_{\min} - n_{\min}) < 0$ shifts toward lower capture probabilities as the number of repeat surveys increases, and as abundance increases. Empirical average bounds from simulated capture-recapture data were in agreement with the theoretical expectations derived from the probability mass functions of $n_{\min}$ and $c_{\min}$ (Fig. 3).
When the lower bound on abundance is greater for encounter data than capture data, the joint model of encounters and captures produces a more precise estimate of abundance because there is zero probability mass below the lower bound on abundance. Furthermore, if there is posterior correlation between abundance and another parameter, encounter data can also increase the precision of the correlated parameter. For example, the marginal probability of capture and abundance are correlated in the posterior, so that increased posterior precision for abundance implies increased posterior precision of marginal capture probability (Fig. 4).

DISCUSSION
Failed captures and auxiliary encounters are likely to be most useful in capture-recapture studies when animals are hard to capture and the number of surveys is small. This expectation holds across a range of population abundance and encounter probabilities. In such cases, encounter data increase the precision of population abundance estimates by increasing the lower bound on abundance. Such data are essentially "free" in encounter-capture-recapture study designs and can be included by modifying the likelihood function (and not the underlying state model) of capture-recapture models.
In addition to increasing the precision of abundance estimates, encounter data can increase the precision of parameter estimates for parameters that are correlated with abundance in the posterior distribution. For the simple model presented here, this includes the detection and inclusion probabilities. For more complex models that allow state evolution through time, this might include survival and recruitment probabilities. Thus, we expect that encounter data in general might provide information about abundance, and parameters relating to abundance and the measurement process.
Fig. 2. Expectations for the difference in abundance lower bounds provided by capture and encounter data as a function of the number of surveys K, abundance N, the encounter probability η, and the probability of capture conditional on an encounter κ. When the surface is red, encounter data are expected to increase the precision of abundance estimates by increasing the lower bound on true abundance. The heavy black line marks the null isocline where the expected difference is zero. Lighter lines represent contours spaced by 2 individuals.
These results are consistent with related findings for mark-resight studies where marked individuals are subject to incomplete identification. In such studies, accounting for failed identifications of marked individuals (analogous to failed captures) is most advantageous when identification probabilities are low (analogous to capture probabilities being low; McClintock et al. 2014b). However, in the encounter-capture-recapture scenario considered here, whether an animal is marked or not is unknown until it is captured, as would be the case for subdermal passive integrated transponder tags in an amphibian (Gibbons and Andrews 2004).
Here, we assumed that individuals could be encountered at most once per sampling occasion. In other words, there is no double counting. Each failed capture corresponds to one unique individual. The observation model developed here is not robust to violations of this assumption, because the total number of encounters sets a lower bound on the true population size. As a consequence, if individuals escape capture multiple times in the sampling occasion, it is possible that the posterior for abundance might be misleadingly precise (i.e., the lower bound on population size would be too high). Therefore, we do not recommend this approach if individuals might be encountered multiple times on the same sampling occasion. This assumption is likely to hold, for example, in capture-recapture studies of amphibians in high elevation lakes, where individual animals that escape capture hide afterward, for example, underwater where they cannot be seen or captured (Joseph and Knapp 2018). If individuals are captured multiple times in the same sampling occasion, then it will be clear that the assumption of at most one encounter has been violated. In such cases, alternative encounter models may be necessary, for example, a Poisson model that allows repeat encounters on a sampling occasion (Royle et al. 2009).
The model developed here relates to other approaches for handling imperfect individual identification in capture-recapture studies, including misidentification (Link et al. 2010, McClintock et al. 2014a, Schofield and Bonner 2015) and partial identification (Augustine et al. 2018), and also to approaches that account for latent encounter histories with observed summary counts (Chandler and Royle 2013). In particular, this approach might be viewed as an aspatial analog of the spatial model presented in Chandler and Royle (2013) in which a subset of individual identities is available, with a Bernoulli (instead of a Poisson) encounter model, and a stochastic second stage "identification upon encounter" component. This approach can also be seen as a degenerate case of a partial identification model, in which there are no spatial (Augustine et al. 2018) or genetic data (Wright et al. 2009) available to inform the identities of individuals that have escaped capture.
Looking ahead, there are opportunities to build upon this approach. First, in terms of implementation, marginalization over the discrete latent variables might allow more efficient sampling from the posterior distribution. Second, because this model includes separate parameters for encounter and capture probabilities, covariates can be included separately for each of these components. This could be useful, for example, to account for predator avoidance behavior that might influence capture probabilities, and weather conditions that might influence encounter probabilities. Observer effects provide an additional use case: Some observers might be better than others at finding or capturing individual animals.
Fig. 4. Each point is a sample from the posterior. Black points correspond to the baseline capture-recapture model $M_0$, which does not include encounter data. Blue points correspond to an encounter-capture-recapture model that uses encounter data to bound abundance. Vertical dashed lines are shown for the lower bounds on abundance derived from encounter ($n_{\min}$) and capture ($c_{\min}$) data.
In this paper, we presented a motivation for including encounter data in capture-recapture studies based on abundance lower bounds from encounter and capture data. Given that encounter data are included via a modified likelihood and not a modified state model, this approach can be readily integrated with a variety of capture-recapture models and may be useful for hard-to-capture species in data-limited settings.

ACKNOWLEDGMENTS
This work was motivated by years of experience in the field capturing (and failing to capture) amphibians in the Sierra Nevada and the Klamath mountains, and funded by a grant from the Yosemite Conservancy. We thank Ben Augustine for helpful comments on an earlier version of the manuscript.