## Introduction

Many estimation methods have been developed for the analysis of closed population capture–recapture data. For comprehensive material on the subject see, for instance, Otis et al. (1978), Seber (2002), Williams et al. (2002) and Amstrup et al. (2005). The most general capture–recapture closed population model, considered by Otis et al. (1978), was denoted by M_{tbh}, where (h) denotes inherent individual heterogeneity, (t) a time effect, and (b) a behavioral response to capture. In this work, we are interested in estimating the population size and its standard error (SE) under a submodel of the type M_{h}, where individual heterogeneity can be modeled as a function of covariates. Developing capture–recapture models that deal with individual heterogeneity in capture probabilities has been one of the most challenging tasks in this area. Failure to account for such heterogeneity has long been known to cause substantial bias in population estimates (Otis et al. 1978; Lee and Chao 1994; Hwang and Huggins 2005). Moreover, Link (2003) showed that, without strong assumptions on the underlying distribution, estimates of population size under model M_{h} are fundamentally nonidentifiable.

The use of covariates (or auxiliary variables), if available, has been proposed as an alternative way to partially cope with the problem of heterogeneous capture probabilities (Pollock et al. 1984; Huggins 1989; Alho 1990). The idea is to model capture probabilities as a function of individual (e.g., age, sex, and weight) and environmental (e.g., temperature, rainfall, and location) covariates, using a generalized linear modeling (GLM) approach, such as logistic regression. The method of Huggins (1989, 1991), which estimates population size from a conditional likelihood, has become very popular, but it assumes independence among capture occasions (Huggins and Hwang 2011).
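As a rough illustration of the covariate idea described above, the sketch below computes a logistic capture probability for each captured individual and then sums the inverse probabilities of being caught at least once, which gives a Horvitz–Thompson-type estimate of population size. It is a minimal sketch under stated assumptions, not the estimation procedure of any cited paper: the coefficient vector `beta` is taken as given (in practice it would be estimated, for example from a conditional likelihood), the capture probability is assumed constant across occasions, and occasions are assumed independent. The function names are hypothetical.

```python
import math


def capture_prob(beta, x):
    """Logistic model: p = 1 / (1 + exp(-(beta . x)))."""
    eta = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-eta))


def horvitz_thompson(beta, covariates, n_occasions):
    """Horvitz-Thompson-type population size estimate.

    covariates holds one covariate vector per *captured* individual.
    Assumes (for illustration only) the same capture probability on
    every occasion and independence among occasions.
    """
    total = 0.0
    for x in covariates:
        p = capture_prob(beta, x)
        # Probability of being captured at least once in n_occasions.
        pi = 1.0 - (1.0 - p) ** n_occasions
        total += 1.0 / pi
    return total


# Hypothetical inputs: intercept plus one covariate (e.g., weight).
beta = [-1.0, 0.5]
captured = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.3]]
n_hat = horvitz_thompson(beta, captured, n_occasions=5)
```

Each captured individual contributes more than one to the estimate, and individuals with low capture probability contribute the most, since they stand in for a larger number of similar individuals that were never caught.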

In the analysis of capture–recapture data, Hwang and Huggins (2005) and Zhang (2012) examined the effect of heterogeneity on the estimation of population size by solving estimating equations, but these authors also assumed independence of capture occasions. Capture–recapture data are collected on the same individuals across successive capture occasions. One may view capture–recapture data as binary longitudinal or repeated measurements data (Huggins and Yip 2001). These repeated observations are often correlated over time. This dependency or correlation structure may be induced by incorporating individual heterogeneity. Failure to account for this dependency may yield biased estimates. Hwang and Huggins (2007) also state that the assumption of independence among capture occasions is often violated in practice, but the authors still rely on the assumption. Some dependencies among capture occasions can be dealt with through the modeling of behavioral effects, such as trap-happy and trap-shy effects, which are treated as special cases in the capture–recapture literature (Yang and Chao 2005; Pradel and Sanz-Aguilar 2012). One alternative approach is to use generalized estimating equations (GEE) to account for a working correlation structure among capture occasions (Liang and Zeger 1986) and to use observed individual characteristics to model heterogeneity in capture probabilities. A mixed-effects modeling approach may also be used to model heterogeneity arising from observed and unobserved individual characteristics in capture–recapture experiments, which motivates the use of generalized linear mixed models (GLMM) (Pinheiro and Bates 2000). Some authors have previously introduced the use of GLMM (logit models with normal random effects) (e.g., Coull and Agresti 1999; Stoklosa et al. 2011). An advantage of using GLMM for the estimation of capture probabilities is that it accommodates not only the heterogeneity attributed to observed individual characteristics, but also the heterogeneity that those characteristics cannot explain.
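To make the notion of a working correlation structure among capture occasions concrete, the sketch below builds two structures commonly used with GEE: exchangeable (every pair of occasions shares one correlation) and first-order autoregressive (correlation decays with the gap between occasions). This is only an illustration. Which structures are actually considered in this work is not stated in this section, and the function names and the value of `alpha` are hypothetical.

```python
def exchangeable(n_occasions, alpha):
    """Working correlation with a common correlation alpha off the diagonal."""
    return [
        [1.0 if j == k else alpha for k in range(n_occasions)]
        for j in range(n_occasions)
    ]


def ar1(n_occasions, alpha):
    """Working correlation alpha**|j - k| between occasions j and k."""
    return [
        [alpha ** abs(j - k) for k in range(n_occasions)]
        for j in range(n_occasions)
    ]


# Hypothetical example: 4 capture occasions, correlation parameter 0.5.
R_exch = exchangeable(4, 0.5)
R_ar1 = ar1(4, 0.5)
```

Setting `alpha` to zero in either function gives the identity matrix, which corresponds to the independence assumption among capture occasions discussed above.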

Bayesian methods have also become popular in capture–recapture studies. An extensive Bayesian literature on capture–recapture closed population models includes Castledine (1981), Smith (1991), George and Robert (1992), Madigan and York (1997), Basu and Ebrahimi (2001), Ghosh and Norris (2005), King and Brooks (2008), and Gosky and Ghosh (2009, 2011). Bayesian statistical modeling requires the development of the likelihood function of the observed data, given a set of parameters, as well as the joint prior distribution of all model parameters. Bayesian methods also allow for estimation of the unobserved random effects, but the performance of the resulting estimates often depends on the chosen prior distributions, and the choice of priors is often subjective (Lee et al. 2003). A possible advantage of GEE over random-effects models and Bayesian methods is that GEE allows a specific correlation structure to be assumed among capture occasions.

Here, we propose a GEE approach for estimating capture probabilities and population size in capture–recapture closed population studies. Conditional arguments are used to obtain a Horvitz–Thompson-like estimator of population size. We also compare the population size estimates and their SEs obtained under the two estimation methodologies (GEE and GLMM). For illustrative purposes, we analyze a real data set that has already been discussed in the literature. A simulation study is also conducted to compare the performance of the estimation procedures. In the next section, we describe the notation and models that are used to estimate capture probabilities and population size.