A Generalization of Chao's Estimator for Covariate Information



This note generalizes Chao's estimator of population size for closed capture–recapture studies to the case where covariates are available. Chao's estimator was developed under unobserved heterogeneity, in which case it provides a lower bound for the population size. If observed heterogeneity is available in the form of covariates, we show how this information can be used to reduce the bias of Chao's estimator. The key element in this development is the placement of Chao's estimator within a truncated Poisson likelihood. It is shown that a truncated Poisson likelihood (with log link) in which all counts other than ones and twos are truncated is equivalent to a binomial likelihood (with logit link). This enables the development of a generalized Chao estimator as the estimated expected frequency of zero counts under a Poisson regression model truncated in the same way (all counts truncated except ones and twos). If the regression model accounts for the heterogeneity entirely, the generalized Chao estimator is asymptotically unbiased. A simulation study illustrates the potential gain in bias reduction. Comparisons of the generalized Chao estimator with the homogeneous zero-truncated Poisson regression approach are also provided. The method is applied to a surveillance study on the completeness of farm submissions in Great Britain.
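To make the stated equivalence concrete: for a Poisson count Y with mean λ, restricted to the values one and two, P(Y = 2 | Y ∈ {1, 2}) = (λ/2)/(1 + λ/2), so a log-linear model for λ corresponds to a logit-linear model for this probability with λ = 2 exp(x'β). The following is a minimal sketch of that idea, not the note's own implementation. It assumes that the generalized estimator takes the form n + Σ 1/(λ_i + λ_i²/2), summed over units observed once or twice, which is one reading of "estimated expected frequency of zero counts". The function names and the toy data are hypothetical, and the logistic fit uses a plain Newton-Raphson loop rather than any particular library routine.

```python
import numpy as np


def chao_estimate(counts):
    """Classical Chao lower bound: n + f1^2 / (2 * f2)."""
    counts = np.asarray(counts)
    f1 = np.sum(counts == 1)
    f2 = np.sum(counts == 2)
    return len(counts) + f1 ** 2 / (2.0 * f2)


def fit_logit(X, y, n_iter=100, tol=1e-10):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                       # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def generalized_chao(counts, X):
    """Assumed form: n + sum over units with count 1 or 2 of 1 / (lam + lam^2 / 2)."""
    counts = np.asarray(counts)
    X = np.asarray(X, dtype=float)
    keep = (counts == 1) | (counts == 2)      # truncate all other counts
    y = (counts[keep] == 2).astype(float)     # binomial response: "two" vs "one"
    beta = fit_logit(X[keep], y)
    lam = 2.0 * np.exp(X[keep] @ beta)        # logit(p) = log(lam / 2)
    f0_hat = np.sum(1.0 / (lam + lam ** 2 / 2.0))
    return len(counts) + f0_hat


if __name__ == "__main__":
    # Toy data (hypothetical): 60 ones, 30 twos, 10 threes.
    counts = np.array([1] * 60 + [2] * 30 + [3] * 10)
    intercept_only = np.ones((len(counts), 1))
    # Without covariates the two estimators should agree:
    # n + f1^2 / (2 * f2) = 100 + 60^2 / 60 = 160.
    print(chao_estimate(counts), generalized_chao(counts, intercept_only))

    # Two groups with different capture rates, coded by a binary covariate.
    counts2 = np.array([1] * 40 + [2] * 20 + [1] * 30 + [2] * 30)
    group = np.array([0] * 60 + [1] * 60)
    X2 = np.column_stack([np.ones(len(counts2)), group])
    print(chao_estimate(counts2), generalized_chao(counts2, X2))
```

With an intercept only, the fitted model gives λ = 2 f2/f1 and the sketch reduces to the classical Chao estimator. With the binary covariate, the fit is saturated and the result equals the sum of the two group-wise Chao estimates, which is at least as large as the pooled Chao estimate.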