Bayesian analysis of one‐inflated models for elusive population size estimation

Abstract The identification and treatment of “one‐inflation” in estimating the size of an elusive population has received increasing attention in capture–recapture literature in recent years. The phenomenon occurs when the number of units captured exactly once clearly exceeds the expectation under a baseline count distribution. Ignoring one‐inflation has serious consequences for estimation of the population size, which can be drastically overestimated. In this paper we propose a Bayesian approach for Poisson, geometric, and negative binomial one‐inflated count distributions. Posterior inference for population size will be obtained applying a Gibbs sampler approach. We also provide a Bayesian approach to model selection. We illustrate the proposed methodology with simulated and real data and propose a new application in official statistics to estimate the number of people implicated in the exploitation of prostitution in Italy.

One-inflation can occur for different reasons; for instance, when some units of the population can no longer be captured after the first capture. Such may be the case of some wild animal populations. In fact, animals experiencing a capture may find it so unpleasant that some develop the will and ability to avoid subsequent captures. Much the same mechanism may also occur in human populations, particularly when the first capture is a matter of law enforcement, involves imprisonment, or reveals an undesirable characteristic/behavior. See Godwin and Böhning (2017) for ample discussion of the justifications and conditions for one-inflation in capture-recapture, also including an interpretation of one-inflation as limiting case of the so-called "trap shy" behavioral model; see, for example, p. 37 of McCrea and Morgan (2014) or p. 119 of Borchers et al. (2002). One-inflation deserves specific attention due to its effect on population size estimators. In fact, when not taken into account, one-inflation causes overestimation of the total population size. This also applies to the well-known lower bound Chao estimator, as discussed in Chiu and Chao (2016) and Böhning et al. (2018).
In this paper we propose a Bayesian approach for counting data models with one-inflation. The properties of our models are analyzed with both simulation studies and real data applications. In particular, we apply our models to real data to estimate the size of some illegal populations active in Italy in 2014 and some real data available from the literature on capture-recapture, where the issue of one-inflation has been recognized.
The paper is organized as follows: In Section 2 we introduce the notation for repeated counting data and broadly illustrate Bayesian inference for population size with this kind of data. We describe the general model for one-inflated count data under an unspecified counting distribution and outline a Gibbs sampler algorithm to handle the one-inflated models. We also introduce a formal Bayesian procedure for model comparison in the presence of one-inflated models. Section 3 specifies the results under the Poisson and geometric assumptions, corroborating our proposal with a simulation study. In Section 4 we introduce the negative binomial distribution and its one-inflated counterpart discussing the boundary problem via a simulation study. In Section 5 we illustrate some applications to real cases: First we show the results of our inference on data on prostitution exploitation in Italy in 2014; moreover, we apply our models to some popular data sets in capture-recapture literature. Section 6 concludes the paper with some remarks and discussion of open issues for further investigation.

BAYESIAN INFERENCE FOR POPULATION SIZE
According to the standard formulation, consider a closed population (no births, deaths, or migration) of size . For each unit in the population, let be a random variable taking value = 0, 1, 2, … if the individual is observed/captured times. We only observe the individuals, ≤ , which are captured at least once. Let = ( 1 , … , ) be the vector of the individual number of captures. Note that will denote the result of the capture-recapture experiment, which comprises both the number of captured individuals and the number of captures for each observed individual. Let denote the number of individuals observed times, that is, is the frequency of count in sample . Our interest is to estimate the number of uncaptured units 0 , and, consequently, the total population size = + 0 , on the basis of some model for the observed .
Bayesian inference for the population size can be obtained with standard Markov chain Monte Carlo (MCMC) algorithms. In fact, let ( | ) = ( = | ) for = 0, 1 …, be the probability distribution function for . The generic expression for the likelihood ( | , ) is Assuming independent priors for and , that is, ( , ) = ( ) ( ), the posterior distribution ( , | ) can easily be drawn by, for example, updating the conditional distributions We can generate from those posteriors via Gibbs or Metropolis-Hastings steps, according to the parametric family for and the prior for .
See Tardella (2002), Wang et al. (2007), and Xu et al. (2014) for extensive simulation studies. Note the following: (1) by assuming ( ) ∝ 1∕ , the full conditional distribution of 0 = − is negative binomial with size parameter and probability (0| ) whatever the model for may be; (2) the full conditional of corresponds to its posterior distribution when the zero counts are also known.
For example, when is Poisson( ) and a priori we take the conjugate prior for , which is Gamma( , ), the latter step consists solely in the generation of a Gamma distribution with parameters given by + and + + 0 , where is the sum of the observed captures. Similarly, when is geometric( ) and a priori we take the conjugate prior for , which is Beta( , ), this step consists in the generation of a Beta distribution with parameters + + 0 and + .

One-inflated models
We assume that in our population a specific behavioral mechanism is at work, by virtue of which an individual that would otherwise face multiple captures now has a positive probability of being captured just once. Let denote the observed number of captures for a unit, and * the latent value we would observe without the behavioral mechanism. The two variables are linked by means of the following infinite transition matrix: where the ( , )th element represents the conditional probability ( = − 1 | , * = − 1). When > 1 these conditional probabilities can be written as where ( ) is Kronecker delta. Let ( | ) = ( * = | ) be the probability distribution, depending on a given parameter, , of the number of captures without the behavioral effect, and let ( ) denote the associated c.d.f. Then, the resulting distribution for is the oneinflated model defined as follows: (1 − ) (1| ) + (1 − (0| )) if = 1; The conditional distribution of * when = is concentrated on when ≠ 1, while, when = 1, we have: if > 1. (2)

Gibbs sampler for one-inflated models
Bayesian inference for one-inflated models can be obtained by simulating the posterior distribution of , , , * 1 , … , * given the observed data , where * 1 , … , * indicate the unknown captures that the observed units would have faced without the behavioral mechanism. Let us assume that the parameters , , and are a priori independent and let ( , , ) = ( ) ( ) ( ) denote the prior distribution. The general expression for the posterior distribution of oneinflated models augmented with the vector * = ( * 1 , … , * ) is To describe our approach to simulate the posterior distribution of one-inflated models, we introduce an additional latent binary variable indicating the presence/absence of the behavioral mechanism, which causes the one-inflation in unit , that is, is the indicator function of the event { ≠ * }. We then have that: and, from (2), we have .
Then, since = 1 implies * > 1, we have We can now outline a Gibbs sampler looping over the full conditionals of * and , , and . The updating of will depend on the model assumption for * and may require a Metropolis-within-Gibbs step, whereas the updating of * , , and can always be performed with the following exact Gibbs steps: (1) The simulation of the full conditional of * 1 , … , * can be obtained in two steps, by first updating 1 , … , . In fact, let = ∑ =1 be the number of units affected by one-inflation; then, conditional on the current value of and , we can generate a value for from ) .
Then, for each of the units, we can generate a value of * by simply simulating a number of captures from the truncated count distribution (3).
(2) Consider the prior ∼ ( , ), and let , be the number of units among the for which * = , such that ∑ , = . We can then write the full conditional of , ( | −) as: That is, we can directly draw from .
(3) The full conditional distribution of is given by and, by assuming the improper prior ( ) ∝ 1∕ we can directly draw 0 from the following negative binomial If we adopt a different prior over , we have to implement a Metropolis step.
Finally, as we have seen, the updating of will depend on the model assumption for * . The general expression for the full conditional of is:

Model selection
To test the one-inflation assumption with respect to a specific base count distribution we can adopt a fully Bayesian approach. Let 1 be the noninflated model and 2 the one-inflated counterpart (indicated by the OI suffix, hereafter). Model comparison can be performed by calculating the posterior model probabilities where ( | ) is the marginal likelihood that, for the models considered in this paper, can be generally written as with 1 and 2 denoting, respectively, the parameters of the baseline and the OI counterpart models. For instance, for Poisson model we have 1 = and 2 = ( , ), for the geometric case we have 1 = and 2 = ( , ). In the case of two models we can directly use the Bayes factor (BF) in favor of the OI Note that we can also extend the comparison setting by simultaneously considering more than two models. For example, in the next section we compare the Poisson and the geometric model together with the corresponding OI counterparts for a total of four models. Assuming equal prior probabilities ( ) for = 1, … , , the posterior model probabilities are proportional to the marginal likelihoods, that is, ( | ) ∝ ( | ) for = 1, … , . Note. moreover, that assuming the noninformative prior ( ) = ∕ would produce marginal likelihoods depending on the constant . However, in our case, the parameter has the same meaning across all the models under comparison, hence the use of the same improper prior ( ) = ∕ is justified and the constant cancels out in the evaluation of the posterior model probabilities, see Kass and Raftery (1995).
Analytical evaluation of the marginal likelihoods ( | ) is not possible. However, we have that (see the Appendix) Hence, the posterior model probabilities will depend solely on fitting the truncated distribution of to the observed captures.
To evaluate the marginal likelihood of each model numerically, we use the Chib's approximation introduced in Chib (1995), which can easily be obtained as a by-product of the general Gibbs algorithm illustrated in the previous section. The details of the Chib approximation for all the models considered throughout this paper are given in the Appendix.
Finally, it is worth noting that, in the context of capture-recapture, model averaging appears to be a suitable alternative to model selection. In fact, the quantity of interest has the same meaning across different models and we can easily obtain an estimate of averaged over the eligible alternatives via the following formula: whereˆis the posterior mean of obtained under model . However, since the estimates of under the base model and under its one-inflated counterpart may show very considerable differences, definite choice between the two could be a sensible approach in this case.

ONE-INFLATED POISSON AND GEOMETRIC DISTRIBUTIONS
If we assume that our count data * follows a Poisson distribution, that is, ( ) represents a Poisson density with parameter , the model proposed for the observed in previous Section 2.1 is an OIP and corresponds to the model presented in Godwin and Böhning (2017). The estimating procedure is based on the Gibbs sampler described in Section 2.1, where, in order to complete the analysis framework, we assume a Gamma( , ) prior for , , and being shape and rate parameters. Let * be the total number of units captured times after updating 0 , , and * , that is, * = and let { * } denote the set of all values * for = 0, 1, … We can then generate the updated value for from its full conditional ( . If we adopt a geometric distribution for * , parameterized as the resulting model for is called one-inflated geometric (OIG). To finalize the Bayesian analysis, we adopt a ( , ) conjugate prior for , and its posterior conditional on the current values of 0 , , and * would be equal to: .

A simulation study
In this section we present a twofold simulation study; on one hand, we aim to validate our proposal for inference on the population size in the presence of one-inflation, while on the other hand the results of the simulation study illustrate the model selection among the four models presented in the previous section, namely, Poisson (which we refer to as model Poi), Geometric (Geo), OIP, and OIG. Specifically, we set up three main scenarios: In the first we generate from the base distributions without one-inflation; in the second scenario, we generate from one-inflated distributions with a low/moderate inflation rate ( = 0.2), while in the third we consider a substantial inflation rate ( = 0.5). We repeat each scenario with two different values of the parameter ( or ) and with two different values of (500 and 1000). We set the parameters using values similar to those from the real cases analyzed in Section 5. The scenarios and the values of the different parameters are summarized in Table 1.
For each combination of parameters in each scenario we simulate 100 data sets of units from the respective generating model and remove the 0 counts from the sample. To simulate from the one-inflated models in Scenarios II and III, we generate from the corresponding base model and then change each generated value greater than 1 to a 1 with probability . All the experiments were conducted in R and the code is available as Supporting Information on the journal's web page. First, we set out to evaluate the sensitiveness of the estimates of the unobserved population size 0 under mispecification of the model. For each simulated data set, we consider the estimates of 0 , given by the posterior mean, under all four models, and compute relative bias calculated as the relative difference between the true value and the posterior mean of the parameter. Table 2 shows the average percentage relative bias over the 100 replicates.
The results set out in Table 2 confirm that the estimates of 0 we obtain with a one-inflated model are always lower than those obtained with the corresponding base model. In fact, ignoring one-inflation when present leads to severe and systematic overestimate of 0 . On the other hand, admitting one-inflation when it is not present is not such a serious error and, on average, we moderately underestimate 0 . Choosing the wrong model (Poisson instead of geometric, inflated or not) can have disastrous consequences. In particular, if data come from Poi or OIP models, a Geo or OIG model would drastically overestimate 0 . If data are generated from a Geo or OIG model, choosing a Poi or OIP model implies an equivalent underestimate of 0 . Note that, the two cases having the highest relative bias under the correct models can be justified by the observed number of captures. In particular, when the generating model is Here we will not present the results concerning the relative root mean squared error and the relative mean absolute error, which in any case, confirm the results presented on the relative bias.
These results are also confirmed on analyzing the coverage of the posterior credible intervals, not reported here for brevity but computed by the R code available in the Supporting Information on the journal's web page. The posterior credible intervals of the one-inflated model almost always contain the true values when we generate from the corresponding baseline distribution. On the other hand, when we generate from a one-inflated model, the credible intervals of the baseline model barely cover the true values. The credible intervals deriving from the Poisson models (regardless of oneinflation) seldom cover the true value generated by the geometric distribution, and vice versa. The only exception is the case in which we generate from OIG ( = 0.6, = 0.5) and estimate with a Poisson distribution (see the bottom row in Table 2), in which case the baseline Poisson credible intervals cover the true value nearly 50% of the times.
Next, to assess the model selection criterion detailed in the previous section, Figures 1 and 2 show the posterior probabilities of our four competing models calculated with Chib's approximation. Figure 1 summarizes the results in all the scenarios when = 500, while Figure 2 refers to the case = 1000.
It is evident that, as the number of observed units increases, the effectiveness of the posterior model probabilities in identifying the correct generating model is reinforced. Note that depends both on and on the parameters and . It is also evident that a higher inflation rate will be more easily identified correctly. In fact, when = 1000, we would select the true data-generating model in almost all simulations in Scenarios I and III, and in most cases in Scenario II. For the sake of brevity, here we do not present the results when = 2000 or higher, since in all scenarios and parameter combinations the posterior model probability of the generating model is close to one. When = 500, we would still identify the correct generating model in the majority of cases, but we can observe some critical situations. In particular, when the generating model is OIP with = 1 and = 0.2, and when we generate from the OIG with = 0.6 and = 0.2, the correct model and its base counterpart are almost equally preferable. In the former case we have = 316 and 1 = 183 on average, that is, most of the units are captured once. Consequently, the posterior probabilities are very similar due to such a slight alteration in singleton counts from the basic Poisson distribution. Much the same happens in the latter case, with an even lower number of observations (on average = 200).
For a simulation study using frequentist criteria for model selection (Akaike information criterion [AIC] and Bayesian information criterion [BIC]) see Böhning and Ogden (2021).
In conclusion, as expected, the one-inflation models encompass the baseline models and, when one-inflation is not present, the slight underestimation of decreases as increases. Clearly, the choice of the distribution is a crucial aspect, and the Bayesian approach gives us a powerful tool to deal with model selection. F I G U R E 1 Box-plot of posterior model probabilities when = 500; the data-generating model is indicated above each panel

ONE-INFLATED NEGATIVE BINOMIAL
In this section we describe how to perform Bayesian estimation of the population size in the presence of one-inflation when the base distribution is the negative binomial model. We also underline the inferential drawbacks related to this distribution, which limit its general use and how the Bayesian approach mitigates these problems. The negative binomial distribution (NB) is often adopted as a two-parameter generalization of Poisson that can take into account overdispersed count data. It also constitutes a generalization of the geometric distribution, with respect to which it allows for both overdispersion and underdispersion. Its use is well known in capture-recapture, and has also been investigated in the presence of one-inflation in Godwin (2017).
Here we assume that the unobserved count * follows an NB model with the following parameterization in terms of and :

F I G U R E 2
Box-plot of posterior model probabilities when = 1000; the data-generating model is indicated above each panel and we will call the resulting model for one-inflated negative binomial (OINB). In our Bayesian approach, we set two independent priors on the parameters and . For we take a ( , ) prior, while for we compare Gamma and Inverse Gamma priors in order to evaluate the different tail behavior of these distributions on the posterior summaries.
The Gibbs sampler we developed follows the same passages presented in Section 2.1, where ( ) takes the form (5). Recall that * represents the number of units captured times after updating 0 , and * . Then, generating from the full conditional of presents no difficulties, as it turns out to be: .
To update , we compare two different approaches: a Gaussian random-walk Metropolis-Hastings step and the two-stage Gibbs sampler proposed by Zhou and Carin (2015). Note also that the presence of a Metropolis step does not preclude calculation of the marginal likelihood ( | ) with Chib's approximation for the negative binomial model and for the corresponding OI counterpart, as illustrated in Chib and Jeliazkov (2001). The Appendix provides details of the marginal likelihood approximation for these models.

Metropolis-Hastings
The full conditional of results in: If we consider a Gaussian random walk Metropolis-Hastings, we accept a proposed value ′ with probability equal to the minimum between 1 and Zhou and Carin (2015) exploit the representation of the negative binomial as a compound Poisson distribution, introduced by Quenouille (1949):
They found the explicit distribution of the full conditional of to be the Chinese Restaurant Table (CRT) distribution with concentration parameter . The two Gibbs steps are then: (1) We sample the latent counts, , associated with each observed count * , which can be generated as: ) .
(2) We sample from its full conditional which, given the conjugacy between the Gamma prior for and the Poisson distribution, results in .
Note that, since the total number of captures is often in the order of thousands, and in (6) we are only interested in generating the sum of the , we can simply adopt a Gaussian approximation in the first step. That is, .

Boundary problem
The use of the NB in capture-recapture is limited by the so called "boundary problem," see, for example, Böhning (2015). That is, when the estimate of approaches zero, the Horvitz-Thompson estimation of the population size diverges. More generally, when in the observed (truncated) data the mean number of captures is close to one (which is typically the case in the presence of one-inflation), the NB model severely overestimates , sometimes by several orders of magnitudes, even in simulated data generated by the NB itself. As pointed out in Godwin (2017), taking into account one-inflation alleviates this phenomenon, but does not completely avoid it. We can confirm that, even in our Bayesian approach to the OINB model, we come up against the boundary problem. In general, we noted a great sensitivity of estimates of to small differences in the value of parameter , particularly when < 1, and, accordingly, a great sensitivity of the estimates to specification of the prior distribution over . We see this phenomenon as an opportunity to investigate the usefulness of the Bayesian approach in further alleviating the boundary problem under the OINB. To this end, we conduct a simulation study to assess the effect of different prior specifications on the parameter . We generate 100 replications of random values drawn from an OINB with parameters = 0.35, = 0.5, and = 0.5, and we go on to test two values for , 5000 and 500. The observed sample size varies at each replication; its expected value over the 100 replications is 2040, and 204 when = 5000 and = 500, respectively. The values of these parameters are comparable to the values studied in Godwin (2017), in the frequentist setting, and they allow us to mimic some real cases analyzed in Section 5. All the experiments were conducted in R; the code is available as Supporting Information on the journal's web page.
We test some prior specifications on the parameter, considering both the Gamma and the Inverse Gamma distributions. For estimation of , we apply both the Metropolis-Hasting step and the two-stage Gibbs sampler proposed by Zhou and Carin (2015), observing negligible differences in the results. The outcomes presented in this section are obtained using the Metropolis-Hasting approach. Finally, we compare the results with the maximum likelihood estimates for the OINB. Table 3 shows the percentage relative bias and the percentage mean squared error (MSE) of the population size estimates, considering the difference between the true value and the mean of the posterior distribution obtained by the MCMC simulations. Table 3 also gives the number of cases, in percentage, where we encountered the boundary problem. In fact, we can define the boundary problem on botĥand̂. We adopt the following convention: On̂, we set the boundary problem if̂< 0.25, while on̂, this is the case if̂> 5 . Finally, Table 3 presents the results of the maximum likelihood approach (MLE), obtained using the model proposed by Godwin (2017) and the R code provided by him as Supporting Information.
The Bayesian procedure implements the algorithm described in Section 4.1, setting the number of replications of the MCMC algorithm to 2 ⋅ 10 6 . We set, a priori, ( ) ∝ 1∕ , and (1, 1) for both and . From Table 3, it can be seen that a weakly informative prior specification for , like (1, 1) can already help reduce the boundary problem, when compared to the MLE approach. The boundary problem can be yet further limited using the Inverse Gamma as prior distribution for . In the simulation, the Inverse Gamma prior has the double advantage of reducing both the boundary problem and the MSE of the estimates, at the cost of introducing a negative bias (underestimation) of the population size , which is more severe for small s. Note that we used the convention of defining the occurrence of the boundary problem when̂< 0.25, while in Godwin (2017) the boundary problem is fixed at̂< 0.05. We believe that̂< 0.25 already suffices to indicate the presence of this phenomenon since, as clearly emerges from Table 3, it corresponds approximately to an estimate of 5 times larger than its true value.
To further illustrate the performance of the NB and the OINB, with and without the boundary problem, we compare them with the models considered in Section 3 via a simulation study. In particular, we generate values from the NB with parameters = 5000, = 0.35, and from the OINB with parameters = 5000, = 0.35, and = 0.5, under different scenarios for the size parameter . For each scenario we generate 100 data sets and calculate the estimates of given by the posterior mean, under the six models: Poisson, geometric, negative binomial, and their one-inflated counterparts.  Table 4 shows the average percentage relative bias and relative MSE over the 100 replicates. As we have said, the value of the parameter appears to be crucial in identifying the boundary problem for the NB model, and, under the OINB model, , too, has a clear role. As a consequence, the critical values for differ under the two models. In our data generated from the NB, with the aforementioned values for and , we start to observe a substantial instability in the estimates when = 0.25, and the sheer overestimation of from the NB itself appears clearly in all simulations when = 0.1 (not showed in the table). When we generate from the OINB, estimates derived from the OINB itself start to show the same problem when = 0.5.
We can see in Table 4 that, in the absence of the boundary problem, ( = 1.5 in both cases), the results confirm that the two models can be safely utilized if their respective model assumptions hold; in fact, they perform better than all other competing models. As already observed in Section 3, admitting one-inflation when it is not present leads to moderate underestimation, while ignoring one-inflation when present causes severe overestimation of . In fact, in all cases, the NB overestimates by several orders of magnitude with data generated from the OINB.
A counterintuitive case is given by the data generated from the OINB with = 0.5, in which case the OINB itself results as the second best model, the best being the noninflated geometric. The explanation we gave to this result is the following: The geometric model ignores one-inflation, and this fact should lead to an overestimation of , but at the same time, it fixes the parameter to 1, which is higher than the actual parameter of the generating model ( = 0.5), and this fact should imply an underestimation of . Apparently, in our simulation, these two factors balance each other, giving the geometric a better performance than the OIG and the OINB itself.
In conclusion, when the model hypothesis are met, and the boundary problem is absent or not too serious, for values of greater than 0.25 under the NB, and greater than 0.5 under the OINB, the use of an Inverse Gamma prior may alleviate the phenomenon. However, when the problem is evident, we advise against the use of the two models.

RESULTS ON ESTIMATING ILLEGAL POPULATIONS
Illegal activities are by their very nature difficult to measure because the people involved have obvious reasons to hide them. In this section, we apply our models to estimate the number of people implicated in the exploitation of prostitution, in Italy in 2014. In addition, in Section 5.1 we illustrate the results obtained on some well-known data sets in capturerecapture literature. In Italy, prostitution is neither prosecuted nor regulated, but trafficking, exploitation, and aiding and abetting of prostitution are crimes subject to legal sanctions. These activities are mostly under the control of organized crime. In this study we exploit administrative records from the Ministry of Justice, which report complaints for which the judicial authority has initiated criminal proceedings.
On the basis of soft identifiers (date, country of birth, and gender), the perpetrators can be identified and followed over a given time span, which is 1 year in this application. In this way, the administrative source can be viewed as listing potential exploiters of prostitution and we can observe the number of times an individual is charged. Obviously, we cannot observe the units not captured by the Justice system. We aim to estimate the hidden part of the population, that is, the size of those unreported to the Public Prosecutor's offices. Capture-recapture models have already been used to investigate prostitution and sex workers; see, for instance, Rossmo and Routledge (1990), which estimates the number of street prostitutes in 1986/1987in Vancouver, and Roberts Jr and Brewer (2006, which estimates the number of their clients. In this paper, we aim to estimate the size of prostitution exploiters, rather than the number of prostitutes or their clients. Our data on prostitution exploiters refer to perpetrators of adult sexual exploitation, according to the international classification ICCS (UNODC, 2015) ; these crimes include recruiting, enticing, or procuring a person into prostitution; pimping; keeping, managing or knowingly financing a brothel; knowingly letting or renting a building or other place for the purpose of the prostitution of others. Figure 3 depicts our data. The total number of observed prostitution exploiters is = 2740, the "one" counts are 1 = 2269. Counts greater than 5 are relatively few; 12 is the maximum number of observed captures.
We compared all three basic models analyzed in this paper and their one-inflated counterparts on these data. In all one-inflated models we set a uniform ∼ (1, 1). We set ∼ (1, 1) in the geometric and OIG models, and ∼ (0.01, 0.01) in the Poisson and OIP. Different values for the Gamma prior were also tested, obtaining very similar results. As for the negative binomial, the boundary problem emerged clearly, as, when adopting a (0.1, 0.1) prior for , we obtained a posterior mean for 20 times greater than any other model (498,000). For this reason, we opted for an (0.1, 0.1), both on the NB and the OINB models. In all cases, the number of replications of the MCMC algorithm is set to 10 6 with a thinning of 20 observations. As priors over , we tried both Rissanen's and the improper  ( ) ∝ 1∕ . The two alternatives gave almost identical results. Standard diagnostic tools confirmed the convergence of the algorithms.
The results are summarized in Table 5 and in Figure 4. Figure 4 shows the estimated posterior distributions of 0 and of the parameters of the one-inflated models. The regular shape of the posterior distributions is evident from Figure 4, so the differences in adopting the posterior mode, median, or mean are quite negligible. Regularity of the posterior distributions was consistently observed in all the applications and simulations presented in this paper. Regularity of the posterior distributions does not hold for the 0 and the of the OINB model, due to the boundary problem.
In the upper part of Table 5 we give the estimates deriving from the Poisson, geometric, and negative binomial that ignore one-inflation and compare them to Chao and Zelterman estimators. In the lower part of the table, we give the results from the one-inflated counterparts of the 3 models and compare them to the modified Chao estimators, as suggested in Böhning et al. (2018). This estimator depends on the baseline distribution; we evaluate it assuming both Poisson and geometric distribution with one-inflation (Mod.Chao.OIP and Mod.Chao.OIG, respectively), as in Böhning and Ogden (2021).
In Figure 3, the presence of one-inflation seems likely, and is, in fact, largely confirmed by the test introduced in Section 2.3. Both the OIP and the OIG have posterior probabilities several orders of magnitudes greater than the Poisson and the geometric. The log marginal likelihoods are: −1863.39 (Poi), −1756.23 (Geo), −1718.21 (OIG), −1761.95 (OIP). The OINB model was found to have by far the highest log marginal likelihood, namely −1712.25. However, we believe that caution should be used in adopting the estimates from the OINB. In fact, the boundary problem seems evident (̂= 0.2), and the uncertainty contained in the estimate of 0 is excessive (the width of the interval estimates is about 25 times greater than the total number of observed units).
As expected, if we ignore one-inflation, we risk severely overestimating the population size. Geometric and negative binomial distributions account for heterogeneity and produce much larger estimates than the Poisson distribution.

Results from some popular case studies
In this section, we apply the Bayesian model to a selection of well-known cases popular in the capture-recapture literature. We consider the following real cases: 1 Street prostitutes in Vancouver: The data show the count of prostitution arrests made by the Vancouver Police Department Vice Squad for engaging in prostitution in 1986/1987, initially presented and analyzed by Rossmo and Routledge (1990); 2 Opiate users in Rotterdam: The data show the number of applications for a methadone treatment program made by opiate users in Rotterdam in 1994, first reported and analyzed by Cruyff and van der Heijden (2008); 3 Heroin users in Bangkok: The data provide the counts of treatment episodes by heroin users in Bangkok in 2002, available in Viwatwongkasem et al. (2008) and previously analyzed by Böhning et al. (2004). The observed count distribution of the three real cases are shown in Table 6. In the Vancouver prostitutes data set, we observe = 886 individuals and the number of units captured once is 1 = 541. The Rotterdam opiate-user data set contains = 2029 units and 1 = 1206. The Bangkok heroin-user data set provides = 9302 observations with 1 = 2176.
These data sets have been widely examined in capture-recapture literature, also under the one-inflation hypothesis, see Godwin and Böhning (2017) and Godwin (2017).
We apply our models to the three case-studies, with the following prior settings: For the Poisson and OIP models we set, a priori, ∼ (1, 1) and ∼ (0.1, 0.1). In the OINB model we set ∼ (0.1, 0.1) and ∼ (1, 1). In all our applications, the number of replications of the MCMC algorithm is 10 6 with a thinning of 20 observations. Standard diagnostic tools confirmed the convergence of the algorithm. The results for all three data sets are summarized in Table 7, which shows the posterior modes and credible intervals of , and the posterior means of the model parameters.
The presence of one-inflation in these data sets is less severe than in the prostitution exploitation data analyzed in the previous section. However, as expected, estimates from the base distributions are consistently greater than the corresponding one-inflated estimates, confirming that we might be overestimating the population size if we ignore one-inflation.
For the Vancouver prostitute data, our model selection strategy strongly suggests the OINB distribution, its posterior probability being several orders of magnitudes greater than the competing models. The inflation rate is estimated around 0.40. The base negative binomial encounters the boundary problem, as is clear from the estimate and even more from the credible intervals for . OINB and OIP models produce similar estimates for , with the credible intervals mostly overlapping (the 95%HPD under OINB is slightly greater than under OIP), while the OIG's credible interval barely overlaps the others.
As for the Rotterdam opiate-user data, Bayesian model selection largely favors the geometric distribution, with a posterior probability of 0.89, against 0.104 and 0.006 for OIG and OINB, respectively; the Poisson models posterior probabilities being negligible, both the baseline and the one-inflated. In this case, the one-inflation does not seem to affect the data.
The posterior model probabilities for Bangkok heroin-user data favor the OINB model, even though the estimated inflation rate is quite low, a mere 0.056. The boundary problem is not an issue with this data set, since the estimate of is rather greater than 1.
In all cases, the OINB model produces estimates for higher than the OIP and lower than OIG. Also the one-inflation rate estimates under the OINB model prove always lower than the estimates obtained from the OIP model and higher than those from the OIG. It appears that by using the OINB, part of the one-inflation component identified by the OIP is instead explained through the two parameters of the negative binomial. The credible intervals of the OIP are consistently smaller than those of the competing models, and barely overlap, with the exception of Vancouver prostitute data, where actually the OINB model tends to the OIP one (note the high estimates for the parameter ).
The results in Table 7 can be compared with non Bayesian results reported in Godwin and Böhning (2017) and Godwin (2017), for the OIP and negative binomial models. We note that the use of weakly informative priors leads to results that are close to the frequentist approach. Moreover, the results from our Bayesian model selection strategy are also confirmed by likelihood ratio tests proposed in Godwin (2017), even if likelihood ratio tests provide less strong evidence than our results.

CONCLUDING REMARKS AND FUTURE WORKS
In this paper we have dealt with the issue of one-inflation on repeated count data in population size estimation, adopting a fully Bayesian approach. We discussed our model for one-inflation under an unspecified count distribution, describing TA B L E 7 The posterior mode and credible intervals for the population size , posterior mean for , and model parameters, for real cases a general Gibbs sampler. Specifically, we derived the conditional distributions of the model parameters under the Poisson and geometric assumption; moreover, to deal with data that show overdispersion, we also illustrated the Bayesian analysis for the negative binomial model. We considered the boundary problem of the negative binomial distribution; in the Bayesian setting the prior parameter specification might help alleviate it. A fully Bayesian model selection approach, which includes testing for the one-inflation assumption, was developed for all the distributions considered in the paper. Alongside the usual advantages of a Bayesian approach, namely, the possibility of incorporating any prior knowledge in the analysis and ease in producing interval estimates of any quantity as a by-product of the estimation procedure, we recognize a less obvious point in favor. In fact, although, admittedly, it is not common to have prior information on the quantities at hand, even weakly informative priors can have a positive impact on the analysis. As we saw in Section 4.3, the use of a weakly informative prior when using a negative binomial model or its one-inflated counterpart can help stabilize the estimation procedure and avoid the "boundary problem" in case of moderate severity. On the other hand, the choice of the prior distribution for the size parameter of the negative binomial may affect model selection procedures, which require additional investigation in order to allow a more general use of such distribution in capture-recapture models.
We are currently working on extensions of the current model to cope with observed and unobserved heterogeneity in the presence of one-inflation, exploiting individual covariates, and introducing more complex hierarchical structures and mixing models.
Moreover, we are considering the possibility of taking model uncertainty into account with a model averaging technique in a single procedure by exploiting the reversible jump algorithm (see Green, 1995).
In addition, when dealing with sensible data, like the prostitution exploitation data, which do not share a unique identifier, we may encounter record linkage problems. In this case, it would be important also to take into account the record linkage process uncertainty in population size estimation; see Tancredi and Liseo (2011). Note also that linkage errors can themselves produce one-inflation. In fact, when matching information does not suffice to recognize multiple captures of the same individual, the resulting missing links erroneously increase the number of singletons. However, it is worth nothing that, unlike the case with the framework considered in this paper, linkage errors also affect the observed sample size .
Finally, we are investigating more general behavioral mechanisms producing different forms of inflation. For example, we could assume that when the latent count * is equal to , instead of necessarily having an observation equal to 1 or to the true value , we have that follows a mixture of two distributions. In particular we may have a mixture component with weight 1 − concentrated on the latent value * = . The other component with weight may have support on the set {1, … , } and can, for example, be a Binomial( , ) truncated on 0. Thus, when = 0 we have exactly the form of inflation discussed in this paper while when > 0 the model also allows us to inflate counts greater than one, generalizing the effects of the behavioral mechanism.

A C K N O W L E D G M E N T S
Open Access Funding provided by Universita degli Studi di Roma La Sapienza within the CRUI-CARE Agreement.

C O N F L I C T O F I N T E R E S T
The authors have declared no conflict of interest.

D ATA AVA I L A B I L I T Y S TAT E M E N T
Data sharing not applicable -no new data generated.

O P E N R E S E A R C H B A D G E S
This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available in the Supporting Information section. This article has earned an open data badge "Reproducible Research" for making publicly available the code necessary to reproduce the reported results. The results reported in this article could fully be reproduced.

R E F E R E N C E S
valid for each point ( , ). To approximate the marginal likelihood we may select a point (̃,̃) given, for example, by the posterior means obtained with a first run of the Gibbs sampler and then estimate the value of the posterior (̃,̃| , ) via a second run by using the following strategies.