Summary
1. Population assessment in changing environments is challenging because factors governing abundance may also affect detectability and thus bias observed counts. We describe a hierarchical modelling framework for estimating abundance corrected for detectability in metapopulation designs, where observations of ‘individuals’ (e.g. territories) are replicated in space and time. We consider two classes of models. First, we regard the data as independent binomial counts and model abundance and detectability based on a product-binomial likelihood. Secondly, we use the more complex detection–nondetection data for each territory to form encounter history frequencies, and analyse the resulting multinomial/Poisson hierarchical model. Importantly, we extend both models to directly estimate population trends over multiple years. Our models correct for any time trends in detectability when assessing population trends in abundance.
2. We illustrate both models for a farmland and a woodland bird species, skylark Alauda arvensis and willow tit Parus montanus, by applying them to Swiss BBS data, where 268 1 km^{2} quadrats were surveyed two to three times during 1999–2003. We fit binomial and multinomial mixture models where log(abundance) depended on year, elevation, forest cover and transect route length, and logit(detection) on year, season and search effort.
3. Parameter estimates were very similar between models, with credible intervals overlapping for most parameters. Trend estimates (binomial vs. multinomial) were similar for skylark (−0.074 ± 0.041 vs. −0.047 ± 0.019) and willow tit (0.044 ± 0.046 vs. 0.047 ± 0.018). As expected, the multinomial model gave more precise estimates, but also yielded lower abundance estimates for the skylark. This may be due to effects of territory misclassification (lumping error), which do not affect the binomial model.
4. Both models appear useful for estimating abundance and population trends free from distortions by detectability in metapopulation designs with temporally replicated observations. The ability to obtain estimates of abundance and population trends that are unbiased with respect to any time trends in detectability ought to be a strong motivation for the collection of replicate observation data.
Introduction
Detection and explanation of spatial and temporal patterns in abundance lie at the heart of ecology (Krebs 2001) as well as of its applications such as conservation biology, pest management or monitoring science (Caughley 1994; Norris 2004). Abundance N (also called local abundance N_{i} at site i) is the key state variable for describing populations, but it can hardly ever be measured without error because detection is imperfect: in most situations, some individuals will be missed, that is, detection probability p < 1. Hence, simple counts C are not equivalent to N but are related to abundance by the well-known relationship E(C) = Np (Williams, Nichols & Conroy 2002); they are only indices of abundance, with the expected value E of a count being a proportion p of N. In the absence of double counting, simple counts almost always underestimate abundance. In addition, spatial and temporal patterns in simple counts will be due to both patterns in abundance and patterns in detection (Kéry 2008). Hence, when unbiased estimates of abundance are required or when population trends need to be assessed free from possible distorting patterns in detectability, abundance must be estimated separately from detectability (MacKenzie & Kendall 2002; Kéry & Schmid 2004; Kéry & Schmidt 2008).
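The relationship E(C) = Np can be checked with a small simulation. The sketch below is purely illustrative: the values N = 50 and p = 0.3 and the function name simulate_counts are assumptions, not quantities from this study.

```python
import random

def simulate_counts(N, p, n_surveys, seed=1):
    """Simulate repeated counts of N individuals, each detected
    independently with probability p on every survey."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(N)) for _ in range(n_surveys)]

# Hypothetical values chosen only for illustration.
N, p = 50, 0.3
counts = simulate_counts(N, p, n_surveys=2000)
mean_count = sum(counts) / len(counts)

# The average count should be close to N * p and can never exceed N.
print(f"true N = {N}, mean count = {mean_count:.2f}, N * p = {N * p:.2f}")
```

In such a simulation no single count can exceed N, and the average count sits near Np, which is why an uncorrected count serves only as an index of abundance.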
Over the last decades, an armada of protocols and associated statistical models have been developed to ‘adjust’ simple counts by an estimate of detection probability and thus arrive at an estimate of abundance. Examples include distance sampling (Buckland et al. 2001) and a large array of capture–recapture protocols (Borchers, Buckland & Zucchini 2002; Williams et al. 2002). Distance sampling uses the distribution of detection distances to provide information about detection probability, while capture–recapture uses the pattern of detection/nondetection over replicated surveys from a period during which a population can be assumed static or closed. In both frameworks, the estimate of detection probability provides the direct link between the observed count and the estimate of population size.
Much of ecology and its applications is concerned with comparisons of abundance in space, and consequently abundance is frequently assessed at multiple sites and using similar protocols at each. This, in essence, represents a metapopulation design (Royle 2004b). Analysis of abundance and detection on a site-by-site basis would be highly inefficient or sometimes impossible due to locally small sample size or even zero counts. Instead, an integrated analysis is required for the most efficient use of available information and to directly model patterns of abundance.
Recently, hierarchical models have been developed for inference about abundance and detection that explicitly account for a metapopulation design (Royle & Dorazio 2006, 2008; also see Borchers et al. 2002). These models have one stochastic component to describe spatial and possibly temporal variation in abundance and another stochastic component that specifies the stochastic outcome of the observation process. The beauty of these models is that, conceptually, they provide a truly mechanistic rendering of how counts of organisms arise as a result of two linked stochastic processes, an ecological process and a dependent observation process. Different descriptions of these two processes can simply be combined in a modular fashion as needed for the particular study at hand. This means that a large variety of animal sampling protocols can all be subsumed into a generic hierarchical model by simply using different stochastic descriptions of the observation process (Royle 2004b). Examples include distance sampling (Royle, Dawson & Bates 2004), point counts (Royle 2004a), removal sampling (Dorazio, Jelks & Jordan 2005), detection–nondetection (a.k.a. ‘presence–absence’) sampling (MacKenzie et al. 2002; Royle & Nichols 2003; Dorazio 2007; Royle & Kéry 2007) and capture–recapture proper (Royle et al. 2007; Webster, Pollock & Simons 2008).
In this article, we compare two classes of hierarchical models that are particularly useful for inference about abundance in metapopulation designs, that is, when observations of ‘individuals’ are replicated in space and time: the binomial mixture model (Royle 2004a) and the multinomial mixture model (Royle et al. 2007). ‘Individuals’ may be any individually recognizable units, such as individual animals or plants, breeding pairs or territories, or even species (Kéry 2009). Here, we use territory-mapping data from the Swiss breeding bird survey MHB (Schmid, Zbinden & Keller 2004) for two species; hence, individuals represent individual territories. First, in the binomial mixture model, we regard the data as independent binomial counts and inference is based on a product-binomial/Poisson hierarchical model. Secondly, we use the more complex detection–nondetection data for each territory to form encounter history frequencies for each site, and our analysis is based on a multinomial/Poisson hierarchical model. As the data for the former are just an aggregated form of the more detailed data format used in the latter, we expect very similar inferences under these two models. We hypothesize better precision for the multinomial model because it uses a more detailed (i.e. information-rich) format of the data. However, the data collection assumptions are somewhat stricter, as we describe subsequently.
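To make the product-binomial/Poisson structure concrete, the following minimal sketch evaluates the marginal likelihood of the replicated counts at a single site by summing over the unobserved abundance N. It assumes a plain Poisson abundance distribution with constant detection probability; the covariates and Normal random effects used in the actual analysis are omitted, and the function names and the truncation point n_max are assumptions made for illustration.

```python
from math import comb, exp, lgamma, log

def poisson_pmf(n, lam):
    """Poisson probability of n given mean lam (lam > 0), computed on the log scale."""
    return exp(-lam + n * log(lam) - lgamma(n + 1))

def binom_pmf(c, n, p):
    """Binomial probability of counting c out of n individuals."""
    return comb(n, c) * p**c * (1 - p) ** (n - c)

def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of replicated counts at one site.

    Sums Poisson(N | lam) * prod_t Binomial(c_t | N, p) over all N from
    max(counts) up to n_max, an assumed truncation of the infinite sum.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for c in counts:
            term *= binom_pmf(c, n, p)
        total += term
    return total

# Hypothetical counts from three visits to one quadrat.
print(site_likelihood([3, 1, 2], lam=4.0, p=0.4))
```

With a single visit the sum collapses to a Poisson distribution with mean lam * p, which shows why one count per site cannot separate abundance from detection and why replicate visits are needed.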
Importantly, in our comparison of the binomial and the multinomial mixture models, we extend both models to directly estimate population trends over multiple years (see also Royle & Dorazio 2008, pp. 4–7 and Kéry et al. 2009 for similar models). That is, our models enable one to estimate population trends corrected for any (parallel) time trends that might exist in detectability. We believe that the ability to directly model population dynamics (here, a simple log-linear population trend) embedded within a framework that fully accounts for the observation process (here, imperfect detection) will be of great value to monitoring and ecological studies alike.
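One way to write down a log-linear trend embedded in a model with imperfect detection is sketched below for the binomial mixture case. This is a hedged reading based on the description above, not a transcription of the fitted model: the symbols, the coding of year as t − 1, the indexing of the site random effect and the omission of a random effect on detection are all assumptions.

```latex
\begin{align*}
N_{it} &\sim \mathrm{Poisson}(\lambda_{it}), \\
\log \lambda_{it} &= \alpha + r\,(t - 1) + \boldsymbol{\beta}^{\top}\mathbf{x}_{i} + \varepsilon_{i},
  \qquad \varepsilon_{i} \sim \mathrm{Normal}(0, \sigma_{\lambda}^{2}), \\
C_{ijt} \mid N_{it} &\sim \mathrm{Binomial}(N_{it}, p_{ijt}), \\
\operatorname{logit}(p_{ijt}) &= \gamma_{t} + \boldsymbol{\delta}^{\top}\mathbf{w}_{ijt}.
\end{align*}
```

Here i indexes quadrats, j replicate surveys within a year and t years; x_{i} stands for site covariates of abundance and w_{ijt} for survey covariates of detection. Under this form, expected abundance at a site changes by the factor e^{r} from one year to the next, while the year-specific intercepts γ_{t} absorb any temporal change in detection.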
Results
For both species, estimates of key parameters were similar under the binomial and the multinomial mixture models (Tables 1 and 2). In particular, trend estimates (±SE) were very similar for the skylark (−0.074 ± 0.041 vs. −0.047 ± 0.019, for the binomial and the multinomial mixture model respectively) and the willow tit (0.044 ± 0.046 vs. 0.047 ± 0.018). For most parameter estimates, and especially for the trend estimates, credible intervals under the two models overlapped. The sole exceptions were four annual detection intercepts in the skylark, where estimates were higher under the multinomial than under the binomial mixture model (Table 1). The willow tit showed a slight tendency towards the same pattern (Table 2). As a consequence, the multinomial model yielded lower estimates of abundance than did the binomial mixture model (Fig. 1a,b). (Note that predictions for the response to elevation are made at the mean value of all other covariates in the model, which means, for instance, that they show the hypothetical response to elevation that would be expected at a constant forest cover of 35%.) As expected, the multinomial mixture model yielded estimates with greater precision than the binomial mixture model (Tables 1 and 2). In particular, SEs for the trend estimates under the binomial were about twice as large as those under the multinomial mixture model.
Table 2. Comparison of parameter estimates (posterior means, SD, and lower (2.5%) and upper (97.5%) credible limits) under the binomial and the multinomial mixture models for Swiss willow tits (Parus montanus), 1999–2003
Parameter  Binomial mean  Binomial SD  Binomial 2.5%  Binomial 97.5%  Multinomial mean  Multinomial SD  Multinomial 2.5%  Multinomial 97.5%
loglam0  0.242  0.550  −0.793  1.354  0.080  0.526  −0.933  1.158 
r  0.044  0.046  −0.041  0.140  0.047  0.018  0.009  0.083 
bele1  2.349  0.234  1.903  2.811  2.318  0.234  1.876  2.793 
bele2  −1.221  0.231  −1.660  −0.774  −1.228  0.219  −1.659  −0.813 
bforest  1.293  0.192  0.940  1.702  1.240  0.190  0.878  1.623 
blength  −6.035  2.207  −10.44  −1.824  −4.865  2.208  −9.708  −0.838 
sigma.lam  1.986  0.169  1.686  2.348  1.891  0.155  1.608  2.205 
p0.1999  0.319  0.045  0.229  0.409  0.325  0.024  0.278  0.374 
p0.2000  0.267  0.031  0.209  0.330  0.292  0.021  0.253  0.334 
p0.2001  0.319  0.032  0.258  0.381  0.333  0.021  0.293  0.375 
p0.2002  0.289  0.033  0.229  0.357  0.300  0.021  0.261  0.343 
p0.2003  0.363  0.046  0.276  0.453  0.381  0.021  0.339  0.423 
bday1  −0.172  0.043  −0.255  −0.085  −0.061  0.044  −0.145  0.022 
bday2  0.067  0.029  0.011  0.126  0.056  0.034  −0.009  0.124 
brate  0.213  0.066  0.078  0.340  0.035  0.047  −0.059  0.125 
sigma.p  0.384  0.065  0.257  0.520  0.609  0.050  0.511  0.709 
Results from both analyses concurred well with what we know about these species. The skylark is declining (r < 0) and avoids forest (bforest < 0), while the willow tit is increasing (r > 0) and is a forest bird (bforest > 0; Tables 1 and 2). The skylark was most abundant at lower elevations (Fig. 1c), while the willow tit had an intermediate optimum (Fig. 1d). For both species, abundance showed the expected negative response to increasing inverse route length, although the 95% credible interval for that parameter covered zero in the skylark. For both species, average detection probability varied among years, and the skylark was much more easily detected than the willow tit. Seasonal variation in detection probability was small in both species (Fig. 1e,f). The effect of search effort as measured by survey rate (min km^{−1}) was not quite significant under the multinomial model in the skylark, a frequent singer, but clearly positive in the willow tit, a less frequent and weaker singer.
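The trend estimates are easier to interpret when back-transformed to a proportional change per year. The short sketch below assumes that r is the slope of year on the log scale of expected abundance, as implied by a log-linear trend, and uses the multinomial point estimates quoted in the text; the function names are illustrative.

```python
from math import exp

def annual_change(r):
    """Proportional change per year implied by a log-scale slope r."""
    return exp(r) - 1.0

def cumulative_change(r, n_years):
    """Proportional change between the first and last of n_years years."""
    return exp(r * (n_years - 1)) - 1.0

# Point estimates of r under the multinomial mixture model (from the text).
trends = {"skylark": -0.047, "willow tit": 0.047}

for species, r in trends.items():
    print(
        f"{species}: {100 * annual_change(r):+.1f}% per year, "
        f"{100 * cumulative_change(r, 5):+.1f}% over 1999-2003"
    )
```

This transformation applies to the point estimates only; in a Bayesian analysis the same function would be applied to each posterior draw of r to obtain a credible interval for the percentage change.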
Discussion
We presented a comparison of two new hierarchical modelling frameworks to estimate abundance in metapopulation designs with temporal replicate observations. The multinomial mixture model (Dorazio et al. 2005; Royle et al. 2007; Webster et al. 2008) is a multisite, integrated version of the classical multinomial model widely used for capture–recapture data (e.g. Williams et al. 2002). That is, it requires replicate observations of individually recognizable units, such as individuals or, here, territories. These are expensive data, because individual identification may not be possible under all circumstances or may be costly in time or effort. In contrast, the binomial mixture model (Royle 2004a) is based on the integration over multiple sites and replicate visits of counts without individual identification. These are much cheaper data, as just tallying up all detected individuals on each occasion separately will often be hardly more difficult than simply recording presence/absence (actually, detection/nondetection) in an occupancy study (MacKenzie & Nichols 2004).
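The difference between the two data formats can be illustrated by listing the multinomial cell probabilities of all encounter histories. The sketch below assumes independent detections across occasions with occasion-specific detection probabilities; the function name and the value p = 0.3 are hypothetical.

```python
from itertools import product

def history_probabilities(p_by_occasion):
    """Probability of every encounter history (tuple of 0/1) for one
    individual, assuming independent detection on each occasion."""
    probs = {}
    for history in product((0, 1), repeat=len(p_by_occasion)):
        pr = 1.0
        for detected, p in zip(history, p_by_occasion):
            pr *= p if detected else 1.0 - p
        probs[history] = pr
    return probs

# Hypothetical detection probability of 0.3 on each of three visits.
probs = history_probabilities([0.3, 0.3, 0.3])
p_never = probs[(0, 0, 0)]  # territories that are present but never recorded

for history, pr in sorted(probs.items()):
    print(history, round(pr, 4))
print("probability of being detected at least once:", round(1.0 - p_never, 4))
```

The all-zero history is never observed, so the observable data consist of the frequencies of the remaining histories; the binomial mixture model uses only their column sums, that is, the count on each occasion.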
Both models are well suited for inference about data from metapopulation designs in monitoring and similar studies, where the same observation protocol is applied to an array of spatial replicates (Royle 2004b). By specifying a weak stochastic relationship among the abundance parameters at different sites, they represent a much more flexible and parsimonious way of integrating data across replicate sites than by assuming that they are either all equal or all different (Gelman & Hill 2007). Temporal replicate observations enable one to decompose the observed variation in counts into effects from the unobserved biological process, represented by the abundance parameters, and those from the observation process, represented by the detection parameters.
Notably, we extend most previous applications of the binomial and multinomial mixture model to open populations. This allows us to directly model population trends fully embedded in an estimation framework that accounts for imperfect detection. This would appear to make these models attractive for assessing population change in a changing environment, where not only abundance but also detection in animal or plant populations may change over time.
So which one is the more useful model? Most results of our comparison were as expected: we found concurring estimates that were, however, more precise under the multinomial mixture model. In addition, unpublished simulation results show that mixing of the Markov chains in a Bayesian analysis is greatly improved for the multinomial as compared to the binomial mixture model. This is most likely due to the higher information content of encounter frequency data compared to replicated counts without individual identification. On the other hand, computational costs were about eight times lower for the binomial mixture model (which meant hours instead of days on a fast laptop), and this may well be decisive in the analysis of very large data sets. So one might say that where individual encounter data are available, they are best analysed under the multinomial mixture model unless sample sizes are too large.
There is one further interesting issue, though: for both studied species, abundance estimates under the binomial mixture model tended to be higher than under the multinomial mixture model (although the 95% credible intervals of loglam0 under both models overlapped). This was particularly surprising as in a similar comparison, Webster et al. (2008) found a much greater similarity between the abundance estimates under these two model classes than we did. We believe that the discrepancies in the N estimates between binomial and multinomial mixture model may be due to different effects of territory misclassification.
Table 3 shows how errors in territory identification between replicate visits affect the observed encounter history frequency data. Consequently, when applying a multinomial model to encounter history frequency data, lumping errors will bias abundance estimates low and splitting errors will bias them high. Interestingly, when analysing the same encounter histories aggregated to replicated counts under the binomial mixture model, estimates will be unaffected (Table 3). Individual identification can be very difficult, especially for acoustic detections (Alldredge, Simons & Pollock 2007; Simons et al. 2007). Mistakenly attributing records of two birds from different territories to the same territory may thus account for the observed discrepancy between abundance estimates. It would seem that this was more pronounced for the skylark than for the willow tit: the skylark has a wide-ranging song flight, which may make individuals more difficult to assign to a territory. Territory misclassification might also explain the differences between our results and those of Webster et al. (2008). Their occasions were contiguous 3, 2 and 5 min intervals, so presumably there was much less chance for misclassification.
Table 3. Effects of two types of territory misclassification, lumping and splitting, on estimates of abundance under the binomial mixture model, based on territory counts, and on the multinomial mixture model, based on encounter frequencies. In the example, truth (true abundance) is represented by three territories with encounter histories (0,1,1), (1,0,0) and (0,0,1). The lumping error consists of erroneously assuming that detections in territory 2 and 3 represent a single territory. The splitting error consists of mistakenly assuming that the two detections in territory 1 belong to two different territories. Inference under the binomial mixture model is unaffected by either kind of error

                      Lumping error           Splitting error
                      Truth    Observation    Truth    Observation
Encounter histories   011      011            011      010
                      100      101            100      001
                      001                     001      100
                                                       001
Territory counts      112      112            112      112
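The example in Table 3 can be reproduced in a few lines. The sketch below encodes the true, lumped and split encounter histories from the table and aggregates them to per-occasion territory counts; the function and variable names are illustrative.

```python
from collections import Counter

def occasion_counts(histories):
    """Number of territories detected on each occasion (column sums)."""
    n_occasions = len(histories[0])
    return tuple(sum(h[t] for h in histories) for t in range(n_occasions))

# Encounter histories from Table 3.
truth = [(0, 1, 1), (1, 0, 0), (0, 0, 1)]
lumped = [(0, 1, 1), (1, 0, 1)]                      # territories 2 and 3 merged
split = [(0, 1, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1)]  # territory 1 divided in two

for label, data in [("truth", truth), ("lumping", lumped), ("splitting", split)]:
    print(
        label,
        "| territories seen:", len(data),
        "| history frequencies:", dict(Counter(data)),
        "| counts per occasion:", occasion_counts(data),
    )
```

The counts per occasion are identical in all three cases, which is the entry written as 112 in the table, whereas the number of distinct territories and the encounter history frequencies change. This is why misclassification affects the multinomial but not the binomial mixture model.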
As one would expect for a GLM-based class of models (McCullagh & Nelder 1989), our modelling framework is readily extended. First, as an alternative to an overdispersed Poisson distribution (i.e. the lognormal Poisson we assumed for abundance), other distributions could be used to specify the unstructured variation in the latent state, N_{i}, across quadrats i, such as the negative binomial or a zero-inflated Poisson distribution. Secondly, spatial correlation among site random effects may be added (Royle et al. 2007; Webster et al. 2008). Thirdly, one could also employ nonparametric modelling of the abundance distribution, e.g. by use of Dirichlet process priors (Dorazio et al. 2008) or by adding smooth terms as in a generalised additive model (GAM) (Wood 2006). Finally, truly individual effects at the level of the individual territory could be introduced if each encounter history is modelled individually. In the context of the Swiss bird survey MHB, territory- and survey-specific covariates that appear useful include time of day (for fine-scale modelling of the temporal patterns in detection probability within a morning), or territory-specific covariates such as the proximity to a road or another noise source (river, torrent, stream) to account for habitat-specific detection. As a special case of an individual effect, the coordinates of each detection could be formally integrated into an analysis for a spatial capture–recapture model (Efford 2004; Borchers & Efford 2008; Royle & Young 2008). In conclusion, where temporal replicate observations are available for at least part of a data set produced in a metapopulation design, we believe that the hierarchical models presented here offer an extremely powerful framework for inference about population dynamics free from the possibly distorting effects of imperfect detection.
We would hope that this ability makes a strong argument for obtaining replicate observations in at least a subsample in ecological or applied studies that employ metapopulation designs.
Supporting Information
Appendix S1. BUGS model description for the binomial mixture model with Normal random effects in the linear predictors of both abundance and detection
Appendix S2. BUGS model description for the multinomial mixture model with Normal random effects in the linear predictors of both abundance and detection. This code allows survey number to vary between 2 and 3
Appendix S3. How to treat unbalanced data for the multinomial mixture model
Appendix S4. Details on the analysis by Markov chain Monte Carlo