Combining information in hierarchical models improves inferences in population ecology and demographic population analyses



Michael Schaub, Swiss Ornithological Institute, 6204 Sempach, Switzerland



Read the Feature Paper: Bayesian shared frailty models for regional inference about wildlife survival

Other Commentaries on this paper: Bayesian shared frailty models for regional inference about wildlife survival; ‘Each site has its own survival probability, but information is borrowed across sites to tell us about survival in each site’: random effects models as means of borrowing strength in survival studies of wild vertebrates

Response from the authors: ‘Exciting statistics’: the rapid development and promising future of hierarchical models for population ecology

Statistical models that include random effects are becoming increasingly common in population ecology. Random effects are two or more effects of some grouping factor that belong together, in the sense that we can imagine they were generated by a common stochastic process. Thus, they are similar, but not identical, and therefore assumed to be exchangeable. Models with random effects describe this common stochastic process by so-called prior distributions with estimable hyperparameters, for instance the mean and the variance for normally distributed random effects. Models with random effects have intrinsically more than one level; therefore, they are often called hierarchical models (Royle & Dorazio, 2008) or multilevel models (Gelman & Hill, 2007). Interest in hierarchical models often focuses on the hyperparameters, but the realizations from the process described by the prior distributions, that is, the random effects, can also be estimated. The main difference from treating a set of effects as fixed is that in a random effects model, the parameters of a grouping factor are no longer estimated independently. Rather, the assumption that they come from a common prior distribution induces a dependence, which means that the estimate of each effect is somewhat influenced by the estimates of all other effects comprising that factor. This is often called borrowing strength from the ensemble. One consequence of treating a factor as random is that its effect estimates are pulled in towards the overall mean, which is called shrinkage in the literature. If the assumption of a common stochastic process is reasonable, borrowing strength and shrinkage typically lead to better estimates compared with their fixed effects counterparts (Gelman, 2005; chapter 4 in Kéry & Schaub, 2012).
Hierarchical models are not at all intrinsically Bayesian; rather, they can be analysed using likelihood or Bayesian methods (Royle & Dorazio, 2008), though Bayesian analysis is often easier, especially for ecologists using MCMC engines like WinBUGS (Lunn et al., 2000) or JAGS (Plummer, 2003). One particularly interesting use of hierarchical models is as a formal way of combining information, for instance, coming from different studies.
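The shrinkage described above can be illustrated with a minimal Python sketch. It assumes normally distributed random effects and treats the hyperparameters (mean `mu` and standard deviation `tau`) as known, which a real analysis would not do; the function name `shrink` and all numbers are hypothetical and serve illustration only.

```python
def shrink(estimate, se, mu, tau):
    """Posterior mean of one group effect under a normal-normal model.

    estimate, se: independent (fixed effects) estimate and its standard error.
    mu, tau: mean and standard deviation of the common distribution
             (hyperparameters, assumed known here for illustration).
    """
    w_data = 1.0 / se**2     # precision of the group's own estimate
    w_prior = 1.0 / tau**2   # precision contributed by the common distribution
    # Precision-weighted average of the group's estimate and the overall mean
    return (w_data * estimate + w_prior * mu) / (w_data + w_prior)


# Hypothetical group estimates on an unbounded scale (e.g. logit survival)
estimates = [1.8, 0.2, 1.0]
std_errors = [0.9, 0.3, 0.1]
mu, tau = 1.0, 0.5  # assumed hyperparameters, not estimated from data

shrunk = [shrink(y, s, mu, tau) for y, s in zip(estimates, std_errors)]
for y, s, z in zip(estimates, std_errors, shrunk):
    print(f"independent={y:.2f}  se={s:.2f}  shrunk={z:.3f}")
```

In this sketch, each shrunken estimate lies between the independent estimate and the overall mean, and the estimate with the largest standard error is pulled proportionally furthest towards the mean.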

A good example is the feature paper by Halstead et al. (2012), where estimates of survival probabilities of a snake species from radio-tracking data at multiple study sites in the same region are combined in a hierarchical model that is fitted using the WinBUGS software. Sample size at each site was fairly low. Hence, Halstead and colleagues treated each study site as a replicate and estimated the hyperparameters of the distribution that describes the site-specific parameters, which were treated as random effects. Hierarchical modelling thus provides a formal way of combining this information, where study sites with more information (e.g. more snakes or more precise estimates) get more weight in contributing to the estimates of the overall mean and the among-site variability (the mean and the variance hyperparameters). Site-specific estimates are shrunken towards the mean, where the degree of shrinkage depends on the precision of the site-specific estimates. This is a desirable property, as it avoids possible overinterpretation of patterns that may be due to some idiosyncrasy of a small sample. Borrowing strength typically also helps against boundary estimates, which are frequent in survival analyses with small sample sizes. Thus, not only were Halstead and colleagues able to obtain a formal estimate of the regional mean of the parameters of interest, but arguably they also obtained better site-specific estimates owing to the sharing of information among all sites. In a sense, what they did is a meta-analysis. A further advantage of combining data from several studies is increased precision of the site-specific estimates, because they borrow strength from the entire dataset. Increased precision means more power to detect an effect, which is especially advantageous when dealing with rare species, where sample size is typically small to very small. A similar approach was chosen by Papadatou et al. (2012), who estimated survival probabilities of several bat species (i.e. where species was the grouping factor), based on capture-recapture data sampled at different locations.
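How borrowing strength guards against boundary estimates can be sketched with a deliberately simplified example. It is not the shared frailty model of Halstead et al. (2012): it assumes simple binomial survival data per site and a common Beta(a, b) distribution for site survival whose hyperparameters are fixed by hand rather than estimated. Site labels, counts and hyperparameter values are all hypothetical.

```python
# Hypothetical data: (number of animals surviving, number tracked) per site
sites = {"A": (3, 3), "B": (18, 30), "C": (0, 2), "D": (45, 60)}

# Assumed hyperparameters of a common Beta(a, b) distribution of site survival.
# In a real hierarchical analysis these would be estimated from all sites jointly.
a, b = 6.0, 3.0  # implies an overall mean survival of a / (a + b)


def independent_estimate(survived, n):
    """Site-specific estimate that ignores all other sites."""
    return survived / n


def pooled_estimate(survived, n, a, b):
    """Posterior mean for binomial data under a Beta(a, b) distribution."""
    return (survived + a) / (n + a + b)


independent = {s: independent_estimate(y, n) for s, (y, n) in sites.items()}
pooled = {s: pooled_estimate(y, n, a, b) for s, (y, n) in sites.items()}

for s, (y, n) in sites.items():
    print(f"site {s}: n={n:2d}  independent={independent[s]:.3f}  pooled={pooled[s]:.3f}")
```

Under these assumptions, site A (3 of 3 surviving) has an independent estimate on the boundary at 1.0, whereas its pooled estimate is 0.75. The well-sampled site D barely changes, which reflects that the degree of shrinkage depends on the amount of information at each site.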

The combination of information is an exciting topic in ecological statistics with important benefits. Typically, it enables one to estimate parameters more precisely and to estimate some parameters that may not be estimable otherwise (Abadi et al., 2010), and it represents a statistical synthesis of all data available on some phenomenon, that is, a meta-analysis (Gurevitch, Curtis & Jones, 2001). Often, information is combined where each piece is informative on the exact same ecological quantity, such as a survival probability. So-called integrated population models (Besbeas et al., 2002; Abadi et al., 2010) represent this case, where both a time-series of counts and a ring-recovery or capture-recapture dataset contain information about survival. Such datasets are combined in a deterministic way, by simply specifying the same name for a parameter in the statistical description of each dataset. In other cases, such as in the featured study by Halstead et al. (2012), information is combined that is not identical, but ‘similar’, in the random-effects sense discussed previously. Both types of synthesis of information have a huge scope in population ecology and related disciplines such as conservation and wildlife management.
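The deterministic kind of combination, where the same parameter appears in the description of each dataset, can be sketched as a joint likelihood. The sketch below is far simpler than an integrated population model: it assumes two independent binomial samples that both inform one survival probability `phi`, and it maximizes the likelihood over a grid. The data and all names are hypothetical.

```python
import math


def binom_loglik(phi, survived, n):
    """Binomial log-likelihood for survival probability phi (constant dropped)."""
    return survived * math.log(phi) + (n - survived) * math.log(1.0 - phi)


# Hypothetical datasets, both informative about the same survival probability
data_a = (14, 20)  # (survived, total) in a first sample
data_b = (33, 50)  # (survived, total) in a second, independent sample


def joint_loglik(phi):
    # The same phi enters both components; this is what sharing a parameter means
    return binom_loglik(phi, *data_a) + binom_loglik(phi, *data_b)


grid = [i / 1000 for i in range(1, 1000)]  # phi in (0, 1), boundaries excluded
phi_a = max(grid, key=lambda p: binom_loglik(p, *data_a))
phi_b = max(grid, key=lambda p: binom_loglik(p, *data_b))
phi_joint = max(grid, key=joint_loglik)

print(f"dataset A only: {phi_a:.3f}")
print(f"dataset B only: {phi_b:.3f}")
print(f"joint analysis: {phi_joint:.3f}")
```

In real integrated population models the component datasets are of different types (for instance counts and capture-recapture data), so each has its own likelihood; the principle of multiplying likelihoods that share a parameter is the same under the usual assumption of independent datasets.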

As with hierarchical models in general, such synthetic models can be fitted both in a classical and in a Bayesian framework. However, with the advent of automatic Bayesian inference engines such as WinBUGS and JAGS, we believe it is fair to say that for most ecologists, a Bayesian analysis of such usually highly non-standard models is far more accessible than a likelihood analysis, which would require much more statistical and computational knowledge. WinBUGS and JAGS offer a wonderfully simple, conditional way of specifying even complicated hierarchical models in a clear model definition language, the BUGS language (Lunn et al., 2000). We believe that BUGS and JAGS have the potential to really free the modeller in many population ecologists (Kéry, 2010). We expect to see many more applications of reasonably non-standard hierarchical models in population ecology in the future.