Regression modelling of correlated data in ecology: subject-specific and population averaged response patterns
Article first published online: 31 JUL 2009
© 2009 The Authors. Journal compilation © 2009 British Ecological Society
Journal of Applied Ecology
Volume 46, Issue 5, pages 1018–1025, October 2009
How to Cite
Fieberg, J., Rieger, R. H., Zicus, M. C. and Schildcrout, J. S. (2009), Regression modelling of correlated data in ecology: subject-specific and population averaged response patterns. Journal of Applied Ecology, 46: 1018–1025. doi: 10.1111/j.1365-2664.2009.01692.x
- Issue published online: 1 OCT 2009
- Received 21 May 2009; accepted 1 July 2009. Handling Editor: Paul Lukacs
Keywords
- conditional model;
- generalized estimating equations;
- generalized linear-mixed models;
- marginal model;
- mixed effects;
- random effects;
- sandwich estimators
1. Statistical methods that assume independence among observations result in optimistic estimates of uncertainty when applied to correlated data, which are ubiquitous in applied ecological research. Mixed effects models offer a potential solution and rely on the assumption that latent or unobserved characteristics of individuals (i.e. random effects) induce correlation among repeated measurements. However, careful consideration must be given to the interpretation of parameters when using a nonlinear link function (e.g. logit). Mixed model regression parameters reflect the change in the expected response within an individual associated with a change in that individual’s covariates [i.e. a subject-specific (SS) interpretation], which may not address a relevant scientific question. In particular, an SS interpretation is not natural for covariates that do not vary within individuals (e.g. gender).
2. An alternative approach combines the solution to an unbiased estimating equation with robust measures of uncertainty to make inferences regarding predictor–outcome relationships. Regression parameters describe changes in the average response among groups of individuals differing in their covariates [i.e. a population-averaged (PA) interpretation].
3. We compare these two approaches [mixed models and generalized estimating equations (GEE)] with illustrative examples from a 3-year study of mallard (Anas platyrhynchos) nest structures. We observe that PA and SS responses differ when modelling binary data, with PA parameters behaving like attenuated versions of SS parameters. Differences between SS and PA parameters increase with the size of among-subject heterogeneity captured by the random effects variance component. Lastly, we illustrate how PA inferences can be derived (post hoc) from fitted generalized and nonlinear-mixed models.
4. Synthesis and applications. Mixed effects models and GEE offer two viable approaches to modelling correlated data. The preferred method should depend primarily on the research question (i.e. desired parameter interpretation), although operating characteristics of the associated estimation procedures should also be considered. Many applied questions in ecology, wildlife management and conservation biology (including the current illustrative examples) focus on population performance measures (e.g. mean survival or nest success rates) as a function of general landscape features, for which the PA model interpretation, rather than the more commonly used SS model interpretation, may be more natural.
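The attenuation of PA parameters relative to SS parameters described in point 3 can be illustrated numerically. The sketch below is not the authors' code. It assumes a random-intercept logistic model, logit P(Y = 1 | b) = beta0 + beta1 * x + b with b ~ N(0, sigma^2), and obtains PA probabilities post hoc by averaging the SS probabilities over the random-effect distribution with a simple trapezoid rule. The function names (`marginal_prob`, `pa_slope`) and the parameter values are hypothetical choices made for illustration only.

```python
import math

def expit(x):
    """Inverse logit, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def marginal_prob(eta, sigma, n_grid=2001, width=8.0):
    """PA probability: average expit(eta + b) over b ~ N(0, sigma^2).

    Uses the trapezoid rule on [-width * sigma, width * sigma].
    """
    if sigma == 0.0:
        return expit(eta)  # no heterogeneity: SS and PA coincide
    lo = -width * sigma
    step = 2.0 * width * sigma / (n_grid - 1)
    norm_const = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n_grid):
        b = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * (b / sigma) ** 2) / norm_const
        total += weight * expit(eta + b) * density
    return total * step

def pa_slope(beta0, beta1, sigma):
    """PA log-odds ratio comparing x = 1 with x = 0."""
    p0 = marginal_prob(beta0, sigma)
    p1 = marginal_prob(beta0 + beta1, sigma)
    return logit(p1) - logit(p0)

# Hypothetical SS parameters; sigma is the random-intercept standard deviation.
beta0, beta1 = -0.5, 1.0
for sigma in (0.0, 1.0, 2.0, 3.0):
    print(f"sigma={sigma:.1f}  SS slope={beta1:.2f}  "
          f"PA slope={pa_slope(beta0, beta1, sigma):.3f}")
```

Under these assumptions the PA slope equals the SS slope when sigma is zero and shrinks towards zero as sigma grows, which is the pattern summarised in point 3. A fitted mixed model would supply estimates of beta0, beta1 and sigma in place of the illustrative values used here.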