Bayes computation for ecological inference

Authors

  • Jon Wakefield (corresponding author)

    1. Department of Statistics, University of Washington, Seattle, WA, U.S.A.
    2. Department of Biostatistics, University of Washington, Seattle, WA, U.S.A.
  • Sebastien Haneuse

    1. Department of Biostatistics, Harvard School of Public Health, Boston, MA, U.S.A.
  • Adrian Dobra

    1. Department of Statistics, University of Washington, Seattle, WA, U.S.A.
    2. Department of Biobehavioral Nursing and Health Systems and Center for Statistics and the Social Sciences, University of Washington, Seattle, WA, U.S.A.
    3. Center for Statistics and the Social Sciences, University of Washington, Seattle, WA, U.S.A.
  • Elizabeth Teeple

    1. Biostatistics Unit, Group Health Research Institute, Seattle, WA, U.S.A.

Abstract

Ecological data are available at the level of the group, rather than at the level of the individual. The use of ecological data in spatial epidemiological investigations is particularly common. Although the computational methods described are more generally applicable, this paper concentrates on the situation in which the margins of 2 × 2 tables are observed in each of n geographical areas, with a Bayesian approach to inference. We consider auxiliary schemes that impute the missing data, and compare with a previously suggested normal approximation. The analysis of ecological data is subject to ecological bias, with the only reliable means of removing such bias being the addition of auxiliary individual-level information. Various schemes have been suggested for this supplementation, and we illustrate how the computational methods may be applied to the analysis of such enhanced data. The methods are illustrated using simulated data and two examples. In the first example, the ecological data are supplemented with a simple random sample of individual-level data, and in this example the normal approximation fails. In the second example case–control sampling provides the additional information. Copyright © 2011 John Wiley & Sons, Ltd.
