Bayesian inference of gene–environment interaction from incomplete data: What happens when information on environment is disjoint from data on gene and disease?

Authors

  • Paul Gustafson,

    Corresponding author
    1. Department of Statistics, University of British Columbia, 333-6356 Agricultural Road, Vancouver, BC, Canada V6T 1Z2
  • Igor Burstyn

    1. Community and Occupational Medicine Program, Department of Medicine, Faculty of Medicine and Dentistry, The University of Alberta, 13-103E Clinical Sciences Building, Edmonton, Alberta, Canada T6G 2G3
    2. Department of Environmental and Occupational Health, School of Public Health, Drexel University, 1505 Race Street, Room 1332, Philadelphia, PA 19102, U.S.A.

Abstract

Inference in gene–environment studies can sometimes exploit the assumption of Mendelian randomization, namely that genotype and environmental exposure are independent in the population under study. Moreover, in some such problems it is reasonable to assume that the disease risk for subjects without environmental exposure does not vary with genotype. When both assumptions can be invoked, we consider the prospects for inferring the dependence of disease risk on genotype and environmental exposure (and particularly the extent of any gene–environment interaction) without detailed data on environmental exposure. The data structure envisioned involves data on disease and genotype jointly, but only external information about the distribution of the environmental exposure in the population. This is relevant because, for many environmental exposures, individual-level measurements are costly and/or highly error-prone. Working in the setting where all relevant variables are binary, we examine the extent to which such data are informative about the interaction, by determining the large-sample limit of the posterior distribution. The ideas are illustrated using data from a case–control study for bladder cancer involving smoking behaviour and the NAT2 genotype. Copyright © 2011 John Wiley & Sons, Ltd.
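
To make the data structure concrete, the following is a minimal sketch of the all-binary setting, not the authors' Bayesian analysis. It assumes a simple prospective (cohort-style) view rather than case–control sampling, and all function and parameter names are hypothetical. Under the two assumptions in the abstract, the risk given genotype is a mixture over the unobserved exposure, so the difference in risk between genotypes equals the exposure prevalence times the genotype effect among the exposed. That contrast is therefore recoverable from genotype–disease data plus external knowledge of the exposure prevalence, whereas the individual risks are not.

```python
def marginal_risks(p_exposed, r_unexposed, r_exposed_g0, r_exposed_g1):
    """Return (P(Y=1 | G=0), P(Y=1 | G=1)) when exposure E is unobserved.

    Assumptions (taken from the abstract, in a simplified cohort setting):
      * G and E are independent, so P(E=1 | G=g) = p_exposed for both g.
      * Among the unexposed, risk does not depend on genotype (r_unexposed).
    """
    q0 = (1 - p_exposed) * r_unexposed + p_exposed * r_exposed_g0
    q1 = (1 - p_exposed) * r_unexposed + p_exposed * r_exposed_g1
    return q0, q1


def exposed_risk_difference(q0, q1, p_exposed):
    """Recover r_exposed_g1 - r_exposed_g0 from the genotype-specific risks.

    Because the unexposed term cancels, q1 - q0 = p_exposed * (difference).
    """
    if not 0 < p_exposed <= 1:
        raise ValueError("p_exposed must lie in (0, 1]")
    return (q1 - q0) / p_exposed


# Two hypothetical parameter sets that differ in every individual risk
# but produce the same observable genotype-specific risks.
q_a = marginal_risks(0.3, 0.010, 0.020, 0.050)
q_b = marginal_risks(0.3, 0.005, 0.013 / 0.3 - 0.7 * 0.005 / 0.3,
                     0.022 / 0.3 - 0.7 * 0.005 / 0.3)

print(q_a, q_b)
print(exposed_risk_difference(*q_a, 0.3))
```

The two parameter sets illustrate why the data are only partly informative in this simplified view: they are indistinguishable from genotype and disease data alone, yet both imply the same genotype effect among the exposed. Contrasts on other scales, such as a ratio of risks among the exposed, are not pinned down in the same way.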
