A Latent Variable Approach to Study Gene–Environment Interactions in the Presence of Multiple Correlated Exposures
Article first published online: 28 SEP 2011
© 2011, The International Biometric Society
Volume 68, Issue 2, pages 466–476, June 2012
How to Cite
Sánchez, B. N., Kang, S. and Mukherjee, B. (2012), A Latent Variable Approach to Study Gene–Environment Interactions in the Presence of Multiple Correlated Exposures. Biometrics, 68: 466–476. doi: 10.1111/j.1541-0420.2011.01677.x
- Issue published online: 26 JUN 2012
- Received December 2010. Revised May 2011. Accepted July 2011.
Keywords
- Gene–environment independence;
- Principal components;
- Shrinkage estimation;
- Structural equation models
Summary
Many existing cohort studies initially designed to investigate disease risk as a function of environmental exposures have collected genomic data in recent years with the objective of testing for gene–environment interaction (G × E) effects. In environmental epidemiology, interest in G × E arises primarily after a significant effect of the environmental exposure has been documented. Cohort studies often collect rich exposure data; as a result, assessing G × E effects in the presence of multiple exposure markers further increases the burden of multiple testing, an issue already present in both genetic and environmental health studies. Latent variable (LV) models have been used in environmental epidemiology to reduce the dimensionality of the exposure data, gain power by reducing multiplicity issues via condensing exposure data, and avoid collinearity problems due to the presence of multiple correlated exposures. We extend the LV framework to characterize gene–environment interaction in the presence of multiple correlated exposures and genotype categories. Further, similar to what has been done in case–control G × E studies, we use the assumption of gene–environment (G-E) independence to boost the power of tests for interaction. Neither the consequences of making this assumption nor the issue of how to explicitly model G-E association has been previously investigated in LV models. We postulate a hierarchy of assumptions about the LV model regarding the different forms of G-E dependence and show that making such assumptions may influence inferential results on the G, E, and G × E parameters. We implement a class of shrinkage estimators that trade off, in a data-adaptive manner, between the most restrictive and the most flexible forms of the G-E dependence assumption, and note that such a class of compromise estimators can serve as a benchmark of model adequacy in LV models.
We demonstrate the methods with an example from the Early Life Exposures in Mexico City to Neuro-Toxicants Study of lead exposure, iron metabolism genes, and birth weight.
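
The compromise idea behind a shrinkage estimator can be illustrated with a small sketch. This is not the estimator developed in the article; it is a generic weighted average, assumed here for illustration only, that combines an estimate obtained under a restrictive assumption (for example, G-E independence) with one obtained under a flexible model. The function name `shrinkage_estimate` and its arguments are hypothetical.

```python
def shrinkage_estimate(beta_flexible, beta_restricted, var_flexible):
    """Generic data-adaptive compromise between two estimates (illustrative only).

    beta_flexible   : estimate from the more flexible model (assumed less biased)
    beta_restricted : estimate from the more restrictive model (assumed more precise)
    var_flexible    : estimated variance of beta_flexible

    The weight on the restricted estimate is large when the two estimates
    agree relative to the noise in the flexible one, and moves toward zero
    as they diverge.
    """
    if var_flexible < 0:
        raise ValueError("var_flexible must be non-negative")

    diff_sq = (beta_flexible - beta_restricted) ** 2
    denom = var_flexible + diff_sq
    if denom == 0:
        # Both estimates coincide and there is no variance: nothing to shrink.
        return beta_flexible

    weight_restricted = var_flexible / denom
    return (weight_restricted * beta_restricted
            + (1.0 - weight_restricted) * beta_flexible)


if __name__ == "__main__":
    # Estimates that agree closely: result stays near the restricted value.
    print(shrinkage_estimate(0.52, 0.50, 0.04))
    # Estimates that disagree strongly: result stays near the flexible value.
    print(shrinkage_estimate(1.50, 0.50, 0.04))
```

When the two estimates are close, the result leans on the restricted estimate and borrows its precision; when they diverge, it falls back toward the flexible estimate, which is the sense in which such a weighted average adapts to the data.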