Volume 34, Issue 30
Research Article

A penalized robust semiparametric approach for gene–environment interactions

Cen Wu

Department of Biostatistics, School of Public Health, Yale University, 60 College Street, New Haven, CT, 06520 U.S.A.

Department of Statistics, Kansas State University, 1116 Mid‐Campus Drive N., Manhattan, KS, 66506 U.S.A.

Xingjie Shi

Department of Statistics, Nanjing University of Finance and Economics, Nanjing, China

Yuehua Cui

Department of Statistics and Probability, Michigan State University, 619 Red Cedar Rd, East Lansing, MI, 48824 U.S.A.

Shuangge Ma

Corresponding Author

Department of Biostatistics, School of Public Health, Yale University, 60 College Street, New Haven, CT, 06520 U.S.A.

VA Cooperative Studies Program Coordinating Center, West Haven, CT, 06516 U.S.A.

Correspondence to: Shuangge Ma, Department of Biostatistics, School of Public Health, Yale University, 60 College Street, New Haven, CT 06520, U.S.A.

E‐mail: Shuangge.ma@yale.edu

First published: 03 August 2015
Citations: 12

Abstract

In genetic and genomic studies, gene‐environment (G×E) interactions have important implications. Some existing G×E interaction methods are limited in that they analyze only a small number of G factors at a time, assume linear effects of E factors, assume no data contamination, and adopt ineffective selection techniques. In this study, we propose a new approach for identifying important G×E interactions. It jointly models the effects of all E and G factors and their interactions. A partially linear varying coefficient model is adopted to accommodate possible nonlinear effects of E factors. A rank‐based loss function is used to accommodate possible data contamination. Penalization, which has been extensively used with high‐dimensional data, is adopted for selection. The proposed penalized estimation approach can automatically determine whether a G factor has an interaction with an E factor, a main effect but no interaction, or no effect at all. The proposed approach can be effectively realized using a coordinate descent algorithm. Simulation shows that it has satisfactory performance and outperforms several competing alternatives. The proposed approach is used to analyze a lung cancer study with gene expression measurements and clinical variables. Copyright © 2015 John Wiley & Sons, Ltd.
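
As a rough illustration of why a rank‐based loss can tolerate data contamination, the sketch below compares a Wilcoxon‐type pairwise‐difference loss with the usual squared‐error loss on a small set of residuals. The abstract does not state the exact form of the loss, so the functional form, the function names `rank_loss` and `squared_loss`, and the toy residuals are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def rank_loss(residuals):
    """Wilcoxon-type loss: mean absolute pairwise difference of residuals.

    Assumed form for illustration only; the abstract does not specify it.
    """
    r = np.asarray(residuals, dtype=float)
    n = r.size
    # |r_i - r_j| for all ordered pairs; the diagonal contributes zero.
    diffs = np.abs(r[:, None] - r[None, :])
    return diffs.sum() / (n * (n - 1))


def squared_loss(residuals):
    """Ordinary mean squared residual, for comparison."""
    r = np.asarray(residuals, dtype=float)
    return np.mean(r ** 2)


# Hypothetical residuals from a well-fitting model.
clean = np.array([0.1, -0.2, 0.05, 0.0, -0.1])

# Same residuals with one grossly contaminated observation.
contaminated = clean.copy()
contaminated[0] = 50.0

# How much each loss is inflated by the single outlier.
rank_ratio = rank_loss(contaminated) / rank_loss(clean)
sq_ratio = squared_loss(contaminated) / squared_loss(clean)

print(f"rank-based loss inflated by a factor of {rank_ratio:.1f}")
print(f"squared loss inflated by a factor of {sq_ratio:.1f}")
```

In this toy setting the pairwise‐difference loss grows linearly in the size of the outlier, whereas the squared loss grows quadratically, so a single contaminated observation has much less influence on the rank‐based criterion.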

Number of times cited according to CrossRef: 12

  • Model identification and selection for single-index varying-coefficient models, Annals of the Institute of Statistical Mathematics, 10.1007/s10463-020-00757-0, (2020).
  • Semiparametric Bayesian variable selection for gene‐environment interactions, Statistics in Medicine, 10.1002/sim.8434, 39, 5, (617-638), (2019).
  • Robust network‐based regularization and variable selection for high‐dimensional genomic data in cancer prognosis, Genetic Epidemiology, 10.1002/gepi.22194, 43, 3, (276-291), (2019).
  • Using simulation studies to evaluate statistical methods, Statistics in Medicine, 10.1002/sim.8086, 38, 11, (2074-2102), (2019).
  • Penalized integrative semiparametric interaction analysis for multiple genetic datasets, Statistics in Medicine, 10.1002/sim.8172, 38, 17, (3221-3242), (2019).
  • Robust semiparametric gene‐environment interaction analysis using sparse boosting, Statistics in Medicine, 10.1002/sim.8322, 38, 23, (4625-4641), (2019).
  • A Selective Review of Multi-Level Omics Data Integration Using Variable Selection, High-Throughput, 10.3390/ht8010004, 8, 1, (4), (2019).
  • Penalized Variable Selection for Lipid–Environment Interactions in a Longitudinal Lipidomics Study, Genes, 10.3390/genes10121002, 10, 12, (1002), (2019).
  • Robust gene–environment interaction analysis using penalized trimmed regression, Journal of Statistical Computation and Simulation, 10.1080/00949655.2018.1523411, 88, 18, (3502-3528), (2018).
  • Robust genetic interaction analysis, Briefings in Bioinformatics, 10.1093/bib/bby033, (2018).
  • Dissecting gene‐environment interactions: A penalized robust approach accounting for hierarchical structures, Statistics in Medicine, 10.1002/sim.7518, 37, 3, (437-456), (2017).
  • Semivarying coefficient least-squares support vector regression for analyzing high-dimensional gene-environmental data, Journal of Applied Statistics, 10.1080/02664763.2017.1371676, 45, 8, (1370-1381), (2017).
