Volume 28, Issue 2
Research Article

Identifying interacting SNPs using Monte Carlo logic regression

Charles Kooperberg

Corresponding Author

E-mail address: clk@fhcrc.org

Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington

Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, PO Box 19024, Seattle, WA 98109‐1024
Ingo Ruczinski

Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland

First published: 05 November 2004
Citations: 138

Abstract

Interactions are frequently at the center of interest in single‐nucleotide polymorphism (SNP) association studies. When interacting SNPs are in the same gene or in genes that are close in sequence, such interactions may suggest which haplotypes are associated with a disease. Interactions between unrelated SNPs may suggest genetic pathways. Unfortunately, data sets are often still too small to definitively determine whether interactions between SNPs occur. Also, competing sets of interactions could often be of equal interest. Here we propose Monte Carlo logic regression, an exploratory tool that combines Markov chain Monte Carlo and logic regression, an adaptive regression methodology that attempts to construct predictors as Boolean combinations of binary covariates such as SNPs. The goal of Monte Carlo logic regression is to generate a collection of (interactions of) SNPs that may be associated with a disease outcome, and that warrant further investigation. As such, the models that are fitted in the Markov chain are not combined into a single model, as is often done in Bayesian model averaging procedures. Instead, the most frequently occurring patterns in these models are tabulated. The method is applied to a study of heart disease with 779 participants and 89 SNPs. A simulation study is carried out to investigate the performance of the Monte Carlo logic regression approach. Genet. Epidemiol. 28:157–170, 2005. © 2004 Wiley‐Liss, Inc.
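The idea in the abstract of running a Markov chain over Boolean models and tabulating which SNP patterns recur, rather than averaging the models, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' algorithm: the models are restricted to a single AND/OR of two SNPs, the score is a smoothed Bernoulli log-likelihood, the sampler is a plain Metropolis search with a uniform prior, and the data are simulated so that SNP 0 AND SNP 1 raises disease risk. The names `simulate`, `log_lik`, `propose`, and `run_chain` are hypothetical.

```python
import math
import random
from collections import Counter


def simulate(n, p, rng):
    """Toy data (assumed): p binary SNP indicators; risk is 0.8 if SNP0 AND SNP1, else 0.2."""
    X = [[rng.random() < 0.5 for _ in range(p)] for _ in range(n)]
    y = [rng.random() < (0.8 if row[0] and row[1] else 0.2) for row in X]
    return X, y


def predictor(model, row):
    """Evaluate a two-SNP Boolean model (i, op, j) on one participant."""
    i, op, j = model
    return (row[i] and row[j]) if op == "and" else (row[i] or row[j])


def log_lik(model, X, y):
    """Bernoulli log-likelihood with smoothed event rates in the two predictor groups."""
    cases = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for row, yi in zip(X, y):
        g = predictor(model, row)
        total[g] += 1
        cases[g] += yi
    ll = 0.0
    for g in (True, False):
        r = (cases[g] + 0.5) / (total[g] + 1.0)  # smoothing keeps r strictly in (0, 1)
        ll += cases[g] * math.log(r) + (total[g] - cases[g]) * math.log(1.0 - r)
    return ll


def propose(model, p, rng):
    """Symmetric proposal: redraw one of the two SNPs, or flip the operator."""
    i, op, j = model
    move = rng.randrange(3)
    if move == 0:
        i = rng.choice([k for k in range(p) if k != j])
    elif move == 1:
        j = rng.choice([k for k in range(p) if k != i])
    else:
        op = "or" if op == "and" else "and"
    return (i, op, j)


def run_chain(X, y, n_iter, rng):
    """Metropolis search over models; tabulate SNPs and SNP pairs in the visited models."""
    p = len(X[0])
    i, j = rng.sample(range(p), 2)  # random starting model
    model = (i, "and", j)
    ll = log_lik(model, X, y)
    snp_counts, pair_counts = Counter(), Counter()
    for _ in range(n_iter):
        cand = propose(model, p, rng)
        cand_ll = log_lik(cand, X, y)
        # Accept with probability min(1, likelihood ratio); the prior is uniform.
        if rng.random() < math.exp(min(0.0, cand_ll - ll)):
            model, ll = cand, cand_ll
        a, b = sorted((model[0], model[2]))
        snp_counts.update((a, b))
        pair_counts[(a, b)] += 1
    return snp_counts, pair_counts


rng = random.Random(0)
X, y = simulate(400, 6, rng)
snp_counts, pair_counts = run_chain(X, y, 2000, rng)
print(pair_counts.most_common(3))
print(snp_counts.most_common(3))
```

The output of such a run is a table of how often each SNP and each SNP pair appeared in the sampled models, which mirrors the abstract's point that frequently occurring patterns are reported instead of a single combined model.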
