These authors contributed equally to this work.
Encore: Genetic Association Interaction Network Centrality Pipeline and Application to SLE Exome Data
Article first published online: 5 JUN 2013
© 2013 WILEY PERIODICALS, INC.
Volume 37, Issue 6, pages 614–621, September 2013
How to Cite
Davis, N. A., Lareau, C. A., White, B. C., Pandey, A., Wiley, G., Montgomery, C. G., Gaffney, P. M. and McKinney, B. A. (2013), Encore: Genetic Association Interaction Network Centrality Pipeline and Application to SLE Exome Data. Genet. Epidemiol., 37: 614–621. doi: 10.1002/gepi.21739
Publication History
- Issue published online: 11 AUG 2013
- Manuscript Accepted: 30 APR 2013
- Manuscript Revised: 19 MAR 2013
- Manuscript Received: 9 JAN 2013
Keywords
- epistasis network;
- machine learning;
- network analysis;
- network centrality;
- Systemic Lupus Erythematosus
Open-source tools are needed to facilitate the construction, analysis, and visualization of gene-gene interaction networks for sequencing data. To address this need, we present Encore, an open-source network analysis pipeline for genome-wide association studies and rare variant data. Encore constructs Genetic Association Interaction Networks (epistasis networks) using either of two approaches: our previous information-theoretic method or a generalized linear model. Encore also provides multiple data filtering options: Random Forest/Random Jungle for main effect enrichment, and Evaporative Cooling and Relief-F filters for enrichment of interaction effects. Encore implements SNPrank network centrality for identifying susceptibility hubs, that is, nodes that carry a large amount of disease susceptibility information through the combination of multivariate main effects and multiple gene-gene interactions in the network. It also produces files suitable for interactive visualization of a network using tools from our online Galaxy instance. We implemented these algorithms in C++ using OpenMP for shared-memory parallel analysis on a server or desktop. To demonstrate Encore's utility for genetic sequencing data, we present an analysis of exome resequencing data from healthy individuals and individuals with Systemic Lupus Erythematosus (SLE). Our results confirm the importance of the previously associated SLE genes HLA-DRB and NCF2, which had the highest gene-gene interaction degrees among the susceptibility hubs. An additional 14 genes previously associated with SLE emerged in our epistasis network model of the exome data, and three novel candidate genes, ST8SIA4, CMTM4, and C2CD4B, were implicated in the model. In summary, we present a comprehensive tool for epistasis network analysis and the first such analysis of exome data from a genetic study of SLE.
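
The idea of a susceptibility hub, a node that scores highly because of both its own main effect and its interactions, can be illustrated with a small PageRank-style power iteration. The sketch below is not Encore's C++ code, and the abstract does not give the SNPrank formula. The matrix layout (main effects on the diagonal, pairwise interaction strengths off the diagonal), the damping parameter `gamma`, the use of main effects as the teleport distribution, and the function name `centrality_sketch` are all assumptions made for illustration.

```python
def centrality_sketch(G, gamma=0.85, tol=1e-10, max_iter=1000):
    """PageRank-style centrality on a symmetric, non-negative matrix G.

    Assumed layout (hypothetical, for illustration only):
      G[i][i]  main-effect strength of variant i
      G[i][j]  interaction strength between variants i and j
    Returns a list of scores that sums to 1.
    """
    n = len(G)

    # Teleport distribution: proportional to main effects (uniform if all zero).
    diag = [G[i][i] for i in range(n)]
    total = sum(diag)
    teleport = [d / total for d in diag] if total > 0 else [1.0 / n] * n

    # Column sums of the off-diagonal (interaction) part of G.
    col = [sum(G[i][j] for i in range(n) if i != j) for j in range(n)]

    r = [1.0 / n] * n
    for _ in range(max_iter):
        # Score held by nodes without interactions is redistributed via teleport.
        dangling = sum(r[j] for j in range(n) if col[j] == 0)
        new_r = []
        for i in range(n):
            # Score flowing into node i along its interaction edges.
            flow = sum(
                G[i][j] / col[j] * r[j]
                for j in range(n)
                if j != i and col[j] > 0
            )
            new_r.append(gamma * flow + (gamma * dangling + 1.0 - gamma) * teleport[i])
        delta = sum(abs(a - b) for a, b in zip(new_r, r))
        r = new_r
        if delta < tol:
            break
    return r


if __name__ == "__main__":
    # Toy network: variant 0 has the largest main effect and interacts with all others.
    toy = [
        [0.5, 0.4, 0.3, 0.2],
        [0.4, 0.1, 0.0, 0.0],
        [0.3, 0.0, 0.1, 0.0],
        [0.2, 0.0, 0.0, 0.1],
    ]
    scores = centrality_sketch(toy)
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    print("ranking (most central first):", ranking)
```

In this toy matrix, variant 0 ranks first because it combines the strongest main effect with the most interactions, which is the pattern the abstract describes for susceptibility hubs. Using main effects as the teleport term is one way to let a single score reflect both sources of information. Other weightings are equally plausible.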