
Encore: Genetic Association Interaction Network Centrality Pipeline and Application to SLE Exome Data

Authors

  • Nicholas A. Davis,

    1. Tandy School of Computer Science, University of Tulsa, Tulsa, Oklahoma
    Current affiliation:
    1. Medical Informatics, University of Oklahoma-Tulsa, Tulsa, OK
    • These authors contributed equally to this work.

  • Caleb A. Lareau,

    1. Department of Mathematics, University of Tulsa, Tulsa, Oklahoma
    2. Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma
    • These authors contributed equally to this work.

  • Bill C. White,

    1. Tandy School of Computer Science, University of Tulsa, Tulsa, Oklahoma
  • Ahwan Pandey,

    1. Tandy School of Computer Science, University of Tulsa, Tulsa, Oklahoma
  • Graham Wiley,

    1. Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma
  • Courtney G. Montgomery,

    1. Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma
  • Patrick M. Gaffney,

    1. Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma
  • B. A. McKinney

    Corresponding author
    1. Tandy School of Computer Science, University of Tulsa, Tulsa, Oklahoma
    2. Department of Mathematics, University of Tulsa, Tulsa, Oklahoma
    • Correspondence to: B. A. McKinney, Tandy School of Computer Science, University of Tulsa, 800 S. Tucker Drive, Tulsa, OK 74104. E-mail: brett.mckinney@gmail.com


ABSTRACT

Open source tools are needed to facilitate the construction, analysis, and visualization of gene-gene interaction networks for sequencing data. To address this need, we present Encore, an open source network analysis pipeline for genome-wide association studies and rare variant data. Encore constructs Genetic Association Interaction Networks, or epistasis networks, using either of two approaches: our previous information-theory method or a generalized linear model approach. Encore also includes multiple data filtering options, including Random Forest/Random Jungle for main effect enrichment and Evaporative Cooling and Relief-F filters for enrichment of interaction effects. Encore implements SNPrank network centrality for identifying susceptibility hubs (nodes that carry a large amount of disease susceptibility information through the combination of multivariate main effects and multiple gene-gene interactions in the network), and it provides files suitable for interactive visualization of a network using tools from our online Galaxy instance. We implemented these algorithms in C++ using OpenMP for shared-memory parallel analysis on a server or desktop. To demonstrate Encore's utility in the analysis of genetic sequencing data, we present an analysis of exome resequencing data from healthy individuals and those with Systemic Lupus Erythematosus (SLE). Our results verify the importance of the previously associated SLE genes HLA-DRB and NCF2, and these two genes had the highest gene-gene interaction degrees among the susceptibility hubs. An additional 14 genes previously associated with SLE emerged in our epistasis network model of the exome data, and three novel candidate genes, ST8SIA4, CMTM4, and C2CD4B, were implicated in the model. In summary, we present a comprehensive tool for epistasis network analysis and the first such analysis of exome data from a genetic study of SLE.
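
To illustrate the idea of a susceptibility hub, the following is a minimal sketch of a generic PageRank-style centrality on a small symmetric matrix whose diagonal holds main-effect weights and whose off-diagonal entries hold gene-gene interaction weights. It is not Encore's C++ code and does not reproduce the exact SNPrank formula; the function name `pagerank_style_centrality`, the damping value `gamma`, and the toy matrix are assumptions made purely for illustration.

```python
def pagerank_style_centrality(G, gamma=0.85, tol=1e-10, max_iter=1000):
    """Generic PageRank-style scores for a symmetric, non-negative matrix G.

    G[i][i] is treated as a main-effect weight and G[i][j] (i != j) as an
    interaction weight. This is an illustrative assumption, not the SNPrank
    formula used by Encore.
    """
    n = len(G)
    # Weighted interaction degree of each node (off-diagonal column sum).
    degree = [sum(G[i][j] for i in range(n) if i != j) for j in range(n)]
    main = [G[i][i] for i in range(n)]
    total_main = sum(main)
    # Teleport distribution proportional to main effects (uniform if all zero).
    if total_main > 0:
        teleport = [m / total_main for m in main]
    else:
        teleport = [1.0 / n] * n

    r = [1.0 / n] * n
    for _ in range(max_iter):
        # Score held by nodes with no interactions is redistributed via teleport.
        dangling = sum(r[j] for j in range(n) if degree[j] == 0)
        new = []
        for i in range(n):
            flow = sum(
                G[i][j] * r[j] / degree[j]
                for j in range(n)
                if j != i and degree[j] > 0
            )
            new.append(gamma * (flow + dangling * teleport[i])
                       + (1.0 - gamma) * teleport[i])
        norm = sum(new)
        new = [x / norm for x in new]
        converged = sum(abs(a - b) for a, b in zip(new, r)) < tol
        r = new
        if converged:
            break
    return r


# Toy example: node 0 has the largest main effect and interacts with both others.
toy = [
    [0.5, 1.0, 1.0],
    [1.0, 0.1, 0.0],
    [1.0, 0.0, 0.1],
]
scores = pagerank_style_centrality(toy)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

In this toy network, node 0 combines the strongest main effect with the most interactions, so it receives the highest score, which is the intuition behind ranking nodes by centrality rather than by main effect alone.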
