
A Hierarchical Semiparametric Model for Incorporating Intergene Information for Analysis of Genomic Data

Authors

  • Long Qu,

    Corresponding author
    1. BioStat Solutions, Inc., 114 South Main Street Suite 2, Mount Airy, Maryland 21771, U.S.A.
      email: lqu@biostatsolutions.com
  • Dan Nettleton,

    Corresponding author
    1. Department of Statistics, Iowa State University, 2115 Snedecor Hall, Ames, Iowa 50011, U.S.A.
      email: dnett@iastate.edu
  • Jack C. M. Dekkers

    Corresponding author
    1. Department of Animal Science, Iowa State University, 239D Kildee Hall, Ames, Iowa 50011, U.S.A.
      email: jdekkers@iastate.edu


Abstract

Summary: For analysis of genomic data, e.g., microarray data from gene expression profiling experiments, the two-component mixture model has been widely used in practice to detect differentially expressed genes. However, it naïvely imposes strong exchangeability assumptions across genes and does not make active use of a priori information about intergene relationships that is currently available, e.g., gene annotations through the Gene Ontology (GO) project. We propose a general strategy that first generates a set of covariates that summarizes the intergene information and then extends the two-component mixture model into a hierarchical semiparametric model utilizing the generated covariates through latent nonparametric regression. Simulations and analysis of real microarray data show that our method can outperform the naïve two-component mixture model.
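
As a rough illustration of the idea summarized above, the following minimal sketch contrasts a two-component mixture that uses one shared null proportion for every gene with a variant in which the null proportion depends on a gene-level covariate. It is not the authors' model: the normal densities, the logistic link (a parametric stand-in for the latent nonparametric regression), and all names and parameter values (`posterior_null_prob`, `covariate_pi0`, `alt_sd`, `intercept`, `slope`) are assumptions made purely for illustration.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def posterior_null_prob(z, pi0, alt_mean=0.0, alt_sd=3.0):
    """P(gene is null | z) under f(z) = pi0 * f0(z) + (1 - pi0) * f1(z).

    f0 is taken to be N(0, 1) and f1 a wider normal; both choices are
    illustrative assumptions, not taken from the article.
    """
    f0 = normal_pdf(z)
    f1 = normal_pdf(z, alt_mean, alt_sd)
    return pi0 * f0 / (pi0 * f0 + (1.0 - pi0) * f1)


def covariate_pi0(x, intercept=2.0, slope=-1.5):
    """Hypothetical covariate-dependent null proportion (logistic link).

    x stands for a generated covariate summarizing intergene information;
    the coefficients are arbitrary and chosen only for the example.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


z = 2.5  # test statistic for one gene

# Naive model: every gene shares the same null proportion.
naive = posterior_null_prob(z, pi0=0.9)

# Covariate-informed model: this gene's covariate lowers its null proportion.
informed = posterior_null_prob(z, pi0=covariate_pi0(2.0))

print(f"naive posterior null probability:    {naive:.3f}")
print(f"informed posterior null probability: {informed:.3f}")
```

In this sketch the same test statistic yields a smaller posterior null probability once the covariate indicates that the gene is more likely to be differentially expressed, which is the kind of borrowing of intergene information the abstract describes.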
