Seed-based systematic discovery of specific transcription factor target genes

Authors

  • Ralf Mrowka,

    1.  Paul-Ehrlich-Zentrum für Experimentelle Medizin, Berlin, Germany
    2.  AG Systems Biology – Computational Physiology, Berlin, Germany
    3.  Johannes-Müller-Institut für Physiologie, Charité-Universitätsmedizin Berlin, Germany
  • Nils Blüthgen,

    1.  School of Chemical Engineering and Analytical Sciences, Manchester Interdisciplinary Biocentre, University of Manchester, UK
  • Michael Fähling

    1.  Paul-Ehrlich-Zentrum für Experimentelle Medizin, Berlin, Germany
    2.  Johannes-Müller-Institut für Physiologie, Charité-Universitätsmedizin Berlin, Germany

R. Mrowka, Paul-Ehrlich-Zentrum für Experimentelle Medizin, AG Systems Biology – Computational Physiology, Tucholskystr. 2, D-10117 Berlin, Germany
Fax: +49 30 450528972
Tel: +49 30 450528218
E-mail: ralf.mrowka@charite.de

Abstract

Reliable prediction of specific transcription factor target genes is a major challenge in systems biology and functional genomics. Current sequence-based methods yield many false predictions because DNA-binding motifs are short and degenerate. Here, we describe a new systematic genome-wide approach, the seed-distribution-distance method, that searches large-scale genome-wide expression data for genes whose expression resembles that of known targets (the seed). The method identifies genes that are likely targets, allowing sequence-based methods to focus on a subset of genes and thereby produce fewer false-positive predictions. We show by cross-validation that the method is robust in recovering specific target genes. Furthermore, it identifies genes that share the typical functions and binding motifs of the seed. The method is illustrated by predicting novel targets of the transcription factor nuclear factor kappaB (NF-κB). Among the new targets is optineurin, which plays a key role in the pathogenesis of acquired blindness caused by adult-onset primary open-angle glaucoma. We show experimentally that the optineurin gene and other predicted genes are targets of NF-κB. Thus, our data provide a missing link between NF-κB signalling and the damping function of optineurin in the signalling feedback of NF-κB. We present a robust and reliable method that enhances the genome-wide prediction of specific transcription factor target genes by exploiting the vast amount of expression information available in public databases today.
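
The general idea of ranking genes by how closely their expression follows that of a seed of known targets can be sketched as below. This is only an illustrative sketch and not the authors' implementation: the abstract does not specify the distance measure, so the score used here (mean Pearson correlation to the seed genes minus mean correlation to the remaining genes) is an assumption, as are the function names `pearson` and `rank_candidates` and the toy expression profiles.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def rank_candidates(expression, seeds):
    """Rank non-seed genes by an assumed seed-versus-background score.

    expression: dict mapping gene name -> list of expression values,
                one value per condition (hypothetical input layout).
    seeds:      names of known target genes.
    """
    seed_set = set(seeds)
    candidates = [g for g in expression if g not in seed_set]
    scores = {}
    for g in candidates:
        # Correlations of the candidate with each known target.
        to_seeds = [pearson(expression[g], expression[s]) for s in seeds]
        # Correlations of the candidate with all other non-seed genes.
        to_background = [
            pearson(expression[g], expression[b]) for b in candidates if b != g
        ]
        background = mean(to_background) if to_background else 0.0
        # Assumed score: shift of the seed distribution relative to background.
        scores[g] = mean(to_seeds) - background
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration only.
expression = {
    "seed_a": [1.0, 2.0, 3.0, 4.0],
    "seed_b": [2.0, 4.0, 6.0, 8.5],
    "gene_x": [0.5, 1.1, 1.4, 2.0],  # rises together with the seeds
    "gene_y": [4.0, 3.0, 2.0, 1.0],  # falls while the seeds rise
    "gene_z": [1.0, 3.0, 1.0, 3.0],  # oscillates
}
ranking = rank_candidates(expression, ["seed_a", "seed_b"])
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

In a workflow like the one described above, the top-ranked genes would then be passed to a sequence-based motif search, so that binding-site prediction is run only on a reduced candidate set.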
