Prediction and validation of the unexplored RNA-binding protein atlas of the human proteome

Authors

  • Huiying Zhao,

    1. School of Informatics and Computing, Indiana University Purdue University, Indianapolis, Indiana
    2. Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana
  • Yuedong Yang,

    1. School of Informatics and Computing, Indiana University Purdue University, Indianapolis, Indiana
    2. Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana
    3. Institute for Glycomics and School of Informatics and Communication Technology, Griffith University, Southport, Queensland, Australia
  • Sarath Chandra Janga,

    1. School of Informatics and Computing, Indiana University Purdue University, Indianapolis, Indiana
    2. Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana
  • C. Cheng Kao,

    1. Department of Molecular and Cellular Biochemistry, Indiana University, Bloomington, Indiana
  • Yaoqi Zhou

    Corresponding author
    1. School of Informatics and Computing, Indiana University Purdue University, Indianapolis, Indiana
    2. Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana
    3. Institute for Glycomics and School of Informatics and Communication Technology, Griffith University, Southport, Queensland, Australia
    • Correspondence to: Yaoqi Zhou, School of Informatics, Indiana University Purdue University, Indianapolis, Indiana, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 719 Indiana Ave Ste 319, Walker Plaza Building, Indianapolis, IN 46202. E-mail: yaoqi.zhou@griffith.edu.au


  • Huiying Zhao and Yuedong Yang contributed equally to this work.

ABSTRACT

Detecting protein-RNA interactions is challenging both experimentally and computationally because RNAs are large in number, diverse in cellular location and function, and flexible in structure. As a result, many RNA-binding proteins (RBPs) remain to be identified. Here, SPOT-Seq, a template-based function-prediction technique for RBPs, is applied to the human proteome, and its results are validated against a recent proteomic experimental discovery of 860 mRNA-binding proteins (mRBPs). The coverage (or sensitivity) is 42.6% for 1217 known RBPs annotated in the Gene Ontology and 43.6% for the 860 newly discovered human mRBPs. This consistent sensitivity indicates the robust performance of SPOT-Seq for predicting RBPs. More importantly, SPOT-Seq detects 2418 novel RBPs in the human proteome, 291 of which were validated by the newly discovered mRBP set. Among the 291 validated novel RBPs, 61 are not homologous to any known RBPs. Successful validation of the predicted novel RBPs permits further analysis of their phenotypic roles in disease pathways. The dataset of 2418 predicted novel RBPs, along with confidence levels and complex structures, is available at http://sparks-lab.org (in publications) for experimental confirmation and hypothesis generation. Proteins 2014; 82:640–647. © 2013 Wiley Periodicals, Inc.
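
The coverage figures quoted in the abstract follow the usual definition of sensitivity: the fraction of a reference set that the predictor recovers. The Python sketch below is only an illustration of that arithmetic and is not part of SPOT-Seq. The function names are hypothetical, and the hit counts are approximations back-calculated from the rounded percentages, since the abstract reports percentages rather than raw counts.

```python
def sensitivity(detected: int, reference_total: int) -> float:
    """Fraction of a reference set of known binders that was detected."""
    if reference_total <= 0:
        raise ValueError("reference_total must be positive")
    return detected / reference_total


def implied_hits(reported_sensitivity: float, reference_total: int) -> int:
    """Approximate hit count implied by a rounded, reported sensitivity.

    Assumption: the true count is the integer nearest to
    sensitivity * total; the abstract does not state it directly.
    """
    return round(reported_sensitivity * reference_total)


# Reference set sizes and sensitivities as stated in the abstract.
GO_TOTAL, GO_SENSITIVITY = 1217, 0.426
MRBP_TOTAL, MRBP_SENSITIVITY = 860, 0.436

# Back-calculated (approximate) numbers of recovered proteins.
go_hits = implied_hits(GO_SENSITIVITY, GO_TOTAL)
mrbp_hits = implied_hits(MRBP_SENSITIVITY, MRBP_TOTAL)

# Ratios that follow directly from counts stated in the abstract.
validated_fraction = 291 / 2418       # validated novel RBPs / predicted novel RBPs
non_homologous_fraction = 61 / 291    # non-homologous / validated novel RBPs

if __name__ == "__main__":
    print(f"GO set: ~{go_hits} of {GO_TOTAL} recovered "
          f"({sensitivity(go_hits, GO_TOTAL):.1%})")
    print(f"mRBP set: ~{mrbp_hits} of {MRBP_TOTAL} recovered "
          f"({sensitivity(mrbp_hits, MRBP_TOTAL):.1%})")
    print(f"Validated share of novel predictions: {validated_fraction:.1%}")
    print(f"Non-homologous share of validated set: {non_homologous_fraction:.1%}")
```

Recomputing the sensitivity from the back-calculated counts gives the same rounded percentages as the abstract, which is a consistency check on the arithmetic only and not an independent confirmation of the underlying counts.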
