Optimal protein-RNA area, OPRA: A propensity-based method to identify RNA-binding sites on proteins

Authors

  • Laura Pérez-Cano

    1. Department of Life Sciences, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
  • Juan Fernández-Recio

    Corresponding author
    1. Department of Life Sciences, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
    • Department of Life Sciences, Barcelona Supercomputing Center (BSC), Jordi Girona 29, Barcelona 08034, Spain

  • The authors state no conflict of interest.

Abstract

Protein-RNA interactions are essential in living organisms and are involved in many diverse and important cellular processes. Thus, understanding protein-RNA recognition at the molecular level is a key goal, not only from a basic biological point of view but also for biotechnological and therapeutic purposes. On the basis of the most up-to-date available set of nonredundant X-ray structures of protein-RNA complexes, we have computed protein-RNA interface propensities for ribonucleotides and amino acid residues. The results show several protein residues with a high tendency to bind RNA, such as arginine, lysine, and histidine. However, we could not observe any clear preferences for protein binding among the different ribonucleotides. We applied these propensity values to predict RNA-binding areas on proteins, using an ad hoc algorithm called OPRA (Optimal Protein-RNA Area). First, for each protein residue, we derived a predictive score from its corresponding protein-RNA interface propensity weighted by its accessible surface area (ASA). Then, optimal patch energy scores were computed for each residue by adding up the individual scores of the neighboring surface residues. The resulting patch scores correlate well with the known RNA-binding sites on protein surfaces. The OPRA method has been benchmarked on a test set of 30 unbound proteins involved in protein-RNA complexes of known structure, where it is able to successfully predict RNA-binding sites on protein surfaces with around 80% positive predictive value. This can be useful for identifying potential RNA-binding sites on proteins, and can help to model protein-RNA interactions of biological and therapeutic interest. Proteins 2010. © 2009 Wiley-Liss, Inc.
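
The two-step scoring described in the abstract (a per-residue score from propensity weighted by ASA, then a patch score summed over neighboring surface residues) can be illustrated with a minimal Python sketch. This is not the published OPRA implementation: the propensity values, the simple product used for the weighting, the 10 Å neighbor radius, the ASA cutoff for "surface", and the sign convention (higher means more RNA-binding-like) are all assumptions made for illustration, and the function names are hypothetical. The last function shows how positive predictive value is conventionally computed (true positives divided by all predicted positives).

```python
import math

# HYPOTHETICAL propensity values, for illustration only.
# They are NOT the values derived in the paper.
PROPENSITY = {"ARG": 1.0, "LYS": 0.8, "HIS": 0.5, "GLY": 0.0, "ASP": -0.9, "GLU": -0.8}


def residue_score(resname, asa, propensity=PROPENSITY):
    """Per-residue score: interface propensity weighted by ASA.

    Assumption: the weighting is a plain product; the abstract does not
    give the exact formula.
    """
    return propensity.get(resname, 0.0) * asa


def patch_scores(residues, radius=10.0, min_asa=1.0):
    """Patch score for each residue: sum of the scores of surface
    residues within `radius` of it (the residue itself included if it
    is on the surface).

    residues: list of (resname, asa, (x, y, z)) tuples.
    radius and min_asa are assumed cutoffs, not values from the paper.
    """
    surface = [r for r in residues if r[1] >= min_asa]
    scores = []
    for _name, _asa, center in residues:
        total = sum(
            residue_score(name, asa)
            for name, asa, xyz in surface
            if math.dist(center, xyz) <= radius
        )
        scores.append(total)
    return scores


def positive_predictive_value(predicted, actual):
    """PPV = true positives / all predicted positives."""
    predicted, actual = set(predicted), set(actual)
    if not predicted:
        return 0.0
    return len(predicted & actual) / len(predicted)


# Toy input: two nearby basic residues and one distant acidic residue.
toy = [
    ("ARG", 50.0, (0.0, 0.0, 0.0)),
    ("LYS", 40.0, (3.0, 0.0, 0.0)),
    ("ASP", 30.0, (50.0, 0.0, 0.0)),
]
toy_scores = patch_scores(toy)
```

In this toy input the two basic residues fall in each other's patch and share a high score, while the isolated aspartate scores low. In practice the residue names, ASA values, and coordinates would come from a structure file and a solvent-accessibility calculation, neither of which is shown here.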

Ancillary