Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles

Authors

  • Gianluca Pollastri,

    1. Department of Information and Computer Science, Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California
  • Dariusz Przybylski,

    1. Department of Biochemistry and Molecular Biophysics, CUBIC, Columbia University, New York, New York
  • Burkhard Rost,

    1. Department of Biochemistry and Molecular Biophysics, CUBIC, Columbia University, New York, New York
  • Pierre Baldi

    Corresponding author
    1. Department of Information and Computer Science, Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California
    • Department of Information and Computer Science, Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, CA 92697-3425

Abstract

Secondary structure predictions are increasingly becoming the workhorse for several methods aiming at predicting protein structure and function. Here we use ensembles of bidirectional recurrent neural network architectures, PSI-BLAST-derived profiles, and a large nonredundant training set to derive two new predictors: (a) the second version of the SSpro program for secondary structure classification into three categories and (b) the first version of the SSpro8 program for secondary structure classification into the eight classes produced by the DSSP program. We report results on three different test sets, on which SSpro achieved a sustained performance of about 78% correct prediction. We report confusion matrices, compare PSI-BLAST-derived to BLAST-derived profiles, and assess the corresponding performance improvements. SSpro and SSpro8 are implemented as web servers, available together with other structural feature predictors at: http://promoter.ics.uci.edu/BRNN-PRED/. Proteins 2002;47:228–235. © 2002 Wiley-Liss, Inc.
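
To make the evaluation terms in the abstract concrete, the following is a minimal sketch of how eight-class DSSP labels might be reduced to three classes and how a confusion matrix and the percentage of correctly predicted residues (often called Q3) could be computed. It is not the authors' code. The 8-to-3 mapping shown is one common convention and is an assumption here, since the abstract does not state which reduction was used; the function names and the toy sequences are hypothetical.

```python
from collections import Counter

# ASSUMPTION: one common 8-to-3 reduction (H,G,I -> H; E,B -> E; rest -> C).
# The abstract does not say which mapping SSpro uses. "-" stands for the
# DSSP "no assignment" state, which DSSP itself writes as a blank.
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "C", "S": "C", "-": "C",
}


def reduce_to_three(dssp8):
    """Map a string of eight-class DSSP labels to three classes (H, E, C)."""
    return "".join(DSSP8_TO_3[state] for state in dssp8)


def confusion_matrix(observed, predicted, classes="HEC"):
    """Return nested dicts: matrix[observed_class][predicted_class] = count."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    counts = Counter(zip(observed, predicted))
    return {o: {p: counts[(o, p)] for p in classes} for o in classes}


def percent_correct(observed, predicted):
    """Percentage of residues whose predicted class equals the observed one."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Toy example (made-up labels, not data from the paper).
observed_3 = reduce_to_three("HHHGGTEEEB-S")
predicted_3 = "HHHHCCEEECCC"
matrix = confusion_matrix(observed_3, predicted_3)
score = percent_correct(observed_3, predicted_3)
```

In this sketch the rows of `matrix` are observed classes and the columns are predicted classes, so off-diagonal entries show which classes are confused with each other. The same two functions would apply to eight-class output by passing the eight DSSP symbols as `classes`.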
