Predicting protein folding rates using the concept of Chou's pseudo amino acid composition

Authors

  • Jianxiu Guo,

    1. School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China
  • Nini Rao,

    Corresponding author
    1. School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China
  • Guangxiong Liu,

    1. School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China
  • Yong Yang,

    1. School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China
  • Gang Wang

    1. School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China

Abstract

One of the most important challenges in computational and molecular biology is to understand the relationship between amino acid sequences and the folding rates of proteins. Recent work suggests that topological parameters, amino acid properties, chain length, and the composition index correlate well with protein folding rates; however, sequence order information has seldom been considered as a property for predicting them. In this study, amino acid sequence order was used to derive an effective method, based on an extended version of the pseudo-amino acid composition, for predicting protein folding rates without any explicit structural information. The method was evaluated with the jackknife cross-validation test on the largest dataset reported (99 proteins) and gave a good correlation between predicted and experimental folding rates: the correlation coefficient is 0.81 (highly significant) and the standard error is 2.46. On the same dataset, the reported algorithm performed better than several representative sequence-based approaches. The results indicate that sequence order information is an important determinant of protein folding rates. ©2011 Wiley Periodicals, Inc. J Comput Chem 2011.
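
The abstract does not spell out the extended pseudo-amino acid composition or the regression model used, so the Python sketch below illustrates only the general idea under stated assumptions. It computes the classic type-1 pseudo-amino acid composition (20 composition terms plus `lam` sequence-order correlation terms) and runs a jackknife (leave-one-out) loop around an arbitrary model. The Kyte-Doolittle hydropathy scale is an assumed stand-in for whatever residue properties the authors used, the weight `w = 0.05` is a commonly used default rather than a value from this paper, and all function names are hypothetical.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values. ASSUMPTION: a single illustrative
# property; the paper's actual residue properties are not given here.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    return {aa: (value - mu) / sigma for aa, value in scale.items()}


def pseaac(seq, lam=3, w=0.05, scale=HYDROPATHY):
    """Type-1 pseudo-amino acid composition: a list of 20 + lam numbers.

    Assumes seq contains only the 20 standard one-letter residue codes.
    """
    if not 0 < lam < len(seq):
        raise ValueError("lam must be positive and smaller than len(seq)")
    h = standardize(scale)
    # Conventional amino acid composition (order-independent part).
    freqs = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    # Sequence-order correlation factors for gaps k = 1..lam.
    thetas = [
        mean((h[a] - h[b]) ** 2 for a, b in zip(seq, seq[k:]))
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def jackknife_predictions(features, targets, fit):
    """Predict each sample with a model fitted on all the other samples.

    fit(train_x, train_y) must return a callable mapping one feature
    vector to one predicted value.
    """
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(model(features[i]))
    return predictions


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

In use, one would build `features = [pseaac(s) for s in sequences]`, pass any regression routine as `fit`, and compare the jackknife predictions with the experimental folding rates via `pearson`. Keeping the model behind a callable is a design choice of this sketch: it separates the leave-one-out protocol from the choice of regressor, which the abstract leaves unspecified.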
