Using pseudo amino acid composition to predict protein structural classes: Approached with complexity measure factor

Authors

  • Xuan Xiao,

    1. Institute of Information, Donghua University, Shanghai 200051, People's Republic of China
    2. Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 33300, People's Republic of China
  • Shi-Huang Shao,

    1. Institute of Information, Donghua University, Shanghai 200051, People's Republic of China
  • Zheng-De Huang,

    1. Institute of Information, Donghua University, Shanghai 200051, People's Republic of China
  • Kuo-Chen Chou

    Corresponding author
    1. Institute of Information, Donghua University, Shanghai 200051, People's Republic of China
    2. Gordon Life Science Institute, 13784 Torrey Del Mar Drive, San Diego, California 92130

Abstract

The structural class is an important feature widely used to characterize the overall folding type of a protein. Improving the quality of protein structural class prediction by effectively incorporating sequence-order effects is an important and challenging problem. Based on the concept of the pseudo amino acid composition [Chou, K. C. Proteins Struct Funct Genet 2001, 43, 246; Erratum: Proteins Struct Funct Genet 2001, 44, 60], a novel approach for measuring the complexity of a protein sequence was introduced. The advantage of incorporating the complexity measure factor into the pseudo amino acid composition as one of its components is that it can capture the essence of the overall sequence pattern of a protein and hence reflect its sequence-order effects more effectively. The jackknife cross-validation test demonstrated that the overall success rate of the new approach was significantly higher than those of the other approaches. It has not escaped our notice that the complexity measure factor can also be used to improve the prediction quality for, among many other protein attributes, subcellular localization, enzyme family class, membrane protein type, and G-protein-coupled receptor type. © 2006 Wiley Periodicals, Inc. J Comput Chem 27: 478–482, 2006
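
To make the idea of appending a complexity factor to the amino acid composition concrete, here is a minimal Python sketch. The abstract does not state which complexity measure is used or how the components are weighted, so everything below is an assumption made for illustration: the complexity is taken to be a Lempel-Ziv (1976) phrase count, it is divided by the sequence length as a crude normalization, and the function names `lz76_complexity` and `pseaac_with_complexity` and the parameter `weight` are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def lz76_complexity(seq: str) -> int:
    """Count phrases in a Lempel-Ziv (1976) exhaustive-history parsing.

    Each phrase is the shortest substring starting at the current
    position that has not already appeared in the text preceding its
    last character. (Assumed stand-in for the complexity measure.)
    """
    n = len(seq)
    i = 0
    count = 0
    while i < n:
        j = i + 1
        # Extend the candidate phrase while it can be copied from earlier text.
        while j <= n and seq[i:j] in seq[:j - 1]:
            j += 1
        count += 1
        i = j
    return count


def pseaac_with_complexity(seq: str, weight: float = 0.1) -> list:
    """Return a 21-component vector: 20 composition terms plus one
    complexity term. The weight and the normalization are assumptions."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    n = len(seq)
    counts = Counter(seq)
    freqs = [counts[a] / n for a in AMINO_ACIDS]
    complexity = lz76_complexity(seq) / n  # crude length normalization
    denom = sum(freqs) + weight * complexity
    # Normalize so that all 21 components sum to 1.
    return [f / denom for f in freqs] + [weight * complexity / denom]


if __name__ == "__main__":
    vector = pseaac_with_complexity("MKVLAAGIVGLLLAQ")
    print(len(vector), round(sum(vector), 6))
```

A vector built this way could be passed to any classifier and evaluated with a leave-one-out (jackknife) loop; the choice of classifier and of `weight` is left open here because the abstract does not specify them.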
