Supersecondary structure prediction using Chou's pseudo amino acid composition



Supersecondary structures (SSSs) are the building blocks of protein 3D structures. Accurate prediction of SSSs can be an important step toward building a tertiary structure from a specified secondary structure. Improving the accuracy of SSS prediction by effectively incorporating sequence order effects is an important and challenging problem. Based on a different form of Chou's pseudo amino acid composition, a novel approach for feature representation of SSSs is proposed. Basic amino acid compositions, dipeptide components, and the amino acid composition distribution are incorporated to represent the compositional features of proteins. Each supersecondary structural motif is characterized as a 36-dimensional vector. In addition, we propose a novel prediction system that uses SVM and the IDQD algorithm as classifiers. Our method is trained and tested on the ArchDB40 dataset, which contains 3088 proteins. The highest overall accuracies for the training dataset and the independent testing dataset are 77.7% and 69.4%, respectively. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011
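
As a rough illustration of two of the compositional features named above, the following minimal sketch computes a plain amino acid composition and a dipeptide composition for a single sequence. It is not the 36-dimensional representation used in this work, whose exact construction is not given here; the function names, the 20-letter residue alphabet, and the choice to skip non-standard residues are assumptions made only for this example.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard residues in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return the 20 residue frequencies of seq (hypothetical helper)."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the 400 adjacent-pair frequencies of seq (hypothetical helper)."""
    seq = seq.upper()
    # Count only pairs in which both residues are standard.
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        return [0.0] * len(DIPEPTIDES)
    counts = Counter(pairs)
    return [counts[d] / len(pairs) for d in DIPEPTIDES]


if __name__ == "__main__":
    motif = "AAGG"  # toy sequence, not taken from ArchDB40
    print(amino_acid_composition(motif)[:6])
    print(sum(dipeptide_composition(motif)))
```

Vectors of this kind could be passed to any standard classifier; how they are reduced or combined into the 36 dimensions reported here is left unspecified in this sketch.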