Keywords:

  • pseudo amino acid composition;
  • grey dynamic model;
  • grey-PseAA;
  • protein structural class;
  • covariant-discriminant algorithm;
  • Chou's invariance theorem;
  • misidentification matrix

Abstract

Representing a protein sample by its pseudo amino acid (PseAA) composition can incorporate a considerable amount of sequence pattern information and thereby improve the prediction quality for its structural or functional classification. However, how to optimally formulate the PseAA composition remains an important unsolved problem. In this article we introduce the grey modeling approach, which is particularly efficient in coping with complicated systems such as one consisting of many proteins with different sequence orders and lengths. On the basis of the grey model, four coefficients derived from each protein sequence concerned are adopted as its PseAA components. The PseAA composition thus formulated, called the “grey-PseAA” composition, can catch the essence of a protein sequence and better reflect its overall pattern. In our study we demonstrate that introducing the grey-PseAA composition can remarkably enhance the success rates in predicting protein structural class. It is anticipated that the concept of grey-PseAA composition can also be used to predict many other protein attributes, such as subcellular localization, membrane protein type, enzyme functional class, GPCR type, and protease type. © 2008 Wiley Periodicals, Inc. J Comput Chem 2008.