Prediction of Disorder
Predicting intrinsic disorder from amino acid sequence
Article first published online: 15 OCT 2003
DOI: 10.1002/prot.10532
Copyright © 2003 Wiley-Liss, Inc.
Issue
Proteins: Structure, Function, and Bioinformatics
Supplement: Fifth Meeting on the Critical Assessment of Techniques for Protein Structure Prediction
Volume 53, Issue Supplement 6, pages 566–572, 2003
Additional Information
How to Cite
Obradovic, Z., Peng, K., Vucetic, S., Radivojac, P., Brown, C. J. and Dunker, A. K. (2003), Predicting intrinsic disorder from amino acid sequence. Proteins, 53: 566–572. doi: 10.1002/prot.10532
Publication History
- Issue published online: 15 OCT 2003
- Article first published online: 15 OCT 2003
- Manuscript Accepted: 17 JUN 2003
- Manuscript Received: 17 FEB 2003
Funded by
- National Institutes of Health. Grant Number: 1R01 LM06916
- National Science Foundation. Grant Numbers: CSE-IIS-971153, CSE-IIS-0196237
- Amgen, Inc. of Thousand Oaks, CA
Keywords:
- natively unfolded;
- intrinsically disordered;
- neural networks;
- ordinary least squares regression;
- machine learning
Abstract
Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61-residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence. Proteins 2003;53:566–572. © 2003 Wiley-Liss, Inc.
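Two of the quantities mentioned in the abstract can be illustrated with a short sketch: smoothing per-residue output values over a 61-residue window, and reporting order accuracy, disorder accuracy, and their average. This is not the authors' implementation. The function names, the 0.5 decision threshold, the truncation of windows at the sequence ends, and the reading of "average" as an unweighted mean of the two per-class accuracies are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def window_average(scores: Sequence[float], window: int = 61) -> List[float]:
    """Centered moving average of per-residue scores.

    Assumption: windows are truncated at the sequence ends rather than padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(scores)
    # Prefix sums make each window sum an O(1) lookup.
    prefix = [0.0]
    for s in scores:
        prefix.append(prefix[-1] + s)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return smoothed


def call_disorder(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a residue disordered when its score reaches the (assumed) threshold."""
    return [s >= threshold for s in scores]


def order_disorder_accuracy(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float, float]:
    """Return (order accuracy, disorder accuracy, unweighted mean of the two).

    True marks a disordered residue, False an ordered one.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    n_disorder = sum(1 for a in actual if a)
    n_order = len(actual) - n_disorder
    if n_order == 0 or n_disorder == 0:
        raise ValueError("both ordered and disordered residues are required")
    # Count correct calls separately within each true class.
    hit_disorder = sum(1 for p, a in zip(predicted, actual) if a and p)
    hit_order = sum(1 for p, a in zip(predicted, actual) if not a and not p)
    acc_order = hit_order / n_order
    acc_disorder = hit_disorder / n_disorder
    return acc_order, acc_disorder, (acc_order + acc_disorder) / 2
```

Averaging the two per-class accuracies, rather than pooling all residues, keeps the far larger ordered set (9,044 residues against 565 disordered) from dominating the score. The sketch also shows why a wide window can hurt short segments: a disordered stretch much shorter than the window is averaged with its ordered neighbors and may fall below the threshold.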
