Research Article
Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set
Article first published online: 20 AUG 2009
DOI: 10.1002/sim.3707
Copyright © 2009 John Wiley & Sons, Ltd.
Additional Information
How to Cite
Archer, K. J. and Mas, V. R. (2009), Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set. Statistics in Medicine, 28: 3597–3610. doi: 10.1002/sim.3707
Publication History
- Issue published online: 20 NOV 2009
- Article first published online: 20 AUG 2009
- Manuscript Accepted: 20 JUL 2009
- Manuscript Received: 16 JAN 2009
Funded by
- National Institutes of Health/National Library of Medicine. Grant Number: 1R03LM009347-01A2
- National Institute of Diabetes and Digestive and Kidney Diseases. Grant Number: DK069859
Keywords:
- ordinal response;
- classification trees;
- machine learning;
- bootstrap aggregating;
- gene expression
Abstract
Many investigators conducting translational research are performing high-throughput genomic experiments and then developing multigenic classifiers using the resulting high-dimensional data set. In a large number of applications, the class to be predicted may be inherently ordinal. Examples of ordinal outcomes include tumor-node-metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, doing so discards the ordering information, which could otherwise improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and both perform well in comparison to other previously described methods. Developing a multigenic classifier is a common goal for microarray studies, and therefore application of the ordinal ensemble methods is demonstrated on a high-throughput methylation data set.
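To make concrete why a nominal splitting criterion loses ordering information, the following is a minimal sketch in Python. It assumes one common form of an ordinal impurity function, the sum over cut points of F_j(1 - F_j), where F_j is the cumulative proportion of observations in classes up to j with equally spaced class scores; this may not match the exact definition used in the article. The function names (ordinal_impurity, gini_impurity, bootstrap_sample, aggregate_votes) and the majority-vote aggregation rule are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter


def ordinal_impurity(labels, n_classes):
    """Assumed form: sum over cut points j of F_j * (1 - F_j), where F_j is
    the cumulative proportion of labels in classes 0..j."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    impurity, cumulative = 0.0, 0
    # The last cumulative proportion is always 1, so it contributes nothing.
    for j in range(n_classes - 1):
        cumulative += counts.get(j, 0)
        f = cumulative / n
        impurity += f * (1.0 - f)
    return impurity


def gini_impurity(labels, n_classes):
    """Nominal Gini index, which ignores the ordering of the classes."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((counts.get(k, 0) / n) ** 2 for k in range(n_classes))


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap replicate)."""
    return [rng.choice(rows) for _ in rows]


def aggregate_votes(predictions):
    """Majority vote across trees (an assumed aggregation rule);
    ties go to the lowest class."""
    counts = Counter(predictions)
    return max(counts, key=lambda k: (counts[k], -k))


# Classes coded 0..3, e.g. stage I..IV.
adjacent = [0, 0, 1, 1]   # node mixing neighbouring stages
distant = [0, 0, 3, 3]    # node mixing the two extreme stages

print(gini_impurity(adjacent, 4), gini_impurity(distant, 4))
print(ordinal_impurity(adjacent, 4), ordinal_impurity(distant, 4))
```

Under these assumptions the Gini index scores both nodes identically, while the ordinal impurity penalizes the node that mixes the extreme classes more heavily. In a bagged ensemble, each tree would be grown on a replicate from bootstrap_sample and the per-tree predictions combined with a rule such as aggregate_votes.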
