Research Article
Evaluating technologies for classification and prediction in medicine
Article first published online: 30 NOV 2005
DOI: 10.1002/sim.2431
Copyright © 2005 John Wiley & Sons, Ltd.
Issue
Statistics in Medicine
Special Issue: Papers from the 25th Annual Conference of the International Society for Clinical Biostatistics
Volume 24, Issue 24, pages 3687–3696, 30 December 2005
Additional Information
How to Cite
Pepe, M. S. (2005), Evaluating technologies for classification and prediction in medicine. Statistics in Medicine, 24: 3687–3696. doi: 10.1002/sim.2431
Publication History
- Issue published online: 30 NOV 2005
- Article first published online: 30 NOV 2005
- Manuscript Accepted: 6 SEP 2005
- Manuscript Received: 17 AUG 2005
Funded by
- National Institutes of Health. Grant Numbers: U01CA86368, R01GM54438
Keywords:
- diagnostic test;
- receiver operating characteristic;
- odds ratio;
- disease screening;
- prognosis
Abstract
Modern technologies promise to provide new ways of diagnosing disease, detecting subclinical disease, predicting prognosis, selecting patient-specific treatment, identifying subjects at risk for disease, and so forth. Advances in genomics, proteomics and imaging modalities in particular hold great potential for assisting with classification/prediction in medicine. Before a classifier can be adopted for routine use in health care, its classification accuracy must be determined. Standards for evaluating new clinical classifiers, however, lag far behind the well-established standards that exist for evaluating new clinical treatments.
In this paper, we discuss a phased approach to developing a new classifier (or biomarker). It mirrors the internationally established phase 1–2–3 paradigm for therapeutic drugs. The defined phases lead to a logical sequence of studies for classifier development. We emphasize that evaluating classification accuracy is fundamentally different from simply establishing association with outcome. Therefore, study objectives and designs differ from the familiar methods of clinical trials. We discuss these briefly for each phase.
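The distinction between association and classification accuracy can be illustrated with a small calculation. The sketch below assumes a binary marker summarized by an odds ratio and uses the standard 2x2-table identity OR = [TPR/(1 - TPR)] x [(1 - FPR)/FPR]; the function name `tpr_from_odds_ratio` and the example values are hypothetical and are not taken from the paper.

```python
def tpr_from_odds_ratio(odds_ratio: float, fpr: float) -> float:
    """True positive rate implied by an odds ratio at a given false positive rate.

    Hypothetical helper for illustration: assumes a binary marker, so that
    OR = [TPR / (1 - TPR)] * [(1 - FPR) / FPR].
    """
    if not 0.0 < fpr < 1.0:
        raise ValueError("fpr must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    # Odds of a positive marker among cases = OR * odds among controls.
    case_odds = odds_ratio * fpr / (1.0 - fpr)
    return case_odds / (1.0 + case_odds)


# Illustrative odds ratios, all evaluated at a 10% false positive rate.
for odds_ratio in (3.0, 16.0, 81.0):
    tpr = tpr_from_odds_ratio(odds_ratio, fpr=0.10)
    print(f"OR = {odds_ratio:5.1f}  FPR = 0.10  TPR = {tpr:.2f}")
```

Under these assumptions, an odds ratio of 3 at a false positive rate of 0.10 corresponds to a true positive rate of only 0.25, so a marker can be clearly associated with outcome and still miss most cases.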
Finally, we argue that classifier development requires some rethinking of traditional data analysis techniques. As an example, we show that fitting a logistic regression model to multiple predictors by maximizing the likelihood function can yield a poor classifier. Instead, we demonstrate that an approach that maximizes an alternative objective function characterizing classification accuracy performs better.
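The contrast between the two fitting criteria can be sketched in code. The abstract does not name the alternative objective function, so the example below assumes the empirical area under the ROC curve (AUC) as one plausible choice. The data are simulated, and the helpers `empirical_auc`, `fit_logistic`, `direction_auc` and `max_auc_direction` are hypothetical names introduced for illustration, not the paper's method.

```python
import numpy as np


def empirical_auc(scores, y):
    """Mann-Whitney estimate of P(case score > control score); ties count 1/2."""
    cases, controls = scores[y == 1], scores[y == 0]
    diff = cases[:, None] - controls[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression (with intercept) by Newton-Raphson."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (y - p)
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def direction_auc(X, y, theta):
    """Empirical AUC of the linear score cos(theta)*x1 + sin(theta)*x2."""
    return empirical_auc(X @ np.array([np.cos(theta), np.sin(theta)]), y)


def max_auc_direction(X, y, extra_angles=(), n_grid=360):
    """Grid search over directions for the largest empirical AUC.

    extra_angles lets the caller add candidates (here, the logistic direction),
    so the search can never do worse than them on the same data.
    """
    grid = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    angles = np.concatenate([grid, np.asarray(extra_angles, dtype=float)])
    aucs = [direction_auc(X, y, t) for t in angles]
    best = int(np.argmax(aucs))
    return float(angles[best]), aucs[best]


# Simulated data (an assumption for illustration): cases have a larger variance
# in the second predictor, so a linear logistic model is misspecified.
rng = np.random.default_rng(0)
n = 200
controls = rng.normal(size=(n, 2))
cases = rng.normal(loc=[0.8, 0.4], scale=[1.0, 2.0], size=(n, 2))
X = np.vstack([controls, cases])
y = np.concatenate([np.zeros(n), np.ones(n)])

beta = fit_logistic(X, y)
theta_ml = float(np.arctan2(beta[2], beta[1]))  # direction of the fitted slopes
auc_ml = direction_auc(X, y, theta_ml)
theta_best, auc_best = max_auc_direction(X, y, extra_angles=[theta_ml])

print(f"logistic MLE direction: AUC = {auc_ml:.3f}")
print(f"max-AUC direction:      AUC = {auc_best:.3f}")
```

Because the logistic direction is included among the candidates, the grid search cannot do worse on the data used for fitting. Whether the gain carries over to new subjects would need to be checked on held-out data, which this sketch does not do.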
