Volume 38, Issue 13
TUTORIAL IN BIOSTATISTICS

Evaluating classification accuracy for modern learning approaches

Jialiang Li

Corresponding Author

E-mail address: stalj@nus.edu.sg

Department of Statistics and Applied Probability, National University of Singapore, Singapore

Duke University‐NUS Graduate Medical School, Singapore

Singapore Eye Research Institute, Singapore


Ming Gao

Department of Mathematics, Shanghai Jiao Tong University, Shanghai, China

Department of Statistics, University of Michigan, Ann Arbor, Michigan

Ralph D'Agostino

Department of Mathematics and Statistics, Boston University, Boston, Massachusetts

First published: 30 January 2019
Citations: 7
Jialiang Li, 6 Science Drive 2, Singapore 117546.

Abstract

Deep learning neural network models such as the multilayer perceptron (MLP) and the convolutional neural network (CNN) are novel and attractive artificial intelligence computing tools. However, methods for evaluating the performance of these models are not yet readily available to practitioners. We provide a tutorial on evaluating classification accuracy for various state‐of‐the‐art learning approaches, including familiar shallow and deep learning methods. For qualitative response variables with more than two categories, many traditional accuracy measures, such as sensitivity, specificity, and the area under the receiver operating characteristic curve, are not directly applicable, and appropriate extensions must be considered. In this paper, we review several important statistical concepts for multicategory classification accuracy and demonstrate their utility for various learning algorithms with real medical examples. We offer problem‐based R code to illustrate how to perform these statistical computations step by step. We expect that such analysis tools will become more familiar to practitioners and find broader application in biostatistics.
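
As a rough illustration of why binary measures need extending when there are more than two categories, the sketch below computes an overall proportion of correct classifications together with one-versus-rest sensitivity and specificity for each class. It is an independent Python sketch, not the R code that accompanies the article, and it is not claimed to match the measures the article reviews. The function names and the toy labels are hypothetical and chosen only for illustration.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count table: rows index the true class, columns the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def multiclass_accuracy(cm):
    """Overall accuracy plus one-vs-rest sensitivity and specificity.

    Assumes every class occurs at least once among the true labels and
    at least once among the "rest", so no denominator is zero.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    sensitivity, specificity = [], []
    for i in range(k):
        tp = cm[i][i]                                # class i, predicted i
        fn = sum(cm[i]) - tp                         # class i, predicted other
        fp = sum(cm[j][i] for j in range(k)) - tp    # other, predicted i
        tn = total - tp - fn - fp                    # other, predicted other
        sensitivity.append(tp / (tp + fn))
        specificity.append(tn / (tn + fp))
    return {
        "overall": correct / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical three-category example (labels coded 0, 1, 2).
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
result = multiclass_accuracy(cm)
print(cm)
print(result)
```

With three categories there is no single sensitivity or specificity: each class yields its own one-versus-rest pair, which is one reason a summary measure designed for the multicategory setting is useful.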

