Keywords:

  • multiclass classifier;
  • classification performance;
  • classification accuracy;
  • Cohen's kappa;
  • AUC;
  • sum of ranking differences

Classification problems are common and important, and the central question is usually which model performs best. Several classification performance indicators, including the classification accuracy value (ACC), Cohen's kappa (KAPPA), and the area under the ROC curve (AUC), are used to answer this question. Non-parametric comparative methods, such as the sum of ranking differences, are also available. The objective of this work was to find the best method for classifying four soft drink samples and four model samples that differ from each other only in sweetener composition. The model samples served as baseline samples for comparison with the commercial soft drinks. Six different classification methods were compared according to their classification performance. A corrected classification accuracy value (corrected ACC), which takes into account the similarities between the classes, was developed and introduced for this purpose. In our case, ACC and KAPPA gave similar results. The best three models according to ACC, KAPPA, and AUC were “K-nearest neighbor,” “random forest,” and “discriminant analysis.” However, the corrected ACC produced a slightly different ranking, in which the random forest model no longer ranked among the good models. The confusion matrices of the models confirmed the ranking given by the corrected ACC. The results showed that K-nearest neighbor was the best classification model for the available samples and that the corrected ACC is a useful classification performance indicator. Copyright © 2012 John Wiley & Sons, Ltd.
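
The abstract names ACC and KAPPA without defining them. As a minimal sketch based on the standard textbook definitions (not the authors' code), both can be computed from a multiclass confusion matrix. The function names and the three-class matrix below are hypothetical and are not data from the study. The corrected ACC is not shown because its formula is not given here.

```python
# Sketch: classification accuracy (ACC) and Cohen's kappa (KAPPA) from a
# confusion matrix whose rows are true classes and columns are predictions.
# All names and numbers are illustrative assumptions, not study data.

def accuracy(cm):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def cohens_kappa(cm):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = sum(cm[i][i] for i in range(k)) / total           # observed agreement
    row_tot = [sum(cm[i]) for i in range(k)]                 # true-class counts
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts
    p_e = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2  # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Hypothetical confusion matrix for three classes, 10 samples each.
example_cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]

print(f"ACC   = {accuracy(example_cm):.3f}")
print(f"KAPPA = {cohens_kappa(example_cm):.3f}")
```

For this made-up matrix, 24 of 30 samples lie on the diagonal, so ACC is 0.8. Chance agreement is 1/3, so KAPPA is 0.7. Both indicators treat every misclassification as equally wrong, which is the limitation that an indicator weighting class similarity is meant to address.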