Keywords:

  • class imbalance;
  • complexity measurement;
  • nearest neighbors;
  • Bayes error

Abstract

We introduce a complexity measure for classification problems that accounts for the deterioration in classifier performance caused by class imbalance. The measure is based on k-nearest neighbors. We explore the choice of k and of the distance metric through a simulation study, and illustrate the use of our measure, together with related data visualization techniques, on real datasets from the literature.
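
The abstract does not define the measure itself, so the following is only a rough illustration of the general idea of a class-wise, k-nearest-neighbor notion of complexity, not the authors' measure. It assumes Euclidean distance, and the function name `knn_class_complexity` and its per-class "fraction of disagreeing neighbors" score are hypothetical choices made for this sketch. Reporting the score per class, rather than pooled over all points, is what lets a small, hard minority class stand out.

```python
import math


def knn_class_complexity(points, labels, k=3):
    """Hypothetical illustration (not the paper's definition).

    For each class, return the average fraction of a point's k nearest
    neighbours (Euclidean distance) that carry a different label.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    totals, counts = {}, {}
    for i in range(n):
        # Distances from point i to every other point; keep the k smallest.
        neighbours = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        # Count neighbours whose label differs from that of point i.
        disagree = sum(labels[j] != labels[i] for _, j in neighbours)
        totals[labels[i]] = totals.get(labels[i], 0.0) + disagree / k
        counts[labels[i]] = counts.get(labels[i], 0) + 1

    # Average within each class so a small class is not swamped by a large one.
    return {c: totals[c] / counts[c] for c in totals}


# Toy imbalanced data: five majority points ("a") around one minority point ("b").
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (0.5, 0.5)]
labels = ["a", "a", "a", "a", "a", "b"]
scores = knn_class_complexity(points, labels, k=3)
```

In this toy example the single minority point is surrounded entirely by majority points, so its class receives the highest possible score, while the majority class scores much lower. A pooled average over all six points would largely hide that difference.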