Measurement of data complexity for classification problems with unbalanced data

Authors


Abstract

We introduce a complexity measure for classification problems that takes account of deterioration in classifier performance as a result of class imbalance. The measure is based on k-nearest neighbors. We explore the choices of k and the distance metric through a simulation study, and illustrate the use of our measure, and related data visualization techniques, with real datasets from the literature.
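The abstract does not define the measure, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes a simple neighbourhood-disagreement score: for each point, the fraction of its k nearest neighbours that carry a different label, averaged within each class and then across classes, so that a small minority class is not swamped by the majority. The function names `knn_class_complexity` and `balanced_complexity` are hypothetical. Both k and the distance function are left as parameters, since these are the choices the abstract says are explored.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_class_complexity(points, labels, k=3, dist=euclidean):
    """Illustrative score (an assumption, not the paper's definition).

    For every point, compute the fraction of its k nearest neighbours
    (excluding the point itself) whose label differs from its own, then
    average those fractions separately within each class.
    Requires at least two points.
    """
    per_class = defaultdict(list)
    for i, (p, y) in enumerate(zip(points, labels)):
        # Leave-one-out: distances from point i to every other point.
        others = [(dist(p, q), labels[j]) for j, q in enumerate(points) if j != i]
        others.sort(key=lambda pair: pair[0])
        neighbours = others[:k]
        disagree = sum(1 for _, lab in neighbours if lab != y) / len(neighbours)
        per_class[y].append(disagree)
    return {c: sum(v) / len(v) for c, v in per_class.items()}


def balanced_complexity(per_class):
    """Unweighted mean of the per-class scores, so each class counts equally."""
    return sum(per_class.values()) / len(per_class)


if __name__ == "__main__":
    # Toy 1-D data: four majority points and one minority point among them.
    points = [(0.0,), (1.0,), (2.0,), (3.0,), (1.4,)]
    labels = [0, 0, 0, 0, 1]
    scores = knn_class_complexity(points, labels, k=1)
    print(scores, balanced_complexity(scores))
```

Averaging per class before combining is the design choice that makes the score sensitive to imbalance: a pooled average over all points would be dominated by the majority class and could hide a minority class that is almost entirely surrounded by other-class neighbours.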
