Keywords:

  • machine learning;
  • data mining;
  • cost-sensitive learning;
  • multi-class problems;
  • rescaling;
  • class-imbalance learning

Rescaling is possibly the most popular approach to cost-sensitive learning. It works by rebalancing the classes according to their costs, and it can be realized in different ways, for example, by re-weighting or resampling the training examples in proportion to their costs, or by moving the decision boundaries of classifiers away from high-cost classes in proportion to the costs. This approach is very effective on two-class problems, yet some studies have shown that it is often much less helpful on multi-class problems. In this article, we explore why the rescaling approach is often ineffective on multi-class problems. Our analysis discloses that rescaling works well when the costs are consistent, whereas directly applying it to multi-class problems with inconsistent costs may not be a good choice. Based on this recognition, we advocate that the consistency of the costs be examined before rescaling is applied. If the costs are consistent, rescaling can be conducted directly; otherwise, it is better to apply rescaling after decomposing the multi-class problem into a series of two-class problems. An empirical study involving 20 multi-class data sets and seven types of cost-sensitive learners validates our proposal. Moreover, we show that the proposal is also helpful for class-imbalance learning.
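The consistency check described above can be sketched in code. The sketch below assumes one common formalization (not quoted from the abstract): with `cost[i][j]` the cost of misclassifying an example of class `i` as class `j`, rescaling uses per-class weights `w` satisfying `w[i] / w[j] == cost[i][j] / cost[j][i]` for every pair; the costs are consistent exactly when such weights exist. The function name `rescaling_weights` and this pairwise-ratio condition are illustrative assumptions.

```python
import itertools
import math

def rescaling_weights(cost, tol=1e-9):
    """Try to find per-class rescaling weights w such that, for every pair
    of classes (i, j), w[i] / w[j] == cost[i][j] / cost[j][i].

    Returns the weight vector if the costs are consistent, else None.
    Assumes off-diagonal costs are positive; cost[i][j] is the cost of
    misclassifying class i as class j. (This pairwise-ratio condition is
    one way to formalize cost consistency, assumed here for illustration.)
    """
    k = len(cost)
    w = [1.0] * k
    # Fix w[0] = 1 and derive the other weights from the ratios with class 0.
    for i in range(1, k):
        w[i] = cost[i][0] / cost[0][i]
    # Consistency check: every remaining pair must agree with these weights.
    for i, j in itertools.combinations(range(1, k), 2):
        if not math.isclose(w[i] / w[j], cost[i][j] / cost[j][i], rel_tol=tol):
            return None  # inconsistent costs: decompose into two-class problems
    return w
```

For a two-class problem the pairwise condition is trivially satisfiable, which matches the observation that rescaling is effective there; with three or more classes the pairwise equations can conflict, and when they do, the decomposition strategy advocated above applies.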