Evolutionary design of decision trees



The decision tree (DT) is one of the most popular symbolic machine learning approaches to classification, with a wide range of applications. DTs are especially attractive in data mining: they have an intuitive representation and are therefore easy to understand and interpret, even by nontechnical experts. The most important and critical aspect of DTs is the process of their construction. Several induction algorithms exist that use the recursive top-down principle to divide training objects into subgroups, based on different statistical measures, in order to achieve homogeneous subgroups. Although these algorithms are robust and fast and generally provide good results, their deterministic and heuristic nature can lead to suboptimal solutions. Therefore, alternative approaches have been developed that try to overcome the drawbacks of classical induction. One of the most viable approaches seems to be the use of evolutionary algorithms, which can produce better DTs because they search for globally optimal solutions and evaluate potential solutions with regard to different criteria. We review the process of evolutionary design of DTs, describing the most common approaches and referring to recognized specializations. The overall process is first explained and then demonstrated in a step-by-step case study using a dataset from the University of California, Irvine (UCI) machine learning repository. © 2012 Wiley Periodicals, Inc.