Volume 18, Issue 12
Research Article

Building classification trees using the total uncertainty criterion

Joaquín Abellán

Corresponding Author

E-mail address: jabemu@teleline.es

Departamento Ciencias de la Computación e Inteligencia Artificial, ETSI Informática, Universidad de Granada, 18071 Granada, Spain

Serafín Moral

E-mail address: smc@decsai.ugr.es

Departamento Ciencias de la Computación e Inteligencia Artificial, ETSI Informática, Universidad de Granada, 18071 Granada, Spain

First published: 02 December 2003
Citations: 67

Abstract

We present an application of the measure of total uncertainty on convex sets of probability distributions, also called credal sets, to the construction of classification trees. In these classification trees, the probabilities of the classes in each leaf are estimated using the imprecise Dirichlet model, so that smaller samples give rise to wider probability intervals. Branching a classification tree can decrease the entropy associated with the classes but, at the same time, as the sample is divided among the branches, the nonspecificity increases. We use a total uncertainty measure (entropy + nonspecificity) as the branching criterion. The stopping rule is that branching must not increase the total uncertainty. The procedure is shown to behave well on standard classification problems. Notably, it does not suffer from overfitting: results on the training and test samples are similar. © 2003 Wiley Periodicals, Inc.
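
The criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the usual imprecise Dirichlet model intervals [n_i/(N+s), (n_i+s)/(N+s)] with hyperparameter s = 1, takes "entropy" to be the maximum Shannon entropy (base 2) over the resulting credal set, takes "nonspecificity" to be (s/(N+s)) log2(k) for k classes, and scores a split by the sample-weighted average of the children's total uncertainty. All function names are hypothetical.

```python
import math

def idm_intervals(counts, s=1.0):
    """Probability interval for each class under the imprecise Dirichlet model."""
    total = sum(counts) + s
    return [(n / total, (n + s) / total) for n in counts]

def max_entropy_distribution(counts, s=1.0):
    """Spread the extra mass s over the smallest counts (water-filling).

    The result is the most uniform distribution compatible with the
    IDM intervals, hence the one with maximum entropy.
    """
    order = sorted(range(len(counts)), key=lambda i: counts[i])
    filled = [float(c) for c in counts]
    remaining = float(s)
    for rank in range(len(order)):
        width = rank + 1                      # classes currently at the lowest level
        level = filled[order[rank]]
        nxt = filled[order[rank + 1]] if rank + 1 < len(order) else math.inf
        need = (nxt - level) * width          # mass needed to reach the next level
        if need >= remaining:
            for j in order[:width]:
                filled[j] = level + remaining / width
            break
        for j in order[:width]:
            filled[j] = nxt
        remaining -= need
    total = sum(counts) + s
    return [f / total for f in filled]

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def nonspecificity(counts, s=1.0):
    """Assumed form: mass s/(N+s) assigned to the whole set of k classes."""
    return s / (sum(counts) + s) * math.log2(len(counts))

def total_uncertainty(counts, s=1.0):
    """Entropy + nonspecificity for the class counts observed at a node."""
    return entropy(max_entropy_distribution(counts, s)) + nonspecificity(counts, s)

def should_branch(parent, children, s=1.0):
    """Branch only if the weighted total uncertainty of the children is lower.

    Weighting by child sample size and using a strict inequality are
    assumptions of this sketch.
    """
    n = sum(parent)
    split_tu = sum(sum(c) / n * total_uncertainty(c, s) for c in children)
    return split_tu < total_uncertainty(parent, s)
```

With class counts [4, 4] at a node, a split into [4, 0] and [0, 4] lowers the total uncertainty and is accepted, whereas a split into [2, 2] and [2, 2] leaves the entropy unchanged, raises the nonspecificity because each branch holds a smaller sample, and is rejected.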

