Building classification trees using the total uncertainty criterion
Abstract
We present an application of the measure of total uncertainty on convex sets of probability distributions, also called credal sets, to the construction of classification trees. In these classification trees, the class probabilities in each leaf are estimated using the imprecise Dirichlet model, so smaller samples give rise to wider probability intervals. Branching a classification tree can decrease the entropy associated with the classes but, at the same time, the nonspecificity increases because the sample is divided among the branches. We use a total uncertainty measure (entropy + nonspecificity) as the branching criterion. The stopping rule is that branching must not increase the total uncertainty. We show that this procedure performs well on standard classification problems. Notably, it does not suffer from overfitting: results on the training and test samples are similar. © 2003 Wiley Periodicals, Inc.
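The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The interval formula is the standard one for the imprecise Dirichlet model with hyperparameter s. The abstract does not give the exact definition of the total uncertainty measure, so the sketch assumes the maximum Shannon entropy over the credal set as a stand-in. The function names, the choice s = 1, and the size-weighted average over branches are all assumptions made for illustration.

```python
import math


def idm_intervals(counts, s=1.0):
    """Lower/upper class probabilities under the imprecise Dirichlet model."""
    total = sum(counts)
    return [(n / (total + s), (n + s) / (total + s)) for n in counts]


def upper_entropy(counts, s=1.0):
    """Maximum Shannon entropy (in bits) over the IDM credal set.

    Assumed stand-in for the paper's total uncertainty measure.
    Water-filling: the extra mass s is poured onto the smallest counts
    until they reach a common level. Expects a non-empty list of counts.
    """
    a = sorted(float(n) for n in counts)
    prefix = 0.0
    for k in range(1, len(a) + 1):
        prefix += a[k - 1]
        level = (prefix + s) / k
        # Stop once the level does not exceed the next larger count.
        if k == len(a) or level <= a[k]:
            break
    filled = [level] * k + a[k:]
    total = sum(filled)
    return -sum((x / total) * math.log2(x / total) for x in filled if x > 0)


def split_reduces_uncertainty(parent, children, s=1.0):
    """True if the size-weighted uncertainty of the branches is lower
    than the uncertainty of the unsplit node (hypothetical stopping check)."""
    n = sum(parent)
    after = sum(sum(c) / n * upper_entropy(c, s) for c in children)
    return after < upper_entropy(parent, s)


# Smaller samples give wider intervals: the width is s / (N + s).
small = idm_intervals([3, 1])      # N = 4
large = idm_intervals([30, 10])    # N = 40
print(small[0][1] - small[0][0], large[0][1] - large[0][0])

# An informative split is accepted; an uninformative one is rejected.
print(split_reduces_uncertainty([10, 10], [[10, 0], [0, 10]]))
print(split_reduces_uncertainty([2, 2], [[1, 1], [1, 1]]))
```

In this sketch a split that separates the classes lowers the uncertainty and is accepted, while a split that only divides the sample without separating the classes is rejected, which mirrors the stopping rule stated in the abstract.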