Abstract

Inspired by the dependency degree γ, a traditional measure in Rough Set Theory, we propose a generalized dependency degree, Γ, between two given sets of attributes. Whereas γ counts only deterministic rules, Γ counts both deterministic and indeterministic rules. We first define Γ in terms of equivalence relations, then interpret it in terms of minimal rules, and describe an algorithm for computing it. To understand Γ better, we investigate its properties and extend it to incomplete information systems. To show its advantages, we compare it with the conditional entropy and with γ in a number of experiments. The experimental results show that a version of C4.5 using Γ runs much faster than the original C4.5R8 using conditional entropy, while its prediction accuracy and tree size remain comparable with those of the original. Moreover, Γ achieves better results than γ on attribute selection. The study shows that the generalized dependency degree is an informative measure for decision trees and for attribute selection.
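
As an illustration of the difference between the two measures, the following minimal Python sketch computes both on a toy decision table. The abstract does not state the formulas, so the sketch rests on assumptions: γ is taken as the standard rough-set quantity |POS_C(D)| / |U|, and Γ is assumed to be the average over all objects x of |C(x) ∩ D(x)| / |C(x)|, where C(x) and D(x) are the equivalence classes of x under the condition and decision attributes. The function names and the toy table are hypothetical and are not taken from the paper.

```python
from collections import Counter
from fractions import Fraction


def _class_counts(rows, cond, dec):
    """Count objects per C-class and per (C-class, D-class) cell."""
    by_c = Counter(tuple(r[a] for a in cond) for r in rows)
    by_cd = Counter(
        (tuple(r[a] for a in cond), tuple(r[a] for a in dec)) for r in rows
    )
    return by_c, by_cd


def gamma_classic(rows, cond, dec):
    """Standard dependency degree: share of objects in the positive region."""
    by_c, by_cd = _class_counts(rows, cond, dec)
    # A C-class belongs to the positive region only if it lies entirely
    # inside one D-class, i.e. it induces a deterministic rule.
    pos = sum(n for (c, _d), n in by_cd.items() if n == by_c[c])
    return Fraction(pos, len(rows))


def gamma_generalized(rows, cond, dec):
    """Assumed form of the generalized degree (see the lead-in)."""
    by_c, by_cd = _class_counts(rows, cond, dec)
    # Each of the n objects in a (c, d) cell contributes n / |C-class|,
    # so indeterministic rules also add to the total.
    total = sum(Fraction(n * n, by_c[c]) for (c, _d), n in by_cd.items())
    return total / len(rows)


# Hypothetical toy table: attribute "a" determines "d" for a=0 but not for a=1.
rows = [
    {"a": 0, "d": "yes"},
    {"a": 0, "d": "yes"},
    {"a": 1, "d": "yes"},
    {"a": 1, "d": "no"},
]

print(gamma_classic(rows, ["a"], ["d"]))      # 1/2
print(gamma_generalized(rows, ["a"], ["d"]))  # 3/4
```

On this table the class a=1 yields only indeterministic rules, so it contributes nothing to γ but still contributes to Γ under the assumed definition. Exact fractions are used to avoid floating-point noise in such small examples.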