Mining Top-Rank-k Erasable Itemsets by PID_lists
Article first published online: 14 JAN 2013
© 2013 Wiley Periodicals, Inc.
International Journal of Intelligent Systems
Volume 28, Issue 4, pages 366–379, April 2013
How to Cite
Deng, Z. (2013), Mining Top-Rank-k Erasable Itemsets by PID_lists. Int. J. Intell. Syst., 28: 366–379. doi: 10.1002/int.21580
- Issue published online: 14 FEB 2013
- National High Technology Research and Development Program of China. Grant Number: 2009AA01Z136
- National Natural Science Foundation of China. Grant Number: 90812001
Mining erasable itemsets is one of the newly emerging data mining tasks. In this paper, we present a new data representation called a PID_list, which keeps track of the id_nums (identification numbers) of the products that include an itemset. On the basis of the PID_list, we propose a new algorithm called VM for mining top-rank-k erasable itemsets efficiently. The VM algorithm avoids both the time-consuming process of calculating the gain of candidate itemsets and repeated scans of the database, and therefore greatly accelerates the mining task. To evaluate the VM algorithm, we conducted experiments on six synthetic product databases. Our performance study shows that the VM algorithm is efficient and much faster than the MIKE algorithm, which is the first algorithm for the problem of mining top-rank-k erasable itemsets.
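The abstract does not spell out the PID_list layout or the steps of VM, so the following is only a minimal sketch of the general idea, not the published algorithm. It assumes the usual definition from the erasable itemset literature: the gain of an itemset is the total profit of the products that contain at least one of its items, and a smaller gain ranks higher. The toy database and the names `PRODUCTS`, `build_pid_lists`, `merge_pid_lists`, `gain`, and `top_rank_k` are hypothetical, and the enumeration is brute force with none of VM's pruning.

```python
from itertools import combinations

# Hypothetical toy product database: product id -> (component items, profit).
PRODUCTS = {
    1: ({"a", "b"}, 50),
    2: ({"b", "c"}, 20),
    3: ({"c"}, 30),
    4: ({"a", "d"}, 10),
}


def build_pid_lists(products):
    """One database scan: map each item to {product id: profit} of the products using it."""
    pid_lists = {}
    for pid, (items, profit) in products.items():
        for item in items:
            pid_lists.setdefault(item, {})[pid] = profit
    return pid_lists


def merge_pid_lists(left, right):
    """PID list of a combined itemset (assumed): union of both lists, one entry per product."""
    merged = dict(left)
    merged.update(right)
    return merged


def gain(pid_list):
    """Gain read directly from the PID list, without rescanning the database."""
    return sum(pid_list.values())


def top_rank_k(products, k, max_size=2):
    """Brute-force illustration: itemsets whose gain is among the k smallest distinct gains."""
    single = build_pid_lists(products)
    gains = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(sorted(single), size):
            pid_list = {}
            for item in itemset:
                pid_list = merge_pid_lists(pid_list, single[item])
            gains[itemset] = gain(pid_list)
    kept = sorted(set(gains.values()))[:k]
    return sorted((g, s) for s, g in gains.items() if g in kept)


if __name__ == "__main__":
    for g, itemset in top_rank_k(PRODUCTS, k=2):
        print(itemset, g)
```

The point of the sketch is that, once the per-item lists are built in a single pass, the gain of any larger itemset follows from merging lists rather than from another scan of the product database, which is the kind of saving the abstract attributes to the PID_list.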