HASHMAX: A NEW METHOD FOR MINING MAXIMAL FREQUENT ITEMSETS
Natalia Vanetik, Ehud Gudes
2011
Abstract
Mining maximal frequent itemsets is a fundamental problem in many data mining applications, especially for dense data, where the search space is exponential. We propose HashMax, a top-down algorithm that employs hashing techniques to generate maximal frequent itemsets efficiently. An empirical comparison with Genmax, a state-of-the-art algorithm for generating maximal frequent itemsets, shows the advantage of HashMax on dense datasets with a large number of maximal frequent itemsets.
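To make the problem statement concrete, the following is a minimal brute-force sketch of what "maximal frequent itemset" means: an itemset is frequent if at least a given number of transactions contain it, and maximal if no proper superset is also frequent. This is not the HashMax algorithm, whose hashing scheme is not described in the abstract; it only borrows the top-down ordering (largest candidates first). The function name `maximal_frequent_itemsets`, the parameter `min_support` (taken here as an absolute transaction count), and the sample data are assumptions made for illustration.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of maximal frequent itemsets.

    Illustrative only: exponential in the number of distinct items,
    and not the HashMax method from the paper.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    maximal = []
    # Visit candidates from largest to smallest. Every maximal set larger
    # than the current candidate has already been found, so a frequent
    # candidate not contained in any of them must itself be maximal.
    for size in range(len(items), 0, -1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            if any(cand <= m for m in maximal):
                continue  # contained in a known maximal set
            # Support = number of transactions containing the candidate.
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                maximal.append(cand)
    return maximal


# Hypothetical toy database of four transactions.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
result = maximal_frequent_itemsets(sample, min_support=2)
```

With this toy input, `{a, b}` and `{a, c}` each occur in two transactions and have no frequent superset, so they are the maximal frequent itemsets. The exponential candidate space visible in the nested loops is the cost that dedicated algorithms aim to avoid on dense data.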
References
- Agarwal, R., Aggarwal, C., and Prasad, V. (2000). Depth first generation of long patterns. In ACM SIGKDD Conf.
- Bayardo, R. J. (1998). Efficiently mining long patterns from databases. In ACM SIGMOD Conf. on Management of Data, pages 85-93.
- Burdick, D., Calimlim, M., and Gehrke, J. (2001). Mafia, a maximal frequent itemset algorithm for transactional databases. In IEEE Intl. Conf. on Data Engineering, pages 443-452.
- Genmax (2011). Genmax implementation. http://www.cs.rpi.edu/~zaki/wwwnew/pmwiki.php/Software.
- Gouda, K. and Zaki, M. J. (2005). Genmax: An efficient algorithm for mining maximal frequent itemsets. Data Mining and Knowledge Discovery 11(3), pages 223-242.
- Han, J., Pei, J., and Yin, Y. (2000). Mining frequent patterns without candidate generation. In ACM SIGMOD Conf. on Management of Data, pages 1-12.
- Hu, T., Sung, S. Y., Xiong, H., and Fu, Q. (2008). Discovery of maximum length frequent itemsets. Inf. Sci. 178(1), pages 69-87.
- Lin, D.-I. and Kedem, Z. M. (1998). Pincer search: A new algorithm for discovering the maximum frequent set. In EDBT, pages 105-119.
- UCI (2011). UCI machine learning data repository. http://archive.ics.uci.edu/ml/index.html.
- Yang, G. (2004). The complexity of mining maximal frequent itemsets and maximal frequent patterns. In KDD, pages 344-353.
- Zaki, M. J., Parthasarathy, S., Ogihara, M., and Li, W. (1997). New algorithms for fast discovery of association rules. In Third Intl. Conf. on Knowledge Discovery in Databases and Data Mining, pages 283-286.
Paper Citation
in Harvard Style
Vanetik N. and Gudes E. (2011). HASHMAX: A NEW METHOD FOR MINING MAXIMAL FREQUENT ITEMSETS. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 132-137. DOI: 10.5220/0003628101400145
in Bibtex Style
@conference{kdir11,
author={Natalia Vanetik and Ehud Gudes},
title={HASHMAX: A NEW METHOD FOR MINING MAXIMAL FREQUENT ITEMSETS},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={132-137},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003628101400145},
isbn={978-989-8425-79-9},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - HASHMAX: A NEW METHOD FOR MINING MAXIMAL FREQUENT ITEMSETS
SN - 978-989-8425-79-9
AU - Vanetik N.
AU - Gudes E.
PY - 2011
SP - 132
EP - 137
DO - 10.5220/0003628101400145