DISCOVERING THE STABLE CLUSTERS BETWEEN INTERESTINGNESS MEASURES

Xuan-Hiep Huynh, Fabrice Guillet, Henri Briand

2006

Abstract

In this paper, which deals with the post-processing of association rules, we study the correlations between 36 interestingness measures (IMs) in order to better understand their behavior on data and, ultimately, to help the data miner choose the best IMs. We used two datasets with opposite characteristics, from which we extracted two rulesets of about 100,000 rules, together with the two subsets of the 1,000 best rules according to the IMs. The study of the correlations between IMs using PAM (partitioning around medoids) and AHC (agglomerative hierarchical clustering) reveals unexpected stabilities across the four rulesets; more precisely, eight stable clusters of IMs are found and described.
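The general idea of clustering interestingness measures by their correlation can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes four well-known measures chosen for illustration (support, confidence, lift, leverage), a hypothetical toy ruleset given as counts, a dissimilarity of 1 - |r| based on Pearson's correlation, and average-linkage agglomerative clustering cut at an arbitrary threshold. PAM is not shown.

```python
from math import sqrt

# Hypothetical toy ruleset: each rule a -> b is given by the counts
# (n, n_a, n_b, n_ab). These numbers are illustrative, not from the paper.
RULES = [
    (1000, 200, 300, 150),
    (1000, 500, 400, 250),
    (1000, 100, 600, 90),
    (1000, 300, 300, 60),
    (1000, 400, 700, 320),
    (1000, 250, 150, 30),
]

# Four standard measures, picked only for illustration.
MEASURES = {
    "support": lambda n, na, nb, nab: nab / n,
    "confidence": lambda n, na, nb, nab: nab / na,
    "lift": lambda n, na, nb, nab: n * nab / (na * nb),
    "leverage": lambda n, na, nb, nab: nab / n - (na * nb) / (n * n),
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def dissimilarities(rules, measures):
    """Pairwise dissimilarity 1 - |r| between measures (an assumed choice)."""
    values = {name: [f(*r) for r in rules] for name, f in measures.items()}
    names = list(values)
    return {
        a: {b: 1.0 - abs(pearson(values[a], values[b])) for b in names}
        for a in names
    }


def average_linkage(dist, threshold):
    """Merge the closest pair of clusters until the gap exceeds threshold."""
    clusters = [[name] for name in dist]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist[a][b] for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    dist = dissimilarities(RULES, MEASURES)
    for cluster in average_linkage(dist, threshold=0.2):
        print(sorted(cluster))
```

Running the same procedure on several rulesets and comparing the resulting groups is one way to look for clusters that remain stable across datasets. The threshold of 0.2 is an arbitrary value for this sketch.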

Paper Citation


in Harvard Style

Huynh X., Guillet F. and Briand H. (2006). DISCOVERING THE STABLE CLUSTERS BETWEEN INTERESTINGNESS MEASURES. In Proceedings of the Eighth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-972-8865-42-9, pages 196-201. DOI: 10.5220/0002493701960201


in Bibtex Style

@conference{iceis06,
author={Xuan-Hiep Huynh and Fabrice Guillet and Henri Briand},
title={DISCOVERING THE STABLE CLUSTERS BETWEEN INTERESTINGNESS MEASURES},
booktitle={Proceedings of the Eighth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2006},
pages={196-201},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002493701960201},
isbn={978-972-8865-42-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Eighth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - DISCOVERING THE STABLE CLUSTERS BETWEEN INTERESTINGNESS MEASURES
SN - 978-972-8865-42-9
AU - Huynh X.
AU - Guillet F.
AU - Briand H.
PY - 2006
SP - 196
EP - 201
DO - 10.5220/0002493701960201