SELECTIVELY LEARNING CLUSTERS IN MULTI-EAC

André Lourenço, Ana Fred

Abstract

The Multiple-Criteria Evidence Accumulation Clustering (Multi-EAC) method is a clustering ensemble approach with an integrated cluster stability criterion, used to selectively learn the similarity from a collection of different clustering algorithms. In this work we analyze the original Multi-EAC criterion in the context of classical relative validation criteria and propose alternative cluster validation indices for selecting clusters based on pairwise similarities. Taking several clustering ensemble construction strategies as context, we compare the adequacy of each criterion and provide guidelines for its application. Experimental results on benchmark data sets illustrate the proposed concepts.
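
To make the idea of selectively learned pairwise similarity concrete, the following is a minimal sketch of evidence accumulation over a clustering ensemble, in which each cluster votes for the pairs of points it contains and a cluster may be excluded from voting. It is an illustration under stated assumptions, not the authors' implementation: the function name `co_association`, the `keep_cluster` predicate, the toy partitions, the minimum-size rule, and the normalization by ensemble size are all hypothetical choices made here. The actual Multi-EAC stability criterion and the validation indices proposed in the paper are not reproduced.

```python
import numpy as np


def co_association(partitions, keep_cluster=None):
    """Accumulate pairwise co-occurrence votes over an ensemble of partitions.

    partitions   : list of 1-D integer label sequences, all of the same length n.
    keep_cluster : hypothetical predicate (array of member indices -> bool).
                   Clusters for which it returns False cast no votes.
                   None keeps every cluster (plain evidence accumulation).
    Returns an n x n matrix of votes divided by the number of partitions
    (an assumed normalization).
    """
    n = len(partitions[0])
    votes = np.zeros((n, n))
    for labels in partitions:
        labels = np.asarray(labels)
        for c in np.unique(labels):
            members = np.flatnonzero(labels == c)
            if keep_cluster is not None and not keep_cluster(members):
                continue  # this cluster was not selected
            # every pair of points inside a selected cluster gets one vote
            votes[np.ix_(members, members)] += 1
    return votes / len(partitions)


# Toy ensemble: three partitions of four points (made-up labels).
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]

C_all = co_association(partitions)

# Stand-in selection rule (an assumption): ignore singleton clusters.
C_sel = co_association(partitions, keep_cluster=lambda m: len(m) >= 2)
```

Here `C_all[i, j]` is the fraction of partitions that place points `i` and `j` in the same cluster, while `C_sel` counts only votes from clusters that pass the stand-in rule. Swapping in a different predicate is where a cluster validation index would plug into this sketch.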

Paper Citation


in Harvard Style

Lourenço A. and Fred A. (2010). SELECTIVELY LEARNING CLUSTERS IN MULTI-EAC. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 491-499. DOI: 10.5220/0003099904910499


in Bibtex Style

@conference{kdir10,
author={André Lourenço and Ana Fred},
title={SELECTIVELY LEARNING CLUSTERS IN MULTI-EAC},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={491-499},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003099904910499},
isbn={978-989-8425-28-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - SELECTIVELY LEARNING CLUSTERS IN MULTI-EAC
SN - 978-989-8425-28-7
AU - Lourenço A.
AU - Fred A.
PY - 2010
SP - 491
EP - 499
DO - 10.5220/0003099904910499