KERNEL OVERLAPPING K-MEANS FOR CLUSTERING IN FEATURE SPACE
Chiheb-Eddine Ben N'Cir, Nadia Essoussi, Patrice Bertrand
2010
Abstract
Producing overlapping schemes is a major issue in clustering. Recent overlapping methods search for optimal clusters and rely on different metrics, such as Euclidean distance and I-divergence, to measure closeness between observations. In this paper, we propose the use of kernel methods to look for separation between clusters in a high-dimensional feature space. To detect non-linearly separable clusters, we propose a Kernel Overlapping k-Means algorithm (KOKM) that uses a kernel-induced distance measure. The number of overlapping clusters is estimated using the Gram matrix. Experiments on different datasets show that the number of clusters is estimated correctly and that KOKM gives better results than overlapping k-means.
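To make the notion of a kernel-induced distance concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a Gaussian (RBF) kernel, which the abstract does not specify; the names `rbf_kernel`, `kernel_distance_sq`, and the bandwidth `sigma` are illustrative. The squared distance between the feature-space images of two observations is obtained from kernel evaluations alone, as K(x, x) - 2 K(x, y) + K(y, y), without computing the feature map explicitly.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel. Assumed choice; sigma is a hypothetical bandwidth."""
    sq_euclid = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_euclid / (2.0 * sigma ** 2))


def kernel_distance_sq(x, y, kernel=rbf_kernel):
    """Squared feature-space distance ||phi(x) - phi(y)||^2 via the kernel trick.

    Expands to K(x, x) - 2 K(x, y) + K(y, y), so phi is never evaluated.
    """
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


if __name__ == "__main__":
    a, b = (0.0, 0.0), (1.0, 0.0)
    print(kernel_distance_sq(a, a))  # identical points
    print(kernel_distance_sq(a, b))  # distinct points
```

With an RBF kernel, K(x, x) = 1 for every x, so the induced squared distance simplifies to 2 - 2 K(x, y) and always lies between 0 and 2. Any other positive-definite kernel can be passed through the `kernel` argument.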
Paper Citation
in Harvard Style
Ben N'Cir C., Essoussi N. and Bertrand P. (2010). KERNEL OVERLAPPING K-MEANS FOR CLUSTERING IN FEATURE SPACE. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 250-256. DOI: 10.5220/0003095102500256
in Bibtex Style
@conference{kdir10,
author={Chiheb-Eddine Ben N'Cir and Nadia Essoussi and Patrice Bertrand},
title={KERNEL OVERLAPPING K-MEANS FOR CLUSTERING IN FEATURE SPACE},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={250-256},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003095102500256},
isbn={978-989-8425-28-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - KERNEL OVERLAPPING K-MEANS FOR CLUSTERING IN FEATURE SPACE
SN - 978-989-8425-28-7
AU - Ben N'Cir C.
AU - Essoussi N.
AU - Bertrand P.
PY - 2010
SP - 250
EP - 256
DO - 10.5220/0003095102500256