A NON-PARAMETERISED HIERARCHICAL POLE-BASED CLUSTERING ALGORITHM (HPOBC)

Amparo Albalate, Steffen Rhinow, David Suendermann

Abstract

In this paper we propose a hierarchical, divisive clustering algorithm called Hierarchical Pole-Based Clustering (HPoBC), which finds the clusters in a data set without any user-supplied parameter such as the number of clusters k. The algorithm is based on Pole-Based Overlapping Clustering (PoBOC) (Cleuziou et al., 2004). Initially, the top hierarchy level is composed of the set of clusters that PoBOC discovers on the data set. Each cluster is then analysed again, using a combination of PoBOC and a cluster validity method (silhouettes), to search for possible subclusters. This process is repeated recursively on each newly retrieved cluster until the silhouette score indicates that the cluster should not be partitioned further. HPoBC has been compared with the original PoBOC and with other classical hierarchical approaches on five two-dimensional synthetic data sets, using three cluster evaluation metrics.
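The recursive scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. PoBOC itself is not reproduced here: the hypothetical helper `two_pole_split` stands in for it by assigning each point to the nearer of the two most distant points. The abstract does not state how the silhouette score decides when to stop, so the `threshold` and `min_size` arguments are assumptions made only for illustration, and the sketch applies the same check at every level, including the first.

```python
import math

def silhouette(points, labels):
    """Mean silhouette width (Rousseeuw, 1987) of a labelled point set."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def two_pole_split(points):
    """Hypothetical stand-in for PoBOC (NOT the real algorithm):
    use the two most distant points as poles and assign each point
    to the nearer pole."""
    p1, p2 = max(((a, b) for a in points for b in points),
                 key=lambda pair: math.dist(*pair))
    return [0 if math.dist(p, p1) <= math.dist(p, p2) else 1 for p in points]

def divisive(points, split=two_pole_split, threshold=0.5, min_size=4):
    """Recursively split a cluster while the silhouette of the proposed
    split reaches `threshold` (an assumed stopping rule)."""
    if len(points) < min_size:
        return [points]
    labels = split(points)
    if len(set(labels)) < 2 or silhouette(points, labels) < threshold:
        return [points]  # the split is not convincing: keep the cluster whole
    result = []
    for l in sorted(set(labels)):
        sub = [p for p, pl in zip(points, labels) if pl == l]
        result.extend(divisive(sub, split, threshold, min_size))
    return result

# Two well-separated 2-D blobs of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
clusters = divisive(points)
print(len(clusters), [len(c) for c in clusters])
```

On this toy input the first split separates the two blobs with a high silhouette score, while the proposed split of each blob scores below the assumed threshold, so the recursion stops with two clusters of four points each.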

References

  1. Boley, D., Gini, M., Gross, R., Han, E.-H., Karypis, G., Kumar, V., Mobasher, B., Moore, J., and Hastings, K. (1999). Partitioning-based clustering for web document categorization. Decis. Support Syst., 27(3):329-341.
  2. Cleuziou, G., Martin, L., Clavier, L., and Vrain, C. (2004). PoBOC: An overlapping clustering algorithm, application to rule-based classification and textual data. In Proceedings of the 16th European Conference on Artificial Intelligence ECAI.
  3. Everitt, B. (1974). Cluster Analysis. Heinemann Educ., London.
  4. Jolion, J.-M. and Rosenfeld, A. (1989). Cluster detection in background noise. Pattern Recogn., 22(5):603-607.
  5. Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.
  6. Rousseeuw, P. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal Comp. Appl. Math., 20:53-65.
  7. Treeck, B. (2005). Entwicklung und Evaluierung einer Java-Schnittstelle zur Clusteranalyse von Peer-to-Peer Netzwerken. Bachelorarbeit. Heinrich-Heine-Universität Düsseldorf.
  8. Wu, J., Chen, J., Xiong, H., and Xie, M. (2009). External validation measures for k-means clustering: A data distribution perspective. Expert Syst. Appl., 36(3):6050-6061.


Paper Citation


in Harvard Style

Albalate A., Rhinow S. and Suendermann D. (2010). A NON-PARAMETERISED HIERARCHICAL POLE-BASED CLUSTERING ALGORITHM (HPOBC). In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-674-021-4, pages 350-356. DOI: 10.5220/0002735003500356


in Bibtex Style

@conference{icaart10,
author={Amparo Albalate and Steffen Rhinow and David Suendermann},
title={A NON-PARAMETERISED HIERARCHICAL POLE-BASED CLUSTERING ALGORITHM (HPOBC)},
booktitle={Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2010},
pages={350-356},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002735003500356},
isbn={978-989-674-021-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - A NON-PARAMETERISED HIERARCHICAL POLE-BASED CLUSTERING ALGORITHM (HPOBC)
SN - 978-989-674-021-4
AU - Albalate A.
AU - Rhinow S.
AU - Suendermann D.
PY - 2010
SP - 350
EP - 356
DO - 10.5220/0002735003500356