Authors: Amparo Albalate (1); Steffen Rhinow (1) and David Suendermann (2)
Affiliations: (1) University of Ulm, Germany; (2) SpeechCycle Labs, United States
Keyword(s): Divisive clustering, PoBOC, Silhouette width.
Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Computational Intelligence; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Evolutionary Computing; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems
Abstract:
In this paper we propose a hierarchical, divisive clustering algorithm, called Hierarchical Pole Based Clustering (HPoBC), which finds the clusters in a data set without requiring any user-supplied parameter, such as the number of clusters k. The algorithm is based on Pole Based Overlapping Clustering (PoBOC) (Cleuziou et al., 2004). Initially, the top hierarchy level is composed of the set of clusters discovered by the PoBOC algorithm on the data set. Each cluster is then analysed again using a combination of PoBOC and a cluster validity method (silhouettes) to search for possible subclusters. This process is repeated recursively on each newly retrieved cluster until the silhouette score indicates that the cluster should not be partitioned further. The HPoBC algorithm has been compared with the original PoBOC as well as with other classical hierarchical approaches on five two-dimensional synthetic data sets, using three cluster evaluation metrics.
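
The recursive scheme described in the abstract (split a cluster, check the split with silhouettes, recurse only if the split is good) can be illustrated with a minimal Python sketch. This is not the authors' implementation. PoBOC itself is not reproduced here: the function farthest_pair_split is a hypothetical stand-in that divides a cluster in two around its two most distant points. The fixed cut-off threshold=0.5 is likewise an assumption made only for this illustration; the abstract does not state the exact stopping rule, and HPoBC is described as requiring no user parameter. The sketch also applies the silhouette check at the top level, whereas the abstract takes the initial PoBOC clusters as given.

```python
import math

def mean_silhouette(clusters):
    """Average silhouette width over all points of a candidate partition."""
    if len(clusters) < 2:
        return 0.0
    widths = []
    for i, own in enumerate(clusters):
        for p in own:
            if len(own) == 1:
                widths.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for j, other in enumerate(clusters) if j != i)
            denom = max(a, b)
            widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)

def farthest_pair_split(points):
    """Hypothetical stand-in for PoBOC (NOT the real algorithm):
    split the points around the two most distant points."""
    if len(points) < 2:
        return [points]
    pairs = ((a, b) for i, a in enumerate(points) for b in points[i + 1:])
    p, q = max(pairs, key=lambda ab: math.dist(*ab))
    if math.dist(p, q) == 0:
        return [points]  # all points coincide, nothing to split
    left = [x for x in points if math.dist(x, p) <= math.dist(x, q)]
    right = [x for x in points if math.dist(x, p) > math.dist(x, q)]
    return [left, right]

def divisive_clustering(points, split=farthest_pair_split, threshold=0.5):
    """Recursively split a cluster while the silhouette score supports it.
    The threshold is an assumption of this sketch, not taken from the paper."""
    parts = split(points)
    if len(parts) < 2 or mean_silhouette(parts) < threshold:
        return [points]  # stop: keep this cluster as a leaf
    leaves = []
    for part in parts:
        leaves.extend(divisive_clustering(part, split, threshold))
    return leaves

# Two compact, well-separated groups of four 2-D points each.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11)]
clusters = divisive_clustering(data)
```

On this toy data the first split separates the two groups with a high silhouette score, while any further split of either group scores poorly, so the recursion stops there. Passing a different split function is the intended way to plug in another base clustering method.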