Authors:
Ahmed Rafea; Ahmed El Kholy and Sherif G. Aly
Affiliation:
American University in Cairo, Egypt
Keyword(s):
Clustering, Bisecting K-means algorithm, Social network, Discussion groups.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Sensor Networks; Signal Processing; Soft Computing
Abstract:
This paper proposes applying the bisecting K-means algorithm to cluster social network discussion groups and to provide a meaningful label for each resulting cluster. The clustering is based on the heterogeneous meta-features that define each group, e.g. title, description, type, sub-type and network. The main idea is to represent each group as a tuple of multiple feature vectors, construct a proper similarity measure for each feature space, and then perform the clustering using the proposed bisecting K-means algorithm. The main key phrases are extracted from the titles and descriptions of the discussion groups in a given cluster and combined with the main meta-features to build a phrase label for the cluster. Analysis of the experimental results showed that combining more than one feature produced better clustering in terms of quality and interrelationship between the discussion groups of a given cluster. Some features, such as the network, improved the compactness and tightness of the objects within the clusters, while others, such as the type and sub-type, improved the separation of the clusters.
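
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The feature weights, the use of cosine similarity in every feature space, the policy of always bisecting the largest cluster, and the word-frequency labelling (a crude stand-in for key-phrase extraction) are all assumptions made for illustration, and the sample groups are invented.

```python
import math
import random
from collections import Counter

# Assumed weights; the abstract names the features but gives neither
# weights nor the exact per-feature similarity measures.
WEIGHTS = {"title": 0.3, "description": 0.2,
           "type": 0.2, "subtype": 0.1, "network": 0.2}
TEXT_FEATURES = ("title", "description")

def vectorize(group):
    """Represent a group as one sparse vector per feature space."""
    vecs = {}
    for name in WEIGHTS:
        if name in TEXT_FEATURES:
            vecs[name] = Counter(group[name].lower().split())  # bag of words
        else:
            vecs[name] = Counter({group[name]: 1})  # one-hot category
    return vecs

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similarity(x, y):
    """Weighted sum of the per-feature cosine similarities."""
    return sum(w * cosine(x[f], y[f]) for f, w in WEIGHTS.items())

def centroid(vectors):
    """Mean vector in each feature space."""
    center = {}
    for f in WEIGHTS:
        total = Counter()
        for v in vectors:
            total.update(v[f])
        center[f] = {k: c / len(vectors) for k, c in total.items()}
    return center

def two_means(items, rng, iters=10):
    """Split a list of (group, vectors) pairs into two parts with 2-means."""
    centers = [vecs for _, vecs in rng.sample(items, 2)]
    parts = [items[:1], items[1:]]  # fallback if no valid split is found
    for _ in range(iters):
        new_parts = [[], []]
        for item in items:
            s0 = similarity(item[1], centers[0])
            s1 = similarity(item[1], centers[1])
            new_parts[0 if s0 >= s1 else 1].append(item)
        if not new_parts[0] or not new_parts[1]:
            break  # degenerate split: keep the last valid one
        parts = new_parts
        centers = [centroid([v for _, v in p]) for p in parts]
    return parts

def bisecting_kmeans(groups, k, seed=0):
    rng = random.Random(seed)
    clusters = [[(g, vectorize(g)) for g in groups]]
    while len(clusters) < k:
        clusters.sort(key=len)
        largest = clusters.pop()  # assumed policy: bisect the largest cluster
        if len(largest) < 2:
            clusters.append(largest)
            break
        clusters.extend(two_means(largest, rng))
    return [[g for g, _ in c] for c in clusters]

def label(cluster, top=2):
    """Most common type plus the most frequent title/description words."""
    words = Counter(w for g in cluster
                    for w in (g["title"] + " " + g["description"]).lower().split())
    kind = Counter(g["type"] for g in cluster).most_common(1)[0][0]
    return kind + ": " + " ".join(w for w, _ in words.most_common(top))

# Invented sample data, for illustration only.
FIELDS = ("title", "description", "type", "subtype", "network")
rows = [
    ("football fans", "match talk", "Sports", "Football", "Cairo"),
    ("python coders", "code help", "Technology", "Programming", "Boston"),
    ("football league", "match scores", "Sports", "Football", "Cairo"),
    ("python beginners", "code review", "Technology", "Programming", "Boston"),
    ("football training", "match tips", "Sports", "Football", "Cairo"),
    ("python data", "code snippets", "Technology", "Programming", "Boston"),
]
groups = [dict(zip(FIELDS, r)) for r in rows]
clusters = bisecting_kmeans(groups, k=2)
for c in clusters:
    print(label(c), len(c))
```

Encoding the categorical features as one-hot vectors lets every feature space share the same centroid and cosine machinery, so the per-feature similarities can be combined by a simple weighted sum. Raising or zeroing an entry in `WEIGHTS` is one way to explore how a single feature, such as the network or the type, affects the resulting clusters.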