Authors: Noufa Alnajran ; Keeley Crockett ; David McLean and Annabel Latham

Affiliation: Manchester Metropolitan University, United Kingdom

ISBN: 978-989-758-220-2

ISSN: 2184-433X

Keyword(s): Clustering, Social Network Analysis, Twitter, Data Mining, Machine Learning.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Data Mining ; Databases and Information Systems Integration ; Enterprise Information Systems ; Sensor Networks ; Signal Processing ; Soft Computing

Abstract: Twitter, a microblogging online social network (OSN), has quickly gained prominence as it provides people with the opportunity to communicate and share posts and topics. Tremendous value lies in automatically analysing and reasoning about such data in order to derive meaningful insights, which carries potential opportunities for businesses, users, and consumers. However, the sheer volume, noise, and dynamism of Twitter impose challenges that hinder the efficacy of observing clusters with high intra-cluster (i.e. minimum variance) and low inter-cluster similarities. This review focuses on research that has used various clustering algorithms to analyse Twitter data streams and identify hidden patterns in tweets, where text is highly unstructured. This paper performs a comparative analysis of unsupervised learning approaches in order to determine whether empirical findings support the enhancement of decision support and pattern recognition applications. A review of the literature identified 13 studies that implemented different clustering methods. A comparison covering clustering methods, algorithms, number of clusters, dataset size, distance measure, clustering features, evaluation methods, and results was conducted. The conclusion reports that the use of unsupervised learning in mining social media data has several weaknesses. Success criteria and future directions for research and practice are discussed for the research community.

