Unsupervised Topic Extraction from Twitter: A Feature-pivot Approach

Nada GabAllah, Ahmed Rafea

Abstract

Extracting topics from textual data has been an active area of research with many applications in daily life. Digital content grows every day and has recently become the main source of information in all domains. Organizing and categorizing related topics in this data is crucial to getting the most benefit from this massive amount of information. In this paper we present a feature-pivot based approach to extracting topics from tweets. The approach is applied to a Twitter dataset in Egyptian dialect covering four different domains. We compare our results to those of a document-pivot based approach and investigate which approach performs better at extracting the topics in the underlying datasets. By applying a t-test to the recall, precision, and F1 measure values of both approaches on datasets from different domains, we confirmed our hypothesis that the feature-pivot approach performs better at extracting topics from Egyptian dialect tweets in the datasets in question.
