Authors:
Elias de Oliveira; Henrique Gomes Basoni; Marcos Rodrigues Saúde and Patrick Marques Ciarelli
Affiliation:
Universidade Federal do Espírito Santo, Brazil
Keyword(s):
Text Classification, Social Network, Text Mining.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Clustering and Classification Methods; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Mining Text and Semi-Structured Data; Process Mining; Symbolic Systems
Abstract:
The classification problem has gained new importance with the growing value attributed to
social media such as Twitter. The huge number of short documents to be organized into
subjects challenges the resources and techniques that have been used so far. Furthermore, today
more than ever, personalization is the most important feature that a system needs to exhibit. The goal of many
online systems, which are available in many areas, is to address the needs or desires of each individual user. To
achieve this goal, these systems need to be more flexible and faster in order to adapt to the user’s needs. In this
work, we explore a variety of techniques with the aim of better classifying a large Twitter data set according to a
user’s goal. We propose a methodology in which we cascade an unsupervised technique followed by a supervised one.
For the unsupervised technique we use standard clustering algorithms, and for the supervised technique we
propose the use of a kNN algorithm and a Centroid-Based Classifier to perform the experiments. The results
are promising because we reduced the amount of work to be done by the specialists and, in addition, we
reproduced the human assessment decisions with an F1-measure of 0.7907.
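The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes bag-of-words vectors with cosine similarity, uses a simple single-pass clustering as a stand-in for the "standard clustering algorithms", and assumes that a specialist labels one tweet per cluster and that this label is propagated to the rest of the cluster before the kNN and centroid-based classifiers are applied. All function names, the threshold, and the toy tweets are hypothetical.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def cluster(vectors, threshold=0.3):
    """Unsupervised step: single-pass clustering (a stand-in, not the
    paper's algorithm). Returns a list of clusters of document indices."""
    clusters = []
    for i, v in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(v, centroid([vectors[j] for j in members]))
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])  # no cluster is close enough: open a new one
        else:
            best.append(i)
    return clusters


def train_centroids(vectors, labels):
    """Supervised step A: one centroid per class."""
    by_label = defaultdict(list)
    for v, y in zip(vectors, labels):
        by_label[y].append(v)
    return {y: centroid(vs) for y, vs in by_label.items()}


def predict_centroid(model, v):
    """Assign the class whose centroid is most similar to v."""
    return max(model, key=lambda y: cosine(v, model[y]))


def predict_knn(vectors, labels, v, k=3):
    """Supervised step B: majority vote among the k most similar documents."""
    order = sorted(range(len(vectors)),
                   key=lambda i: cosine(v, vectors[i]), reverse=True)
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]


# Toy data, invented for illustration only.
tweets = [
    "goal match football team",
    "football team wins match",
    "election vote president",
    "president wins election vote",
]
vectors = [vectorize(t) for t in tweets]
clusters = cluster(vectors)

# Assumed workflow: the specialist labels only the first tweet of each
# cluster, and that label is copied to the other members.
specialist = {0: "sports", 2: "politics"}
labels = [None] * len(tweets)
for members in clusters:
    for i in members:
        labels[i] = specialist[members[0]]

model = train_centroids(vectors, labels)
```

In this sketch the specialist labels two tweets instead of four, which is the kind of saving the abstract refers to. New tweets can then be classified with `predict_centroid(model, vectorize(text))` or `predict_knn(vectors, labels, vectorize(text))`.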