Author:
Ulli Waltinger
Affiliation:
Text Technology, Bielefeld University, Germany
Keyword(s):
Machine learning, Support vector machine, Sentiment analysis, Polarity identification, Subjectivity resources.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Soft Computing; Symbolic Systems; Web Mining
Abstract:
This paper presents an empirical study on machine learning-based sentiment analysis. Though polarity classification has been extensively studied at different document-structure levels (e.g. document, sentence, word), little work has investigated feature selection methods and subjectivity resources. We systematically analyze four different English subjectivity resources for the task of sentiment polarity identification. While the results show that dictionary size clearly correlates with polarity-based feature coverage, it does not correlate with classification accuracy. Polarity-based feature selection that considers a minimum number of prior polarity features, combined with SVM-based machine learning, exhibits the best performance (acc=84.1, f1=83.9) in comparison to classical approaches to polarity identification. Based on the findings of the English-based experimental setup, a new German subjectivity resource is proposed for the task of German-based sentiment analysis. The experimental results show, with f1=85.9, its good adaptability to the new domain.
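As a rough illustration of polarity-based feature selection, the following minimal sketch keeps only the tokens that appear in a subjectivity lexicon and discards documents with too few of them. It is not the paper's implementation: the lexicon `TOY_LEXICON`, the function `polarity_features`, and the threshold `min_features` are hypothetical names chosen for this example, and reading "minimum amount of prior polarity features" as a per-document threshold is an assumption.

```python
from collections import Counter

# Hypothetical toy lexicon mapping words to a prior polarity (+1 / -1).
# A real subjectivity resource would contain thousands of entries.
TOY_LEXICON = {
    "good": 1, "great": 1, "excellent": 1,
    "bad": -1, "poor": -1, "awful": -1,
}


def polarity_features(tokens, lexicon, min_features=2):
    """Keep only tokens that carry a prior polarity in the lexicon.

    Returns a bag-of-words Counter over the retained tokens, or None
    when fewer than `min_features` polarity tokens are found
    (an assumed reading of the minimum-feature constraint).
    """
    kept = [t.lower() for t in tokens if t.lower() in lexicon]
    if len(kept) < min_features:
        return None
    return Counter(kept)


review = "The plot was great and the acting excellent but the ending was bad".split()
features = polarity_features(review, TOY_LEXICON)
```

The resulting counts could then be vectorized and passed to an SVM classifier; that step is omitted here to keep the sketch free of external dependencies.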