Unsupervised Twitter Sentiment Classification

Mihaela Dinsoreanu, Andrei Bacu

Abstract

Sentiment classification is not a new topic, but data sources with different characteristics require customized methods that exploit the latent semantics while minimizing noise and irrelevant information. Twitter is a huge pool of data with specific features. We therefore propose an unsupervised, domain-independent approach to sentiment classification on Twitter. The proposed approach integrates NLP techniques, Word Sense Disambiguation, and unsupervised rule-based classification. The method differentiates between positive, negative, and objective (neutral) polarities for every word, given the context in which it occurs. The overall tweet polarity is then decided by our proposed rule-based classifier. We performed a comparative evaluation of our method on four public datasets specialized for this task; the experimental results are very good relative to other state-of-the-art methods, considering that our classifier does not use any training corpus.
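
To make the word-to-tweet aggregation step concrete, the following is a minimal Python sketch under stated assumptions. It is not the classifier proposed in the paper: the lexicon `TOY_LEXICON`, its scores, the `NEGATIONS` set, the `threshold` parameter, and the single negation rule are all hypothetical stand-ins. In a real pipeline the per-word scores would come from a sentiment lexicon after word sense disambiguation has selected one sense per word in context.

```python
from typing import Dict, List, Tuple

# Hypothetical (positive, negative) scores for already-disambiguated words.
# These values are illustrative only and are not taken from any real lexicon.
TOY_LEXICON: Dict[str, Tuple[float, float]] = {
    "good": (0.75, 0.0),
    "love": (0.625, 0.0),
    "bad": (0.0, 0.625),
    "terrible": (0.0, 0.875),
}

# Hypothetical negation cues; the paper's actual rules are not reproduced here.
NEGATIONS = {"not", "no", "never"}


def word_polarity(word: str) -> float:
    """Return a signed score: > 0 positive, < 0 negative, 0 objective."""
    pos, neg = TOY_LEXICON.get(word, (0.0, 0.0))
    return pos - neg


def classify_tweet(tokens: List[str], threshold: float = 0.1) -> str:
    """Aggregate word polarities into a tweet-level label with simple rules."""
    total = 0.0
    negate = False
    for token in tokens:
        token = token.lower()
        if token in NEGATIONS:
            # Assumed rule: a negation flips the next sentiment-bearing word.
            negate = True
            continue
        score = word_polarity(token)
        if score != 0.0:
            total += -score if negate else score
            negate = False
    if total > threshold:
        return "positive"
    if total < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ("i love this phone", "this is not good", "the meeting is at noon"):
        print(text, "->", classify_tweet(text.split()))
```

No training corpus is involved in this sketch: the label follows only from lexicon scores and hand-written rules, which is the sense in which such a classifier is unsupervised.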


Paper Citation


in Harvard Style

Dinsoreanu M. and Bacu A. (2014). Unsupervised Twitter Sentiment Classification. In Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014), ISBN 978-989-758-050-5, pages 220-227. DOI: 10.5220/0005079002200227


in Bibtex Style

@conference{kmis14,
author={Mihaela Dinsoreanu and Andrei Bacu},
title={Unsupervised Twitter Sentiment Classification},
booktitle={Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014)},
year={2014},
pages={220-227},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005079002200227},
isbn={978-989-758-050-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014)
TI - Unsupervised Twitter Sentiment Classification
SN - 978-989-758-050-5
AU - Dinsoreanu M.
AU - Bacu A.
PY - 2014
SP - 220
EP - 227
DO - 10.5220/0005079002200227