Lexicon Expansion System for Domain and Time Oriented Sentiment Analysis

Nuno Guimaraes, Luis Torgo, Alvaro Figueira

Abstract

In sentiment analysis, the polarity of a text is often assessed by resorting to sentiment lexicons, which usually consist of verbs and adjectives with an associated positive or negative value. However, in short informal texts like tweets or web comments, the absence of such words does not necessarily indicate that the text lacks opinion. Tweets like “First Paris, now Brussels... What can we do?” imply opinion even though they use no words present in sentiment lexicons; the opinion arises instead from the general sentiment or public opinion associated with terms in a specific time and domain. In order to complement general sentiment dictionaries with those domain- and time-specific terms, we propose a novel system for lexicon expansion that automatically extracts the most relevant and up-to-date terms in several different domains and then assesses their sentiment through Twitter. Experimental results on our system show 82% accuracy in extracting domain- and time-specific terms and 80% in correct polarity assessment. The achieved results provide evidence that our lexicon expansion system can extract and determine the sentiment of terms for domain- and time-specific corpora fully automatically.
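
As a rough illustration of the idea of assessing a term's sentiment through tweets, the sketch below assigns each candidate term the mean score of the tweets that mention it, where a tweet's score comes from a small general-purpose seed lexicon. This is only a minimal sketch under stated assumptions, not the system described in the paper: the seed lexicon, the tokenizer, the averaging rule, and the names `SEED_LEXICON`, `tokenize`, and `expand_lexicon` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical seed lexicon (assumption); a real system would load a
# general sentiment dictionary instead of this toy mapping.
SEED_LEXICON = {"sad": -1.0, "terrible": -1.0, "afraid": -0.5,
                "love": 1.0, "great": 1.0}


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def expand_lexicon(tweets, candidate_terms, seed=SEED_LEXICON):
    """Score each candidate term by the mean seed-lexicon score of the
    tweets that mention it. Tweets without any seed word are skipped,
    so a term that only appears in such tweets gets no entry."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tweet in tweets:
        tokens = tokenize(tweet)
        scored = [seed[t] for t in tokens if t in seed]
        if not scored:
            continue  # no sentiment evidence in this tweet
        tweet_score = sum(scored) / len(scored)
        for term in candidate_terms:
            if term in tokens:
                totals[term] += tweet_score
                counts[term] += 1
    return {term: totals[term] / counts[term] for term in counts}


tweets = [
    "So sad about Brussels",
    "Brussels attacks are terrible",
    "I love Paris in spring",
]
expanded = expand_lexicon(tweets, {"brussels", "paris"})
```

In this toy input, "brussels" only co-occurs with negative seed words and "paris" with a positive one, so the expanded entries take opposite signs. Because the scores depend entirely on which tweets are collected, rerunning the same procedure on tweets from a different period or domain can yield a different polarity for the same term.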

Paper Citation


in Harvard Style

Guimaraes N., Torgo L. and Figueira A. (2016). Lexicon Expansion System for Domain and Time Oriented Sentiment Analysis. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 463-471. DOI: 10.5220/0006081704630471


in Bibtex Style

@conference{kdir16,
author={Nuno Guimaraes and Luis Torgo and Alvaro Figueira},
title={Lexicon Expansion System for Domain and Time Oriented Sentiment Analysis},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={463-471},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006081704630471},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - Lexicon Expansion System for Domain and Time Oriented Sentiment Analysis
SN - 978-989-758-203-5
AU - Guimaraes N.
AU - Torgo L.
AU - Figueira A.
PY - 2016
SP - 463
EP - 471
DO - 10.5220/0006081704630471