Hybrid Sentiment Analyser for Arabic Tweets using R

Sarah Alhumoud, Tarfa Albuhairi, Wejdan Alohaideb


Extracting meaning from rapidly growing volumes of data can be of great value to organizations, and Twitter is one of the largest public, freely available data sources. This paper presents a hybrid learning implementation of sentiment analysis that combines lexicon-based and supervised approaches. The analyser processes Arabic tweets written in the Saudi dialect to extract sentiment toward a specific topic. It was evaluated on a dataset of 3000 tweets collected in three domains. The results obtained indicate that the hybrid learning approach outperforms both the supervised and the unsupervised approaches.
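The abstract does not say how the lexicon and supervised components are combined, so the following is only a minimal sketch of one common hybrid design, not the authors' pipeline: a polarity lexicon labels the tweets it can score, a small Naive Bayes classifier is trained on those labels, and the classifier handles the tweets the lexicon leaves unscored. The lexicon entries, the English toy tweets, and the names `LEXICON`, `lexicon_label`, `NaiveBayes`, and `hybrid_classify` are all hypothetical. The paper itself works in R on Arabic text; Python is used here purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy polarity lexicon (hypothetical entries, not the paper's lexicon).
LEXICON = {"great": 1, "love": 1, "excellent": 1,
           "bad": -1, "slow": -1, "terrible": -1}


def lexicon_label(tokens):
    """Unsupervised step: sum word polarities; None if the lexicon is silent."""
    score = sum(LEXICON.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (supervised step)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            logp = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                if tok in self.vocab:  # skip words never seen in training
                    logp += math.log((self.word_counts[label][tok] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


def hybrid_classify(tweets):
    """Label with the lexicon where possible; fall back to the classifier."""
    tokenised = [tweet.lower().split() for tweet in tweets]
    seeds = [lexicon_label(tokens) for tokens in tokenised]
    train = [(t, s) for t, s in zip(tokenised, seeds) if s is not None]
    model = NaiveBayes().fit([t for t, _ in train], [s for _, s in train])
    return [s if s is not None else model.predict(t)
            for t, s in zip(tokenised, seeds)]


tweets = [
    "great service love it",
    "service was excellent today",
    "terrible delay bad support",
    "slow delay again",
    "delay support",  # no lexicon word: decided by the classifier
]
labels = hybrid_classify(tweets)
```

In this toy run the last tweet contains no lexicon word, so its label comes from word statistics the classifier learned from the lexicon-labelled tweets. That fallback is the property a hybrid scheme of this kind is meant to provide.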



Paper Citation

in Harvard Style

Alhumoud S., Albuhairi T. and Alohaideb W. (2015). Hybrid Sentiment Analyser for Arabic Tweets using R. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015), ISBN 978-989-758-158-8, pages 417-424. DOI: 10.5220/0005616204170424
