Authors:
Mehtab Alam Syed (1); Elena Arsevska (2); Mathieu Roche (1) and Maguelonne Teisseire (3)
Affiliations:
(1) CIRAD, UMR TETIS, Montpellier, France; (2) CIRAD, UMR ASTRE, Montpellier, France; (3) INRAE, UMR TETIS, Montpellier, France
Keyword(s):
Text Mining, Sentiment Analysis, Feature Selection, Twitter.
Abstract:
In the first quarter of 2020, the World Health Organization (WHO) declared COVID-19 a global public health emergency. Users from all over the world shared their opinions about COVID-19 on social media platforms such as Twitter and Facebook. At the beginning of the pandemic, it became relevant to assess public opinion regarding COVID-19 using data available on social media. We used a recently proposed hierarchy-based measure for tweet analysis (H-TFIDF) to extract features for sentiment classification of tweets. We assessed how H-TFIDF, and the concatenation of H-TFIDF with bidirectional encoder representations from transformers (BH-TFIDF), perform against state-of-the-art bag-of-words (BOW) and term frequency-inverse document frequency (TF-IDF) features for sentiment classification of COVID-19 tweets. A uniform experimental setup with a 90%/10% training-test split was used to train the classifier. Moreover, evaluation was performed on a gold-standard, expert-labeled dataset to measure precision for each class of the binary classification.
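The baseline features and the evaluation protocol named in the abstract can be illustrated with a minimal sketch. It assumes classic BOW and TF-IDF only; it does not implement H-TFIDF or the BERT concatenation (BH-TFIDF), whose definitions are not given here. All function names and the toy tokens are hypothetical and chosen for illustration.

```python
import math
import random
from collections import Counter


def bow(doc):
    """Bag-of-words: raw term counts for one tokenized document."""
    return Counter(doc)


def tfidf(docs):
    """Classic TF-IDF (assumption: tf * log(N / df)); not H-TFIDF."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]


def split_90_10(items, seed=0):
    """Shuffle, then split into 90% training and 10% test items."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = len(items) * 9 // 10
    return items[:cut], items[cut:]


def precision_per_class(gold, pred):
    """Precision for each predicted label: TP / (TP + FP)."""
    result = {}
    for label in set(pred):
        gold_for_label = [g for g, p in zip(gold, pred) if p == label]
        result[label] = sum(g == label for g in gold_for_label) / len(gold_for_label)
    return result


# Toy, made-up tokenized tweets (illustrative only).
docs = [["covid", "vaccine", "hope"], ["covid", "lockdown", "fear"]]
features = tfidf(docs)
train, test = split_90_10(range(100))
scores = precision_per_class(
    ["pos", "neg", "pos", "neg"],  # gold labels
    ["pos", "pos", "pos", "neg"],  # predicted labels
)
```

In this sketch a term that appears in every document (here "covid") receives a TF-IDF weight of zero, which is the usual motivation for preferring TF-IDF over raw counts when ranking discriminative terms.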