Authors:
Vítor Bernardes (1) and Álvaro Figueira (2)
Affiliations:
(1) Faculty of Sciences, University of Porto, Rua do Campo Alegre, Porto, Portugal; (2) CRACS / INESCTEC, University of Porto, Porto, Portugal
Keyword(s):
Fake News, Social Media, Machine Learning, NLP.
Abstract:
The recent proliferation of so-called “fake news” content, assisted by the widespread use of social media platforms and with serious real-world impacts, makes it imperative to find ways to mitigate this problem. In this paper we propose a machine learning-based approach to tackle it by automatically identifying tweets associated with questionable content, using newly collected data from Twitter about the 2020 U.S. presidential election. To create a sizable annotated data set, we use an automatic labeling process based on the factual reporting level, as classified by human experts, of the links contained in tweets. We derive relevant features from that data and investigate the specific contribution of features derived from named entity and emotion recognition techniques, including a novel approach using sequences of prevalent emotions. We conclude the paper by evaluating and comparing the performance of several machine learning models on different test sets, and show that they are applicable to addressing the issue of fake news dissemination.
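The automatic labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rating table, the example domains, the level names, and the rule for combining several links in one tweet are all assumptions made for illustration. The idea shown is only that a tweet inherits a label from the expert-assigned factual reporting level of the sites it links to.

```python
from urllib.parse import urlparse

# Hypothetical expert ratings: domain -> factual reporting level.
# Both the domains and the level names are placeholders, not data from the paper.
FACTUAL_LEVEL = {
    "reliable-news.example": "high",
    "dubious-site.example": "low",
}

# Assumed grouping of levels into the two classes of interest.
QUESTIONABLE_LEVELS = {"low", "very low"}
RELIABLE_LEVELS = {"high", "very high"}


def domain_of(url):
    """Return the lowercased host of a URL, without port or a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def label_tweet(urls):
    """Label a tweet from the rated domains among its links.

    Returns "questionable" if any rated link has a low factual level,
    "reliable" if every rated link has a high factual level,
    and None if no link is rated or the ratings are inconclusive.
    """
    levels = [FACTUAL_LEVEL[d] for d in map(domain_of, urls) if d in FACTUAL_LEVEL]
    if not levels:
        return None
    if any(level in QUESTIONABLE_LEVELS for level in levels):
        return "questionable"
    if all(level in RELIABLE_LEVELS for level in levels):
        return "reliable"
    return None
```

Tweets for which `label_tweet` returns `None` would simply be left out of the annotated set in this sketch; how the paper treats unrated or mixed-rating tweets is not stated in the abstract.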