Authors:
Parnian Kassraie (1); Alireza Modirshanechi (1) and Hamid K. Aghajan (2)
Affiliations:
(1) Sharif University of Technology, Iran, Islamic Republic of
(2) Sharif University of Technology and University of Gent, Iran, Islamic Republic of
Keyword(s):
Social Media Text Mining, Sentiment Analysis, Google Trends, Twitter, Election Prediction, Gaussian Process Regression.
Abstract:
Online social content is commonly used to analyze political events. Twitter data by itself, however, is not necessarily a representative sample of society because participation is non-uniform, and this must be accounted for when predicting real-world events from social media trends. Moreover, each tweet may bear a positive or negative sentiment towards its subject, which also needs to be taken into account. By gathering a large dataset of more than 370,000 tweets on the 2016 US elections and carefully validating the resulting key trends against Google Trends, we create a legitimate dataset. A Gaussian process regression model is then used to predict the election outcome; the novel idea is to estimate the candidates’ vote shares rather than directly anticipate the winner of the election, as other approaches do. Applied to the 2016 US elections, this method predicted Clinton’s majority in the popular vote at the beginning of election week with 1% error. The high variance in Trump supporters’ behavior reported elsewhere is reflected in the higher error rate of his vote share.