For example, by using the words ‘Senate’ or ‘Prime
Minister’. Thus, the set of keywords could be
extended according to the desired application.
The method used in this research also lacks a
technique for processing replies. One solution would be
to link each reply to the original tweet while keeping
the two texts separate. This can be very useful when
studying how messages spread in social networks.
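The reply-linking idea can be illustrated with a minimal sketch. It assumes that each collected tweet carries an identifier and, for replies, the identifier of the tweet it answers. The `Tweet` class, the `in_reply_to` field and the `link_replies` function are hypothetical names chosen for illustration and are not part of the method described in this paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tweet:
    # Hypothetical minimal record; field names are assumptions.
    tweet_id: str
    text: str
    in_reply_to: Optional[str] = None  # id of the tweet being answered, if any


def link_replies(tweets):
    """Pair each reply with its original tweet, keeping both texts separate."""
    by_id = {t.tweet_id: t for t in tweets}
    pairs = []
    for t in tweets:
        if t.in_reply_to is None:
            continue  # not a reply
        original = by_id.get(t.in_reply_to)
        pairs.append({
            "reply_id": t.tweet_id,
            "reply_text": t.text,
            "original_id": t.in_reply_to,
            # The original may be absent from the collected sample.
            "original_text": original.text if original else None,
        })
    return pairs


tweets = [
    Tweet("1", "De Eerste Kamer stemt vandaag over de wet."),
    Tweet("2", "Helemaal mee eens!", in_reply_to="1"),
    Tweet("3", "Waar gaat dit over?", in_reply_to="99"),
]
pairs = link_replies(tweets)
```

Keeping the reply and the original as separate fields would allow a classifier to be applied to either text on its own, or to both, depending on the study.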
Finally, the machine learning models could be
tweaked further to optimize the results. In this
process, called hyperparameter optimization, the model
settings are adjusted to the dataset. Future work
will focus on tuning the parameters of the models.
We also aim to use the classifier in other works
related to social network analysis of political
positions and social contagion of political
opinions in networks.
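As an illustration of what hyperparameter optimization involves, the sketch below runs an exhaustive grid search over a deliberately simple stand-in classifier. The keyword-counting classifier, the parameter names `keywords` and `min_hits`, and the four labelled sentences are all assumptions made for this example. They are not the models or the data used in this research, and in practice the candidate settings would be compared with cross-validation on a much larger labelled set.

```python
from itertools import product


def classify(text, keywords, min_hits):
    """Toy stand-in: a text is political if it contains >= min_hits keywords."""
    tokens = text.lower().split()
    hits = sum(1 for tok in tokens if tok in keywords)
    return hits >= min_hits


def accuracy(data, keywords, min_hits):
    """Fraction of (text, label) pairs that the toy classifier gets right."""
    correct = sum(classify(text, keywords, min_hits) == label
                  for text, label in data)
    return correct / len(data)


def grid_search(validation, grid):
    """Score every combination of settings and return the best one."""
    best_params, best_score = None, -1.0
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = accuracy(validation, **params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical validation set: (text, is_political).
validation = [
    ("de senaat stemt over de wet", True),
    ("de premier spreekt in de kamer", True),
    ("lekker weer vandaag", False),
    ("de kamer is koud vandaag", False),
]

# Candidate settings to try (illustrative values only).
grid = {
    "keywords": [
        frozenset({"senaat", "premier"}),
        frozenset({"senaat", "premier", "kamer"}),
    ],
    "min_hits": [1, 2],
}

best_params, best_score = grid_search(validation, grid)
```

The same loop structure applies to any model: only `classify` and the contents of `grid` would change.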
ACKNOWLEDGEMENTS
E.F.M. Araújo’s stay at the VU University Amsterdam
is funded by the Brazilian Science without Borders
Program, through a fellowship given by the
Coordination for the Improvement of Higher Education
Personnel, CAPES (reference 13538-13-6).
Detecting Dutch Political Tweets: A Classifier based on Voting System using Supervised Learning