
into political orientation with a very high accuracy.  
In particular, we show that using only the Facebook 
party pages, which are publicly available, as the 
training set yielded the most accurate classification 
of the individual Facebook pages. This result can be 
explained by the relatively high resemblance between 
the most characteristic features of the private and 
party Facebook pages. In both corpora, the right wing is 
characterized by references to religion and 
patriotism, as well as first-person pronouns, while 
the left wing is characterized by references to 
protests and third-person pronouns. This result is 
significant because it suggests that inherently tagged 
data alone, such as party pages, can be used to 
classify non-political pages. This removes the need to 
gather personal pages already labeled for political 
orientation as training examples. 
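
The cross-corpus setup described above can be illustrated with a minimal sketch: a model is trained only on inherently labeled party-page text and then applied to unlabeled personal-page text. The sketch assumes a simple word-count Naive Bayes model as a stand-in and does not reproduce the classifier or features used in this study. The example strings, the names `party_pages` and `personal_pages`, and the functions `train` and `classify` are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data (invented for illustration, not taken from the corpora).
# Party pages carry their label inherently; personal pages are unlabeled.
party_pages = [
    ("we pray for our homeland and our faith", "right"),
    ("our nation our flag we stand proud", "right"),
    ("they march in the protest against the government", "left"),
    ("the demonstration showed they demand change", "left"),
]
personal_pages = [
    "i love my homeland and we pray",
    "they joined the protest yesterday",
]


def train(labeled_docs):
    """Collect per-label word counts and document counts (Naive Bayes stand-in)."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability under the model."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)


# Train on party pages only, then label the personal pages.
model = train(party_pages)
predictions = [classify(page, model) for page in personal_pages]
print(predictions)
```

In this toy setting the personal pages are labeled without any hand-tagged personal examples, which is the practical saving the paragraph above points to.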
Newspapers are commonly assumed to be neutral and 
objective; however, the general population seemingly 
associates each newspaper with a certain political 
orientation. In this research, we were able to confirm 
the general consensus regarding the newspapers' 
political orientation by applying the classifier we 
built from the corpora of party pages and 
parliamentary speeches. 