WHAT MAKES US CLICK? - Modelling and Predicting the Appeal of News Articles

Elena Hensinger, Ilias Flaounas, Nello Cristianini

Abstract

We model readers’ preferences for online news and use these models to compare news outlets with each other. The models are based on linear scoring functions and are inferred from aggregate behavioural information about readers’ click choices on the textual content of six news outlets over one year. We generate one model per outlet. Although their accuracy is limited by the information available, these models are shown to predict readers’ click choices and to remain stable over time. We use the six audience preference models in several ways: to compare how the preferences of different outlets’ audiences relate to each other; to score news topics by their appeal to readers; to rank a large number of other news outlets by the appeal of their content to all six audiences; and to explain this appeal measure by relating it to other metrics. We find that UK tabloids and the website of “People” magazine contain more appealing content for all audiences than broadsheet newspapers, news aggregators and newswires, and that this measure of readers’ preferences correlates with a measure of linguistic subjectivity at the outlet level.
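As an illustration of what a linear scoring function learned from click choices can look like, the following is a minimal sketch, not the authors' implementation. It assumes that click data has already been turned into pairs of (clicked article, non-clicked article) feature vectors, and it fits the weights with simple hinge-loss updates. The function names, the three toy features and the training data are all hypothetical.

```python
import random


def score(w, x):
    """Linear appeal score: dot product of weight vector and feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise(pairs, dim, epochs=50, lr=0.1, seed=0):
    """Fit w so that score(w, clicked) exceeds score(w, skipped) by a margin of 1.

    Assumption: each training item is a (clicked, skipped) pair of feature
    vectors. Whenever a pair violates the margin, w is moved along the
    difference vector (a perceptron-style step on the pairwise hinge loss).
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for clicked, skipped in pairs:
            diff = [c - s for c, s in zip(clicked, skipped)]
            if score(w, diff) < 1.0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


# Hypothetical toy features: [celebrity, crime, finance]
pairs = [
    ([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]),
    ([0.0, 1.0, 0.0], [0.0, 0.0, 1.0]),
    ([1.0, 1.0, 0.0], [0.0, 1.0, 1.0]),
]
w = train_pairwise(pairs, dim=3)

# Rank unseen articles by predicted appeal, most appealing first.
candidates = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
ranked = sorted(candidates, key=lambda x: score(w, x), reverse=True)
```

Once such a weight vector exists for an audience, any article can be scored by it, which is what makes it possible to compare outlets or topics by how appealing their content is to that audience.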



Paper Citation


in Harvard Style

Hensinger E., Flaounas I. and Cristianini N. (2012). WHAT MAKES US CLICK? - Modelling and Predicting the Appeal of News Articles. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM, ISBN 978-989-8425-99-7, pages 41-50. DOI: 10.5220/0003728000410050


in Bibtex Style

@conference{icpram12,
author={Elena Hensinger and Ilias Flaounas and Nello Cristianini},
title={WHAT MAKES US CLICK? - Modelling and Predicting the Appeal of News Articles},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},
year={2012},
pages={41-50},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003728000410050},
isbn={978-989-8425-99-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM
TI - WHAT MAKES US CLICK? - Modelling and Predicting the Appeal of News Articles
SN - 978-989-8425-99-7
AU - Hensinger E.
AU - Flaounas I.
AU - Cristianini N.
PY - 2012
SP - 41
EP - 50
DO - 10.5220/0003728000410050