Probabilistic Sequence Modeling for Recommender Systems

Nicola Barbieri, Antonio Bevacqua, Marco Carnuccio, Giuseppe Manco, Ettore Ritacco

Abstract

Probabilistic topic models are widely used in different contexts to uncover the hidden structure in large text corpora. One of the main features of these models is that the generative process follows a bag-of-words assumption, i.e., each token is independent of the previous one. We extend the popular Latent Dirichlet Allocation model by exploiting a conditional Markovian assumption, where the token generation depends on the current topic and on the previous token. The resulting model is capable of accommodating temporal correlations among tokens, which better model user behavior. This is particularly significant in a collaborative filtering context, where the choice of a user can be exploited for recommendation purposes, and hence a more realistic and accurate modeling enables better recommendations. For this model we present a fast Gibbs sampling procedure for parameter estimation. A thorough experimental evaluation over real-world data shows the performance advantages, in terms of recall and precision, of the proposed sequence-modeling approach.
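The conditional Markovian assumption described in the abstract can be illustrated with a small generative sketch. This is not the authors' implementation: the abstract only states that each token depends on the current topic and on the previous token, so everything else here is an assumption made for illustration. In particular, the per-token topic is drawn independently from a per-user mixture `theta`, the first token uses a separate distribution `phi_first` (since it has no predecessor), and the names `phi_next`, `alpha`, and `beta` and their values are hypothetical.

```python
import random


def sample_dirichlet(rng, dim, concentration):
    """Draw from a symmetric Dirichlet by normalizing Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(rng, probs):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_sequence(rng, n_tokens, theta, phi_first, phi_next):
    """Generate one user's token (item) sequence.

    theta:     topic mixture for the user, length K
    phi_first: phi_first[k] is a distribution over the first token (assumption)
    phi_next:  phi_next[k][prev] is a distribution over the next token,
               conditioned on topic k and on the previous token prev
    """
    tokens = []
    prev = None
    for _ in range(n_tokens):
        topic = sample_categorical(rng, theta)
        # Bag-of-words LDA would use a single distribution per topic here;
        # the Markovian variant also conditions on the previous token.
        dist = phi_first[topic] if prev is None else phi_next[topic][prev]
        prev = sample_categorical(rng, dist)
        tokens.append(prev)
    return tokens


rng = random.Random(0)
K, V = 3, 5              # number of topics and vocabulary (item) size, illustrative
alpha, beta = 0.5, 0.5   # hypothetical symmetric Dirichlet hyperparameters

theta = sample_dirichlet(rng, K, alpha)
phi_first = [sample_dirichlet(rng, V, beta) for _ in range(K)]
phi_next = [[sample_dirichlet(rng, V, beta) for _ in range(V)] for _ in range(K)]

seq = generate_sequence(rng, 10, theta, phi_first, phi_next)
print(seq)
```

Under these assumptions the only difference from the bag-of-words process is the indexing of the token distribution by `prev`, which multiplies the number of token distributions by the vocabulary size. Fitting such a model (for example with Gibbs sampling, as the abstract mentions) is outside the scope of this sketch.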



Paper Citation


in Harvard Style

Barbieri N., Bevacqua A., Carnuccio M., Manco G. and Ritacco E. (2012). Probabilistic Sequence Modeling for Recommender Systems. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012), ISBN 978-989-8565-29-7, pages 75-84. DOI: 10.5220/0004140700750084


in Bibtex Style

@conference{kdir12,
author={Nicola Barbieri and Antonio Bevacqua and Marco Carnuccio and Giuseppe Manco and Ettore Ritacco},
title={Probabilistic Sequence Modeling for Recommender Systems},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012)},
year={2012},
pages={75-84},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004140700750084},
isbn={978-989-8565-29-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012)
TI - Probabilistic Sequence Modeling for Recommender Systems
SN - 978-989-8565-29-7
AU - Barbieri N.
AU - Bevacqua A.
AU - Carnuccio M.
AU - Manco G.
AU - Ritacco E.
PY - 2012
SP - 75
EP - 84
DO - 10.5220/0004140700750084