CHARACTERIZING RELATIONSHIPS THROUGH CO-CLUSTERING - A Probabilistic Approach

Nicola Barbieri, Gianni Costa, Giuseppe Manco, Ettore Ritacco

2011

Abstract

In this paper we propose a probabilistic co-clustering approach for pattern discovery in collaborative filtering data. We extend the Block Mixture Model to learn the structures and relationships within preference data. The resulting model simultaneously clusters users into communities and items into categories. Beyond its predictive capabilities, the model supports the discovery of significant knowledge patterns, such as common trends and relationships between items and users within communities and categories. We reformulate the mathematical model and implement a parameter estimation technique. We then show how the model parameters enable two pattern discovery tasks: (i) inferring topics for each item category and characteristic items for each user community; (ii) modeling community interests and transitions among topics. Experiments on MovieLens data provide evidence of the effectiveness of the proposed approach.
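
As a rough illustration of the block structure the abstract describes, the sketch below estimates one rating distribution per (user community, item category) block from observed ratings. It is not the authors' model or estimation technique: the community and category assignments are fixed by hand rather than learned, and the function names, the toy ratings, and the smoothing constant `alpha` are all assumptions made for this example.

```python
from collections import defaultdict


def block_rating_distributions(ratings, user_community, item_category,
                               rating_values, alpha=1.0):
    """Estimate P(rating | community, category) for every observed block.

    ratings: iterable of (user, item, rating) triples.
    user_community / item_category: hand-fixed hard assignments (assumption).
    alpha: additive smoothing constant (assumption, not from the paper).
    """
    counts = defaultdict(lambda: defaultdict(float))
    for user, item, rating in ratings:
        block = (user_community[user], item_category[item])
        counts[block][rating] += 1.0

    distributions = {}
    for block, block_counts in counts.items():
        total = sum(block_counts.values()) + alpha * len(rating_values)
        distributions[block] = {
            value: (block_counts.get(value, 0.0) + alpha) / total
            for value in rating_values
        }
    return distributions


def expected_rating(distribution):
    """Mean rating under one block's rating distribution."""
    return sum(value * prob for value, prob in distribution.items())


# Hypothetical toy data: four users, four items, ratings on a 1..5 scale.
toy_ratings = [
    ("u1", "i1", 5), ("u1", "i2", 4), ("u2", "i1", 5), ("u2", "i3", 1),
    ("u3", "i3", 5), ("u3", "i4", 4), ("u4", "i1", 1), ("u4", "i4", 5),
]
toy_user_community = {"u1": 0, "u2": 0, "u3": 1, "u4": 1}
toy_item_category = {"i1": 0, "i2": 0, "i3": 1, "i4": 1}

toy_blocks = block_rating_distributions(
    toy_ratings, toy_user_community, toy_item_category, rating_values=range(1, 6)
)
for block in sorted(toy_blocks):
    print(block, round(expected_rating(toy_blocks[block]), 3))
```

In a full co-clustering model the assignments themselves would be latent and estimated jointly with the per-block parameters; here they are given so that only the block-wise structure is visible.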



Paper Citation


in Harvard Style

Barbieri N., Costa G., Manco G. and Ritacco E. (2011). CHARACTERIZING RELATIONSHIPS THROUGH CO-CLUSTERING - A Probabilistic Approach. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 64-73. DOI: 10.5220/0003656800640073


in Bibtex Style

@conference{kdir11,
author={Nicola Barbieri and Gianni Costa and Giuseppe Manco and Ettore Ritacco},
title={CHARACTERIZING RELATIONSHIPS THROUGH CO-CLUSTERING - A Probabilistic Approach},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={64-73},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003656800640073},
isbn={978-989-8425-79-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - CHARACTERIZING RELATIONSHIPS THROUGH CO-CLUSTERING - A Probabilistic Approach
SN - 978-989-8425-79-9
AU - Barbieri N.
AU - Costa G.
AU - Manco G.
AU - Ritacco E.
PY - 2011
SP - 64
EP - 73
DO - 10.5220/0003656800640073