ON BINARY SIMILARITY MEASURES FOR PRIVACY-PRESERVING TOP-N RECOMMENDATIONS

Alper Bilge, Cihan Kaleli, Huseyin Polat

Abstract

Collaborative filtering (CF) algorithms fundamentally depend on similarities between users and/or items to predict individual preferences. Various binary similarity measures, such as Kulczynski, Sokal-Michener, and Yule, estimate the relation between two binary vectors. Although CF algorithms based on binary ratings are in use, the performance of binary similarity measures has not yet been thoroughly compared. Moreover, the success of CF systems depends heavily on reliable and truthful data collected from many customers, which can only be achieved if individual users’ privacy is protected. In this study, we compare eight binary similarity measures in terms of accuracy when providing top-N recommendations. We also examine how these measures perform in a privacy-preserving top-N recommendation process. Experiments on real data show that the Dice and Jaccard measures provide the best outcomes.
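To make the notion of a binary similarity measure concrete, the following is a minimal sketch of three of the named measures (Jaccard, Dice, and Sokal-Michener) in their standard textbook forms. It is an illustration, not the paper's implementation: the function names, the example vectors, and the convention of returning 0.0 for an empty denominator are assumptions made here, and the abstract does not state which eight measures or which exact variants were evaluated.

```python
def contingency(u, v):
    """Count the four match/mismatch cells for two equal-length 0/1 vectors.

    a: both 1, b: only u is 1, c: only v is 1, d: both 0.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    a = sum(1 for x, y in zip(u, v) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(u, v) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(u, v) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(u, v) if x == 0 and y == 0)
    return a, b, c, d


def jaccard(u, v):
    """Jaccard: a / (a + b + c); joint absences are ignored."""
    a, b, c, _ = contingency(u, v)
    denom = a + b + c
    return a / denom if denom else 0.0  # assumed convention for empty vectors


def dice(u, v):
    """Dice: 2a / (2a + b + c); like Jaccard, but joint presences count twice."""
    a, b, c, _ = contingency(u, v)
    denom = 2 * a + b + c
    return 2 * a / denom if denom else 0.0  # assumed convention


def sokal_michener(u, v):
    """Sokal-Michener (simple matching): (a + d) / (a + b + c + d)."""
    a, b, c, d = contingency(u, v)
    n = a + b + c + d
    return (a + d) / n if n else 0.0  # assumed convention


# Hypothetical binary rating vectors for two users over six items
# (1 = liked, 0 = not liked).
alice = [1, 1, 0, 1, 0, 0]
bob = [1, 0, 0, 1, 1, 0]

print(jaccard(alice, bob))         # 2 / 4
print(dice(alice, bob))            # 4 / 6
print(sokal_michener(alice, bob))  # 4 / 6
```

Jaccard and Dice ignore items that neither user liked, whereas Sokal-Michener counts joint absences as agreement, so the measures can rank the same pair of users differently when rating vectors are sparse.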


Paper Citation


in Harvard Style

Bilge A., Kaleli C. and Polat H. (2010). ON BINARY SIMILARITY MEASURES FOR PRIVACY-PRESERVING TOP-N RECOMMENDATIONS. In Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT, ISBN 978-989-8425-22-5, pages 299-304. DOI: 10.5220/0002938702990304


in Bibtex Style

@conference{icsoft10,
author={Alper Bilge and Cihan Kaleli and Huseyin Polat},
title={ON BINARY SIMILARITY MEASURES FOR PRIVACY-PRESERVING TOP-N RECOMMENDATIONS},
booktitle={Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT},
year={2010},
pages={299-304},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002938702990304},
isbn={978-989-8425-22-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT
TI - ON BINARY SIMILARITY MEASURES FOR PRIVACY-PRESERVING TOP-N RECOMMENDATIONS
SN - 978-989-8425-22-5
AU - Bilge A.
AU - Kaleli C.
AU - Polat H.
PY - 2010
SP - 299
EP - 304
DO - 10.5220/0002938702990304