of users (n) because more reliable inferences can be
made with more data. For the EM data set, recall decreases
rapidly when n is reduced from 500 to 250 or 125.
Accuracy becomes stable for n values larger than 250.
Changes in recall are more stable for the MLP data
set: recall decreases when n is increased from 125 to
250, but it improves as n increases from 250 to 500
and 943.
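As an illustration of how a recall figure of this kind can be computed, the following minimal sketch assumes a hit-rate style definition, in which one rated item per test user is withheld and a hit is counted when that item appears in the user's top-N list. The function name, the dictionary layout, and the sample data are hypothetical and are not taken from our implementation.

```python
def top_n_recall(recommendations, held_out):
    """Fraction of test users whose withheld item is in their top-N list.

    recommendations: dict mapping user id -> list of recommended item ids
    held_out:        dict mapping user id -> the single withheld item id
    (Both layouts are assumptions made for this sketch.)
    """
    if not held_out:
        return 0.0
    hits = sum(
        1
        for user, item in held_out.items()
        if item in recommendations.get(user, [])
    )
    return hits / len(held_out)


# Hypothetical example: four test users, two of whom receive a hit.
example_recs = {"u1": ["a", "b"], "u2": ["c", "d"], "u3": ["e"]}
example_held_out = {"u1": "b", "u2": "x", "u3": "e", "u4": "z"}
example_recall = top_n_recall(example_recs, example_held_out)
```

Users with no recommendation list are counted as misses here, which is one possible convention among several.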
7 CONCLUSIONS AND FUTURE WORK
We proposed a privacy-preserving scheme to offer
top-N recommendations efficiently, and we determined
the best similarity measures experimentally. Utilizing
ratings data is more successful for building a model
for top-N recommendations. Besides disguising the
original data, a random filling method is needed to
preserve privacy adequately by hiding both the ratings
and which items were rated. According to our results,
the Dice and Jaccard measures perform the best, while
the Kulzinsky similarity measure gives the worst
results of the eight. Generally speaking, six of the
eight measures provide promising results. We also
examined the effects of varying f and n values on
recall and determined the optimum values of f and n.
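For reference, the two best-performing measures can be sketched as follows for a pair of binary vectors. The sketch uses the common textbook definitions of the Jaccard and Dice coefficients, so the exact normalization may differ from the variant used in our experiments. The function names and the convention of returning 0.0 for two all-zero vectors are assumptions made for illustration.

```python
def binary_counts(x, y):
    """Return (s11, s10, s01) for two equal-length 0/1 vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    s11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    s10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    s01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    return s11, s10, s01


def jaccard(x, y):
    """Jaccard coefficient: s11 / (s11 + s10 + s01)."""
    s11, s10, s01 = binary_counts(x, y)
    denom = s11 + s10 + s01
    return s11 / denom if denom else 0.0  # assumed convention for empty vectors


def dice(x, y):
    """Dice coefficient: 2 * s11 / (2 * s11 + s10 + s01)."""
    s11, s10, s01 = binary_counts(x, y)
    denom = 2 * s11 + s10 + s01
    return 2 * s11 / denom if denom else 0.0  # assumed convention for empty vectors


# Hypothetical item vectors: 1 means the user rated the item.
item_a = [1, 1, 0, 1, 0]
item_b = [1, 0, 0, 1, 1]
```

Both coefficients ignore positions where neither vector has a 1, which suits sparse rating data where most entries are unrated.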
Without privacy concerns, our results on ratings
data are comparable to those presented in
(Karypis, 2001). Although accuracy diminishes
with privacy, the results are still promising compared
to the results in (Karypis, 2001). Our scheme achieves
privacy by sacrificing some accuracy. Compared to
the scheme proposed in (Kaleli and Polat, 2007), our
scheme’s online performance is much better.
As future work, we plan to evaluate binary
similarity measures on clustered data to construct
a user-based model, as a different research area
in CF, and to apply dissimilarity measures to determine
whether they can perform better than similarity
measures. We will also investigate whether various
improvements can reduce the accuracy losses caused
by the underlying privacy-preserving measures.
REFERENCES
Blattner, M. (2009). B-rank: A top N recommendation al-
gorithm. CoRR, arXiv:0908.2741.
Canny, J. (2002). Collaborative filtering with privacy via
factor analysis. In ACM SIGIR’02, 25th Annual International
ACM SIGIR Conference on Research and Development in
Information Retrieval, pages 238–245, Tampere, Finland.
Cha, S., Yoon, S., and Tappert, C. C. (2005). On binary
similarity measures for handwritten character recog-
nition. In ICDAR’05, 8th International Conference
on Document Analysis and Recognition, pages 4–8,
Seoul, Korea.
Cranor, L. F. (2003). ‘I didn’t buy it for myself’ privacy
and E-commerce personalization. In WPES’03, ACM
Workshop on Privacy in the Electronic Society, pages
111–117, Washington, DC, USA.
Gan, G., Ma, C., and Wu, J. (2007). Data Clustering: Theory,
Algorithms, and Applications, chapter 6, Similarity
and Dissimilarity Measures, pages 67–106. ASA-SIAM
Series on Statistics and Applied Probability.
Huang, C. L. and Huang, W. L. (2009). Handling sequential
pattern decay: Developing a two-stage collaborative
recommender system. Electron. Commer. Res. Appl.,
8(3):117–129.
Jamali, M. and Ester, M. (2009). Using a trust network to
improve top-N recommendation. In RecSys’09, 3rd
ACM Conference on Recommender Systems, pages
181–188, New York, NY, USA.
Kaleli, C. and Polat, H. (2007). Providing private recom-
mendations using naive Bayesian classifier. Advances
in Intelligent Web Mastering, 43:515–522.
Kaleli, C. and Polat, H. (2010). P2P collaborative filtering
with privacy. Turkish J. Elec. Eng. and Comp. Sci.,
18(1):101–116.
Karypis, G. (2001). Evaluation of item-based top-N rec-
ommendation algorithms. In CIKM’01, 10th Inter-
national Conference on Information and Knowledge
Management, pages 247–254, Atlanta, GA, USA.
Kwon, Y. (2008). Improving top-n recommendation tech-
niques using rating variance. In RecSys’08, 2nd ACM
Conference on Recommender Systems, pages 307–
310, Lausanne, Switzerland.
McJones, P. (1997). EachMovie collaborative filtering data
set.
Polat, H. and Du, W. (2005). Privacy-preserving top-N
recommendation on horizontally partitioned data. In
WI’05, IEEE/WIC/ACM International Conference on
Web Intelligence, pages 725–731, Paris, France.
Polat, H. and Du, W. (2008). Privacy-preserving top-N rec-
ommendation on distributed data. J. Am. Soc. Inf. Sci.
Technol., 59(7):1093–1108.
Sarwar, B., Karypis, G., Konstan, J. A., and Riedl, J. T.
(2001). Item-based collaborative filtering recommendation
algorithms. In WWW’01, 10th International
World Wide Web Conference, pages 285–295, Hong
Kong, China.
Zhang, B. and Srihari, S. N. (2003). Binary vector dis-
similarity measures for handwriting identification. In
Proc. of SPIE, Document Recognition and Retrieval
X, pages 28–38, Santa Clara, CA, USA.
ICSOFT 2010 - 5th International Conference on Software and Data Technologies