Authors:
Saeed Samet (1); Ali Miri (1) and Luis Orozco-Barbosa (2)
Affiliations:
(1) School of Information Technology and Engineering, University of Ottawa, Canada; (2) Instituto de Investigacion en Informatica, Universidad de Castilla-La Mancha, Spain
Keyword(s):
Data mining, Clustering, classification, and association rules, Mining methods and algorithms, Security and Privacy Protection, Distributed data structures.
Related Ontology Subjects/Areas/Topics:
Database Security and Privacy; Information and Systems Security; Security in Information Systems
Abstract:
Extracting meaningful and valuable knowledge from databases is often done with data mining algorithms. Databases today are frequently distributed among two or more parties, for reasons such as physical and geographical restrictions, and in this setting the most important issue is privacy. Related data is normally maintained by more than one organization, each of which wants to keep its individual information private. Privacy-preserving techniques and protocols are therefore designed to perform data mining in distributed environments where privacy is a major concern. Cluster analysis is a data mining technique that divides data into meaningful clusters, and it plays an important role in fields such as bioinformatics, marketing, machine learning, climate and medicine. k-means clustering is a prominent algorithm in this category, which creates a one-level clustering of the data. In this paper we introduce privacy-preserving protocols for this algorithm, along with a protocol for secure comparison, known as the Millionaires' Problem, as a sub-protocol, to handle the clustering of horizontally or vertically partitioned data among two or more parties.
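To make the horizontally partitioned setting concrete, the following is a minimal, non-private baseline sketch in Python. It is not the protocol introduced in the paper: it only shows which quantities the parties would have to combine in each k-means iteration (per-cluster coordinate sums and point counts). A privacy-preserving protocol would combine these without revealing any party's individual values. All function names (`nearest`, `local_stats`, `combine`, `kmeans_horizontal`) are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Stats = Tuple[List[List[float]], List[int]]  # (per-cluster sums, per-cluster counts)


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def local_stats(points: Sequence[Point], centroids: Sequence[Point]) -> Stats:
    """One party's contribution: coordinate sums and counts per cluster."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for pt in points:
        j = nearest(pt, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += pt[d]
    return sums, counts


def combine(stats: Sequence[Stats], centroids: Sequence[Point]) -> List[Point]:
    """Aggregate all parties' statistics into new centroids.

    ASSUMPTION: done in the clear here; a private protocol would hide the
    individual sums and counts from the other parties.
    """
    k, dim = len(centroids), len(centroids[0])
    new_centroids: List[Point] = []
    for j in range(k):
        total = sum(counts[j] for _, counts in stats)
        if total == 0:
            # Empty cluster: keep the previous centroid.
            new_centroids.append(tuple(centroids[j]))
        else:
            new_centroids.append(
                tuple(sum(sums[j][d] for sums, _ in stats) / total for d in range(dim))
            )
    return new_centroids


def kmeans_horizontal(
    parties: Sequence[Sequence[Point]],
    centroids: Sequence[Point],
    iterations: int = 10,
) -> List[Point]:
    """Run k-means where each party holds a disjoint subset of the records."""
    current = list(centroids)
    for _ in range(iterations):
        stats = [local_stats(points, current) for points in parties]
        current = combine(stats, current)
    return current


if __name__ == "__main__":
    party_a = [(0.0, 0.0), (10.0, 10.0)]
    party_b = [(0.0, 2.0), (10.0, 12.0)]
    print(kmeans_horizontal([party_a, party_b], [(0.0, 0.0), (10.0, 10.0)]))
```

In this sketch the only cross-party step is `combine`, which is the natural place for a secure aggregation step in a private variant. With vertically partitioned data, the distance computation and the choice of nearest centroid would also span parties; that is the kind of step where a secure comparison sub-protocol would be needed.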