method that will be applied to an individual's data set
will be enforced according to the individual's
personal privacy preferences. The goal of the
proposed privacy-preservation model is to maintain a
relation between personalization and data
anonymization methods, so that individuals'
personal privacy expectations are met through
different privacy levels. The proposed model also
guarantees privacy-preserved query results and
ensures personalized privacy. In this work, we created
an ontology and executed queries for the model
presented in (Usenmez and Can, 2015). Additionally,
a case study is presented. The rest of this paper is
organized as follows. In Section 2, related research
is introduced. In Section 3, the proposed
ontology-based privacy-preservation model is
described and exemplified. By its nature, the
healthcare domain contains highly personal information,
and patients usually prefer to protect their privacy
from others as a basic human desire to live free of
intrusion, judgment and prejudice (Project Health
Design, 2009). Hence, we use the healthcare domain to
exemplify our work. In Section 4, a case
study is presented. In Section 5, example queries are
processed on the proposed ontology. Finally, the
paper is concluded and future work is presented
in Section 6.
2 RELATED WORK
Numerous techniques have been proposed to provide
individual privacy while sharing or querying data
sets. The differential privacy approach protects the
original data and perturbs query results by adding
noise. In differential privacy, the researcher works
on the real data and generates statistical results, but
when a query is executed on the data set, the
differential privacy method adds noise to the query
result. For this purpose, the sensitivity of the query
is measured. Sensitivity is a metric that determines
how much noise must be added to the query result in
order to mask the difference between similar inputs
and to protect an individual's privacy on a statistical
database. Differential privacy guarantees that nothing
is learned about an individual while useful information
is learned about the population (Dwork and Roth, 2014).
It protects privacy while releasing data and provides
an optimal transformation of the data or statistical
result, thereby enabling privacy-preserving data
analysis. (Sarathy and Muralidhar, 2011) evaluates the
privacy and utility performance of Laplace noise
addition for numeric data.
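To make the mechanism concrete, the following is a minimal sketch of Laplace noise addition for a count query; the function names and the toy records are our own illustration, not an implementation from the cited works.

```python
import numpy as np

def private_count(records, predicate, epsilon):
    # A count query has sensitivity 1: adding or removing one
    # individual changes the true answer by at most 1. The Laplace
    # mechanism therefore adds noise with scale sensitivity/epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical patient records; a smaller epsilon yields more noise
# and hence stronger privacy, at the cost of utility.
patients = [{"age": 34, "diagnosis": "flu"},
            {"age": 61, "diagnosis": "cancer"},
            {"age": 47, "diagnosis": "flu"}]
print(private_count(patients, lambda r: r["diagnosis"] == "flu", epsilon=0.5))
```

The noise scale ties directly to the sensitivity discussion above: the less any single individual can change the answer, the less noise is needed for the same privacy budget.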
Privacy-preserving data mining methods enable
knowledge to be extracted from data while protecting
the privacy of individuals. There are several
studies in the literature related to privacy-
preserving data mining. In (Mendes and Vilela,
2017), a comprehensive survey of the most relevant
privacy-preserving data mining techniques in the
literature is presented and the current challenges in
the field are discussed. The best-known methods are
the k-anonymity, l-diversity and t-closeness privacy
models.
In the k-anonymity model, a released data set is
k-anonymous if each record it contains cannot be
distinguished from at least k-1 other tuples that
appear in the same data set (Sweeney, 2002).
(Ciriani et al., 2007) describes generalization and
suppression approaches for providing k-anonymity.
An enhanced k-anonymity model is proposed in
(Wong et al., 2006) to protect identities and
sensitive relationships in a data set. (Kenig and
Tassa) proposes an alternative k-anonymity algorithm
that achieves lower information loss.
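As an illustration of the definition above, the following sketch checks whether a released table is k-anonymous over a chosen set of quasi-identifiers; the table and attribute names are hypothetical.

```python
from collections import Counter

def is_k_anonymous(table, quasi_identifiers, k):
    # Group records by their quasi-identifier values; the table is
    # k-anonymous if every group contains at least k records.
    groups = Counter(tuple(row[a] for a in quasi_identifiers) for row in table)
    return all(size >= k for size in groups.values())

# Generalized release: exact ages and ZIP codes have been coarsened
# so that each (age, zip) combination covers at least two patients.
released = [
    {"age": "30-40", "zip": "350**", "diagnosis": "flu"},
    {"age": "30-40", "zip": "350**", "diagnosis": "cancer"},
    {"age": "60-70", "zip": "061**", "diagnosis": "flu"},
    {"age": "60-70", "zip": "061**", "diagnosis": "flu"},
]
print(is_k_anonymous(released, ["age", "zip"], k=2))  # True
```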
The l-diversity privacy model, which extends the k-
anonymity model, was proposed to provide a
stronger notion of privacy. (Machanavajjhala et al.,
2007) demonstrated two attacks, the homogeneity
attack and the background knowledge attack, that can
compromise a k-anonymous data set. The l-diversity
model requires each equivalence class to have at
least l different values for the sensitive attribute.
(Kern, 2013) proposes a model based on l-diversity to
reason about privacy in microdata and applies the
proposed l-diversity model to a real database.
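Under the same assumptions as the k-anonymity sketch above, the following check makes this requirement concrete and shows how the homogeneity attack arises when an equivalence class carries a single sensitive value.

```python
from collections import defaultdict

def is_l_diverse(table, quasi_identifiers, sensitive, l):
    # Each equivalence class (records sharing quasi-identifier
    # values) must contain at least l distinct sensitive values.
    classes = defaultdict(set)
    for row in table:
        classes[tuple(row[a] for a in quasi_identifiers)].add(row[sensitive])
    return all(len(values) >= l for values in classes.values())

# The 2-anonymous table from the previous sketch fails 2-diversity:
# the (60-70, 061**) class holds only "flu", so membership in that
# class alone reveals the diagnosis (the homogeneity attack).
released = [
    {"age": "30-40", "zip": "350**", "diagnosis": "flu"},
    {"age": "30-40", "zip": "350**", "diagnosis": "cancer"},
    {"age": "60-70", "zip": "061**", "diagnosis": "flu"},
    {"age": "60-70", "zip": "061**", "diagnosis": "flu"},
]
print(is_l_diverse(released, ["age", "zip"], "diagnosis", l=2))  # False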
(Li et al., 2007) showed that l-diversity has a
number of limitations and proposed the t-closeness
privacy model, which requires that the distribution of a
sensitive attribute in any equivalence class be close to
the distribution of the attribute in the overall table,
where closeness is bounded by a threshold t. In
(Ruggieri, 2014), t-closeness is used for
discrimination-aware data mining. As stated in (Kern,
2013), each privacy model has its own advantages and
disadvantages that have to be considered when applying
such principles to microdata. Therefore, (Soria-Comas
and Domingo-Ferrer, 2013) connects the k-anonymity,
differential privacy and t-closeness privacy models,
and also proposes a method to improve the utility of
data and to reduce the risk of attribute disclosure.
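The t-closeness requirement can likewise be checked mechanically. (Li et al., 2007) measure closeness with the Earth Mover's Distance; for a categorical attribute with uniform ground distance this reduces to the total variation distance used in the sketch below, which is again our own illustration rather than the authors' implementation.

```python
from collections import Counter, defaultdict

def distribution(values):
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

def satisfies_t_closeness(table, quasi_identifiers, sensitive, t):
    # Compare each equivalence class's sensitive-value distribution
    # against the overall table distribution. For a categorical
    # attribute with uniform ground distance, the Earth Mover's
    # Distance reduces to total variation: half the L1 gap.
    overall = distribution(row[sensitive] for row in table)
    classes = defaultdict(list)
    for row in table:
        classes[tuple(row[a] for a in quasi_identifiers)].append(row[sensitive])
    for values in classes.values():
        local = distribution(values)
        support = set(local) | set(overall)
        emd = 0.5 * sum(abs(local.get(v, 0.0) - overall.get(v, 0.0))
                        for v in support)
        if emd > t:
            return False
    return True

released = [
    {"age": "30-40", "zip": "350**", "diagnosis": "flu"},
    {"age": "30-40", "zip": "350**", "diagnosis": "cancer"},
    {"age": "60-70", "zip": "061**", "diagnosis": "flu"},
    {"age": "60-70", "zip": "061**", "diagnosis": "flu"},
]
# True: each class stays within 0.25 of the overall distribution.
print(satisfies_t_closeness(released, ["age", "zip"], "diagnosis", t=0.3))
```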
In order to provide a semantic understanding,
Semantic Web-based studies for privacy-preserving
data mining also exist in the literature. (Martinez et
al., 2010) proposes a masking method for unbounded
categorical attributes. (Ayala-Rivera et al., 2017)