Most papers on DP focus on theory, covering definitions, foundations, and algorithms related to DP (Dwork, 2006; Dwork and Roth, 2014; Dwork et al., 2006; Dwork and Rothblum, 2016). Jain and Thakurta (2014) propose a privacy-preserving algorithm for risk minimization. McSherry and Talwar (2007) indicate that individual participants have a limited effect on the outcome of a DP mechanism. Minami et al. (2016) focus on the privacy of Gibbs posteriors with convex and Lipschitz loss functions. Mironov (2017) discusses a new definition of differential privacy. Foulds et al. (2016) offer a practical perspective on DP, but focus on the Variational Bayes family of methods. Apple (2017a) presents how the most used emoji were determined while preserving users’ privacy. We observed that the literature lacks more pragmatic guidance on how to implement and use DP algorithms.
In this paper, we apply Differential Privacy in practice. There are two main types of privatization: online (also called adaptive or interactive) and offline (also called batch or non-interactive) (Dwork and Roth, 2014). The online type depends on which queries are made and on how many of them there are (a number that can be limited). The offline type makes no assumptions about the number or kind of queries made to the dataset, so all the data can be stored already privatized. We focus on offline methods, specifically on the Laplace mechanism (Dwork and Roth, 2014), and study the impact of this DP mechanism on data analysis. Four classification algorithms were considered: Decision Tree, Naïve Bayes, Multi-Layer Perceptron Classifier (MLP), and Support Vector Machines (SVM). We then compare the accuracy of each algorithm when using non-privatized data and data with different degrees of privatization.
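To make the offline setting concrete, the following is a minimal Python sketch of the Laplace mechanism, assuming NumPy is available; the function name laplace_mechanism and the sensitivity and epsilon values are illustrative, not taken from the paper’s implementation.

```python
import numpy as np

def laplace_mechanism(values, sensitivity, epsilon):
    # Standard Laplace mechanism (Dwork and Roth, 2014): add noise
    # drawn from Lap(sensitivity / epsilon) to each numeric value.
    # Smaller epsilon means larger noise and stronger privacy.
    scale = sensitivity / epsilon
    noise = np.random.laplace(loc=0.0, scale=scale, size=np.shape(values))
    return np.asarray(values, dtype=float) + noise

# Offline use: privatize a column once and store only the result.
ages = np.array([23, 45, 31, 52, 38])
private_ages = laplace_mechanism(ages, sensitivity=1.0, epsilon=0.5)
```

Because the noisy values are stored once and reused, any number of later queries can be answered from them without further privacy loss, which is what distinguishes the offline setting from the online one.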
This paper is organized as follows. Section 2 briefly presents DP and related methods. Section 3 presents our programming support, methodology, results, and discussion. Section 4 summarizes contributions and outlines future work.
2 BACKGROUND
In this section, we first describe the coin method, a simple example of DP, and then present the definition of DP. We also show an important DP mechanism, the Laplace mechanism, which is in turn a particular case of the Exponential mechanism (Dwork, 2006; Dwork and Roth, 2014).
2.1 Coin Method
Warner (1965) describes a simple DP method. In this experiment, the goal is to collect data that may be sensitive; because of that, respondents might be willing to give a false answer in order to preserve their privacy. Suppose we want to run a survey to find out how many people use illegal drugs. Many of the people who do use illegal drugs can be expected to lie in their answer. To still get a clear view of the percentage of people who use illegal drugs, we can use the coin mechanism to preserve people’s privacy.
It works as shown in Figure 1: when registering someone’s answer, a coin is first tossed. If the result of the first toss is Heads, we register the answer the person gave us (represented by A). If the result of the first toss is Tails, we toss the coin again: if the second result is Heads, we register Yes (the person does use illegal drugs); if it is Tails, we register No (the person does not use illegal drugs).
Figure 1: Coin mechanism diagram.
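As an illustration, here is a minimal Python sketch of the per-respondent logic in Figure 1; the helper name record_answer is hypothetical.

```python
import random

def record_answer(true_answer):
    # First toss: Heads -> register the respondent's real answer.
    if random.random() < 0.5:
        return true_answer
    # First toss was Tails: toss again, ignoring the real answer.
    # Second toss: Heads -> register Yes, Tails -> register No.
    return random.random() < 0.5
```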
By the end of the experiment, there will be a database with answers from all the subjects, but about half of them (assuming the coin has a 50% chance of landing on each side) are expected to be artificially generated. So, if we look at the answer of a single person, there is no certainty that it was the true answer.
At the same time, roughly 25% of the total answers are random Yes answers and 25% are random No answers. If we subtract a quarter of the total from the Yes count and a quarter from the No count, the remaining answers correspond to the truthful half of the responses; doubling their proportion recovers a clear view of the percentage of the population that uses illegal drugs. It was thus possible, concomitantly, to obtain a statistically accurate result (assuming enough people were involved in the study) and to preserve everyone’s privacy.
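The debiasing step can be checked with a short simulation; the sketch below reuses record_answer from above and assumes a true rate of 10%, a value chosen only for illustration.

```python
import random

def estimate_true_rate(recorded_answers):
    # About 25% of all answers are random Yes: subtract that quarter
    # from the observed Yes rate, then double the remainder, since
    # only half of the answers are truthful.
    n = len(recorded_answers)
    yes_rate = sum(recorded_answers) / n
    return 2 * (yes_rate - 0.25)

truth = [random.random() < 0.10 for _ in range(10_000)]
recorded = [record_answer(t) for t in truth]
print(estimate_true_rate(recorded))  # close to 0.10 for large samples
```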
2.2 Differential Privacy
The basic structure of a DP method consists of a mechanism that takes the non-privatized data as input and outputs privatized data. DP establishes constraints that this mechanism must satisfy in order to limit the privacy impact on the individuals whose data is in the dataset.
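For reference, the standard ε-DP constraint (Dwork, 2006) can be stated as follows, where M is the mechanism, D and D′ are datasets differing in a single individual’s record, and S is any set of possible outputs:

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{M}(D') \in S]
```

Smaller ε means the two output distributions are closer, so any single individual’s presence changes the result less; the Laplace mechanism discussed above satisfies this constraint when its noise scale is the query sensitivity divided by ε.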