Table 1: Notation of BP, secrets, and discriminative pairs. (cont.)

Symbol | Description
(s, s') ∈ S × S | A pair of secrets, e.g. (Example1, Example2)
A discriminative pair of secrets (s, s') | A mutually exclusive pair of secrets: two statements that cannot be true at the same time. An adversary must not be able to distinguish which one is true and which one is false, e.g. (t.id = 'Bob' ∧ t = x, t.id = 'Bob' ∧ t = y)
s^i_x | The secret t.id = i ∧ t = x where x ∈ T, e.g. s^'Bob'_('cancer',65)
S_pairs | A set of discriminative pairs of secrets, e.g. S^full_pairs, S^attr_pairs, S^P_pairs, S^{d,θ}_pairs
S^G_pairs | A set of discriminative pairs of secrets based on a graph G(V,E), i.e. {(s^i_x, s^i_y) | ∀i, ∀(x,y) ∈ E}
Full domain: S^full_pairs | For every individual, the value is not known to be x or y, i.e. {(s^i_x, s^i_y) | ∀i, ∀(x,y) ∈ T × T}
Attributes: S^attr_pairs | For every individual and every two tuples differing in the value of only one attribute A, where one of them is real, the real tuple is not known. This privacy definition is weaker than the full-domain S^full_pairs, since the real tuple is distinguishable if more than one attribute differs, i.e. {(s^i_x, s^i_y) | ∀i, ∃A, x[A] ≠ y[A] ∧ x[Ā] = y[Ā]}
Partitioned: S^P_pairs | For every individual and every two tuples coming from the same partition, where one of them is real, the real tuple is not known, i.e. {(s^i_x, s^i_y) | ∀i, ∃j, (x,y) ∈ P_j × P_j}. This privacy definition is very useful for location data.
Distance threshold: S^{d,θ}_pairs | For every individual and every two tuples whose distance is less than or equal to a threshold θ, where one of them is real, the real tuple is not known, i.e. {(s^i_x, s^i_y) | ∀i, d(x,y) ≤ θ}
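To make these pair definitions concrete, the following is a minimal Python sketch (ours, not part of the BP framework; the toy domain T, the individuals, and the age_gap distance are purely illustrative) that enumerates discriminative pairs under the full-domain, attribute, and distance-threshold policies.

from itertools import combinations

# Toy tuple domain T of (disease, age) records and a toy set of individuals;
# both are illustrative assumptions, not data from the paper.
T = [("cancer", 65), ("cancer", 70), ("flu", 65), ("flu", 70)]
individuals = ["Bob", "Alice"]

def full_domain_pairs():
    # S^full_pairs: for every individual, every pair of distinct tuples in T x T.
    return {(i, x, y) for i in individuals for x, y in combinations(T, 2)}

def attr_pairs():
    # S^attr_pairs: tuples differing in the value of exactly one attribute.
    return {(i, x, y) for i in individuals for x, y in combinations(T, 2)
            if sum(a != b for a, b in zip(x, y)) == 1}

def distance_pairs(d, theta):
    # S^{d,theta}_pairs: tuples whose distance is at most the threshold theta.
    return {(i, x, y) for i in individuals for x, y in combinations(T, 2)
            if d(x, y) <= theta}

# Illustrative distance: the age gap, ignoring the categorical attribute.
age_gap = lambda x, y: abs(x[1] - y[1])
print(len(full_domain_pairs()), len(attr_pairs()), len(distance_pairs(age_gap, 5)))

Each triple (i, x, y) stands for the discriminative pair (s^i_x, s^i_y); a partitioned policy would simply replace the filter with a check that x and y fall in the same partition P_j.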
In this direction, the idea of a discriminative secret is very similar to what constitutes a game in cryptography. We prefer to call it a privacy game here and represent it as shown in Figure 2. In this game, a challenger picks an Id (e.g. Bob) and a pair of discriminative secrets at random (e.g. "Bob has called Alice" or "Bob has not called Alice"). The pair is represented by two tuples, i.e. an edge in the discriminative secret graph of the Id, whose vertices identify the two tuples. The challenger sends the Id and the two tuples to the adversary (i.e. which one belongs to Bob?). The adversary has to guess which of the two tuples belongs to the Id and responds with a single bit b, where b = 0 stands for t_0 and b = 1 for t_1.
Our goal is to make the probability of the adversary guessing the right tuple not significantly different from a coin flip.
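As a rough illustration, the following Python snippet (our own sketch, not the paper's protocol, and it deliberately omits the mechanism output the adversary would normally observe) plays the challenger against a guessing adversary and estimates how close its success rate is to a coin flip.

import random

def play_game(secret_graph, adversary, rounds=10_000):
    """secret_graph maps an Id to the edges (t0, t1) of its discriminative secret graph."""
    wins = 0
    for _ in range(rounds):
        ident = random.choice(list(secret_graph))       # challenger picks an Id, e.g. 'Bob'
        t0, t1 = random.choice(secret_graph[ident])     # and a discriminative edge at random
        truth = random.randint(0, 1)                    # which of the two tuples is real
        # In the full game the adversary would also see the output of the privacy
        # mechanism run on the real database; that part is omitted in this sketch.
        b = adversary(ident, t0, t1)                    # adversary answers with one bit b
        wins += int(b == truth)
    return wins / rounds

# A blind adversary can do no better than flipping a coin.
coin_flip = lambda ident, t0, t1: random.randint(0, 1)

graph = {"Bob": [((0, 1, 1, 1), (0, 0, 0, 1))]}
print(play_game(graph, coin_flip))   # approximately 0.5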
Table 2: Examples of notions of secrets for a communication database.

Symbol | Description / Example
Secret: s | Bob has talked to Alice: t_i.id = 'Bob' ∧ t_j.id = 'Alice' ∧ t_i[j] = t_j[i] = 1
A discriminative pair of secrets (s, s') | Given two communication tuples (ego networks), we cannot distinguish which one of them belongs to Bob, for example, (t.id = 'Bob' ∧ t = (0,1,1,1), t.id = 'Bob' ∧ t = (0,0,0,1))
s^i_x | The secret that individual i has ego network x, for example, s^'Bob'_(0,1,1,1)
S^full_pairs | For an individual, all ego networks are discriminative.
S^attr_pairs | For an individual and two vectors that differ in only one communication, we cannot tell which one is real.
S^P_pairs | For an individual and two tuples belonging to the same partition, we cannot tell which tuple is the real one.
S^{d,θ}_pairs | Given a distance metric and a threshold θ, the privacy game challenges the adversary with one individual and two records whose distance is less than or equal to θ. A suitable distance for communication graphs is the Hamming distance (the number of differing bits), which is equivalent to the Manhattan distance in this case.
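For the distance-threshold policy on ego networks, the equivalence between the Hamming and Manhattan distances on 0/1 vectors can be checked directly. This small Python check is ours and only illustrates the last row of Table 2.

def hamming(x, y):
    # Number of differing bits between two ego-network vectors.
    return sum(a != b for a, b in zip(x, y))

def manhattan(x, y):
    # L1 distance; identical to the Hamming distance on 0/1 vectors.
    return sum(abs(a - b) for a, b in zip(x, y))

x, y = (0, 1, 1, 1), (0, 0, 0, 1)
assert hamming(x, y) == manhattan(x, y) == 2
# With a threshold theta >= 2, the pair (s^'Bob'_x, s^'Bob'_y) belongs to S^{d,theta}_pairs.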
An important remark about undirected communication graphs is that not all graphs are feasible. If Bob has talked to Alice, it means that Alice has talked to Bob, so the database matrix is symmetric. Another constraint is that t_i[i] must be 0, and all other entries are either 0 or 1. The BP framework makes it possible to define constraints on the dataset, and it redefines the notion of neighboring databases by excluding intermediate, yet infeasible, ones. Therefore, we suggest that BP is a more suitable framework for communication graphs than its DP predecessor.
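The effect of these feasibility constraints on neighboring databases can be sketched as follows (a Python illustration under our own assumptions, not the paper's formal definition): flipping a single matrix entry, as a plain DP neighbor would, produces an infeasible intermediate database, whereas flipping both mirrored entries keeps the instance feasible.

import numpy as np

def is_feasible(D):
    # An undirected communication database must be symmetric with a zero diagonal.
    return np.array_equal(D, D.T) and not np.diag(D).any()

D = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])

D_bad = D.copy(); D_bad[0, 1] ^= 1                  # flip one entry only: breaks symmetry
D_ok = D.copy(); D_ok[0, 1] ^= 1; D_ok[1, 0] ^= 1   # flip both mirrored entries: stays feasible

print(is_feasible(D), is_feasible(D_bad), is_feasible(D_ok))  # True False True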
2.2 Auxiliary Knowledge
Auxiliary knowledge is usually formalized using correlations, for example c(R = r_1) + c(R = r_2) = a_1, where c(R = r_1) is the count of records having the attribute R equal to r_1, c(R = r_2) is the count of records having the attribute R equal to r_2, and a_1 is known. BP suggests formalizing auxiliary knowledge as a set of constraints Q that a database D must satisfy. It denotes by I_Q ⊂ I_n the subset of all possible database instances that satisfy Q.
instances. In the case of undirected communication
graphs, we have two inherent constraints:
• the matrix of D is symmetric: t_i[j] = t_j[i]
• the ego attributes are zero: t_i[i] = 0
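A minimal way to encode such a constraint set Q and carve I_Q out of the candidate instances is sketched below (Python, over a toy single-attribute domain of our own choosing; the paper itself states the constraints only abstractly). For the communication-graph case, the symmetry and zero-diagonal predicates above would simply be added to Q.

from itertools import product

def count_constraint(D, r1, r2, a1):
    # Auxiliary knowledge of the form c(R = r1) + c(R = r2) = a1:
    # the combined count of two values of attribute R is publicly known.
    return sum(t in (r1, r2) for t in D) == a1

def satisfies_Q(D, Q):
    # A database instance is admissible only if it satisfies every constraint in Q.
    return all(q(D) for q in Q)

# All candidate instances over a toy single-attribute domain with n = 3 records.
I_n = list(product(["a", "b", "c"], repeat=3))
Q = [lambda D: count_constraint(D, "a", "b", 2)]
I_Q = [D for D in I_n if satisfies_Q(D, Q)]        # I_Q is a strict subset of I_n
print(len(I_n), len(I_Q))                          # 27 12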
It is also possible to use directed communication
graphs where a directed edge from Bob to Alice