belong to different privacy issues.
Claim 2: The privacy issue dominance is a
transitive relation, i.e. if X > Y and Y > Z then X > Z,
where X, Y and Z are privacy issues over a common
privacy protection or over different protections.
Proof: Let p, q and r be three protections of privacy
issues X, Y and Z respectively. X > Y => p > q in
X.Y and Y > Z => q > r in Y.Z. By Claim 1, Y > Z
=> p > r in Y.Z. Again p and r are protections of
X.Z. By definition of privacy issue dominance we
can conclude X > Z.
Valid Privacy Types: Let $b_1, \ldots, b_k$, $k \geq 2$, be $k$
privacy issues w.r.t. a common privacy protection.
Alternatively, let $b_1, \ldots, b_k$ be the $k$ privacy
protections over the domain of one or more privacy
issues (which have been joined). For the joint
protection domain $(b_1, \ldots, b_k)$ the total number of
possible privacy types is $2^k$. We can represent these
types by binary strings of length $k$, i.e. $k$-bit strings
whose values range from $0$ to $2^k - 1$.
Claim 3: Let $b_1 > \cdots > b_k$ hold for the joint
protection domain $(b_1, \ldots, b_k)$. Then there are only
$k + 1$ valid privacy types
$\{1^k, 0^1 1^{k-1}, \ldots, 0^{k-1} 1^1, 0^k\}$,
equivalently $\{2^k - 1, 2^{k-1} - 1, \ldots, 2^0 - 1\}$, out of
a total of $2^k$ possible types.
Proof: We prove the claim by induction. If $b_1 > b_2$,
the valid types are $\{11, 01, 00\}$, or $\{1^2, 01, 0^2\}$. This
proves that the claim holds for $k = 2$. Let the claim
hold for $k = i$. We will show that it holds for $k = i+1$.
The case $k = i$ indicates that $b_1 > \cdots > b_i$. Assume
$b_i > b_{i+1}$. So, in the combined privacy
$(b_1, \ldots, b_i, b_{i+1})$ obtained by joining $(b_1, \ldots, b_i)$
and $(b_i, b_{i+1})$, we have $b_1 > \cdots > b_i > b_{i+1}$.
For $k = i$, we get $i+1$ valid privacy binary strings.
For $k = i + 1$, only one bit is introduced on the right
hand side of each string. We call this bit the parity bit.
Thus, there are $i+1$ strings with parity bit 0 and $i+1$
strings with parity bit 1. Since $b_i > b_{i+1}$, a parity
bit of 0 is admissible only when bit $b_i$ is also 0, so we
can discard the strings having parity bit 0 except the
string of all 0s. Thus, the valid privacy strings are the
$i + 1$ strings having parity bit 1 plus the string of all
0s. Clearly, the combined privacy has $i + 2$ privacy
types $\{2^{i+1} - 1, 2^i - 1, \ldots, 2^1 - 1, 2^0 - 1\}$. This
completes the proof.
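Claim 3 can also be checked by brute force for small k. The following is a minimal sketch, not part of the original formalism: the function names are hypothetical, and it assumes the bit convention suggested by the example {11, 01, 00}, namely that bit 1 means "protected" and that $b_i > b_{i+1}$ rules out the pattern $b_i = 1$, $b_{i+1} = 0$.

```python
from itertools import product


def valid_types_chain(k):
    """Enumerate the k-bit strings allowed under b_1 > ... > b_k.

    Assumption: bit 1 = protected, and b_i > b_{i+1} forbids
    b_i = 1 together with b_{i+1} = 0.
    """
    valid = []
    for bits in product("01", repeat=k):
        # Reject any string containing the adjacent pattern "10".
        if all(not (bits[i] == "1" and bits[i + 1] == "0")
               for i in range(k - 1)):
            valid.append("".join(bits))
    return valid


def claimed_types(k):
    """The set {2^k - 1, ..., 2^0 - 1} written as k-bit strings."""
    return {format(2 ** j - 1, "0{}b".format(k)) for j in range(k + 1)}


if __name__ == "__main__":
    for k in range(2, 7):
        types = valid_types_chain(k)
        print(k, len(types), set(types) == claimed_types(k))
```

Under this assumed convention, the enumeration returns exactly k + 1 strings for each tested k, and they coincide with the binary forms of $2^j - 1$ for $j = 0, \ldots, k$.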
5 PRIVACY ALGEBRA AND CIS
In this section we demonstrate how privacy algebra
can be used to simplify and consolidate the privacy
issues of CIS. First we notice that successive
applications of join of elementary privacy issues are
not affected by the sequence in which the operands
are selected for the join operation. For example,
A.(B.C) = (A.C).B = C.(A.B) = B.(C.A). We will
detect the interdependencies in the form of
dominance relations between privacy issues over
privacy protections or vice versa. Let us allow all
possible communications between any two parties
out of the $n+2$ parties C, SP, $DS_1, \ldots, DS_n$, i.e. no
symmetry among the DSs is assumed. We consider
five privacy issues: I (identity), S (schema), D
(data), Q (query) and R (result). The query
distribution (Qd) issue is fixed at (No, No), i.e. the
Open-Qd issue (Table 2). For each of the five
privacy issues there are $(n+2)(n+1)$ one-way
accesses. Therefore, the total number of possible
privacy types for all five issues is $2^{5(n+2)(n+1)}$. As the
number of participants $n$ grows, it may not be practical
to implement so many privacy types. We therefore look
into reducing the allowable privacy types by considering
the dominance relations among the different privacies
(Section 4).
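To give a feel for how quickly this count grows, the short sketch below evaluates the formula for a few values of n. The function names are hypothetical and serve only as illustration; the arithmetic follows the count of ordered pairs among the n + 2 parties stated above.

```python
def one_way_accesses(n):
    """Ordered pairs among the n + 2 parties C, SP, DS_1, ..., DS_n."""
    parties = n + 2
    return parties * (parties - 1)  # equals (n + 2)(n + 1)


def unrestricted_type_count(n, issues=5):
    """Privacy types when every issue/access bit is chosen freely."""
    return 2 ** (issues * one_way_accesses(n))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, one_way_accesses(n), unrestricted_type_count(n))
```

Even with a single data source (n = 1) there are 6 one-way accesses and hence $2^{30}$ unrestricted types.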
Let us consider the privacy protection between
any two DS across different privacy issues. Note, for
this protection, the following dominance holds: I > S
> D, I > R and I > S > Q because without access to
identity one cannot have access to schema. In other
words, when identity is protected schema cannot
remain unprotected. But when identity is
unprotected schema can be protected or unprotected.
Similarly, without learning the respective schema
learning data would not be possible, but for learning
the result part the knowledge of schema need not be
essential. However, the query part would require the
knowledge of the respective schema. Coming to the
privacy types, for I > S we have 3 valid privacy
types {11, 01, 00}. Similarly, for S > D we also have
3 valid privacy types {11, 01, 00}. By Claim 3,
joining I > S with S > D gives 4 valid privacy
types {111, 011, 001, 000} = $\{2^3 - 1, 2^2 - 1, 2^1 - 1, 2^0 - 1\}$
for the dominance I > S > D, shown in Table
4. Similarly, for I > S > Q, we also have 4 valid
privacy types {111, 011, 001, 000}. Joining the
privacy types of I > S > D with those of I > S > Q
gives the set of valid privacy types for I, S, Q and D
shown in Table 5. Joining I > R, which has 3 privacy
types {11, 01, 00}, with (I>S>D).(I>S>Q) yields 11
types (Table 6).
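The counts above can be reproduced by enumerating bit assignments and filtering them by the dominance relations. This is a minimal sketch under two assumptions that are not spelled out in the text: bit 1 means "protected", and X > Y forbids X = 1 together with Y = 0. The helper `valid_types` is hypothetical, and the join is modeled simply as the union of the dominance constraints.

```python
from itertools import product


def valid_types(issues, dominances):
    """List the bit strings over `issues` that respect every X > Y.

    Assumption: 1 = protected, and X > Y rules out X = 1, Y = 0.
    """
    valid = []
    for bits in product((0, 1), repeat=len(issues)):
        level = dict(zip(issues, bits))
        if all(not (level[x] == 1 and level[y] == 0)
               for x, y in dominances):
            valid.append("".join(str(b) for b in bits))
    return valid


# Dominance chains between two data sources, as given in the text.
I_S_D = [("I", "S"), ("S", "D")]
I_S_Q = [("I", "S"), ("S", "Q")]
I_R = [("I", "R")]

isd = valid_types("ISD", I_S_D)
isdq = valid_types("ISDQ", I_S_D + I_S_Q)
all_five = valid_types("ISDQR", I_S_D + I_S_Q + I_R)

if __name__ == "__main__":
    print(len(isd), sorted(isd, reverse=True))
    print(len(isdq))
    print(len(all_five))
```

Because the join is treated here as a union of constraints, the result does not depend on the order in which the relations are combined, in line with A.(B.C) = (A.C).B. The enumeration gives 4 types for I > S > D and 11 types for all five issues.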
Thus, by applying the dominance relations we
have been able to reduce the number of privacy
options between two data sources from $2^5 = 32$ to
11. Similarly, considering the protection of a DS from
C, given the relations I > S > D and I > R, we get 7
privacy types (Table 7). Protection of C from a DS
involves only identity privacy, and has two types:
“*”. Protection DS from SP involves data and result
privacies which are independent of each other, hence