Private Web Search with Constant Round Efficiency
Bolam Kang, Sung Cheol Goh and Myungsun Kim
Department of Information Security, The University of Suwon, Hwaseong, 445-743 South Korea
Keywords:
Private web search (PWS), Secret sharing, Public-key encryption, Round efficiency.
Abstract:
Web searches are increasingly becoming essential activities because they are often the most effective and
convenient way of finding information. However, a web search can be a threat to the privacy of users be-
cause their queries may reveal sensitive information. Private web search (PWS) solutions allow users to find
information on the Internet while preserving their privacy. According to their underlying technology, exist-
ing PWS solutions can be divided into three types: Proxy-based solutions, Obfuscation-based solutions, and
Cryptography-based solutions. Among them, cryptography-based PWS (CB-PWS) systems are particularly
interesting because they provide strong privacy guarantees.
In this paper, we present a constant-round CB-PWS protocol that preserves computational efficiency
compared to known CB-PWS systems. To support these claims, we first analyze the efficiency of our protocol.
According to our analysis, our protocol requires only 3n modular exponentiations at the user side for a
group of n users. In particular, our protocol is a 5-round protocol that requires O(n) communication
complexity. In addition, our security evaluation shows that our construction is comparable to similar
solutions in terms of user privacy.
1 INTRODUCTION
A private web search (PWS) prevents web search ser-
vice providers (e.g., Google and Yahoo) from build-
ing user profiles while still allowing users to enjoy the
search functionality when performing web searches.
User profiling is usually defined as the process of
implicitly learning a user profile from search engine
queries submitted by the user. Then, to perform user
profiling, web search service providers use a user pro-
file to classify a given user into predefined user seg-
ments (e.g., by demographics or tastes) or to cap-
ture the online behavior of the user, including the
user's private interests and preferences. This raises
privacy concerns because sensitive information, such
as a user’s name and location, can be inferred from
search engine queries. Aside from the query terms,
other information such as the source IP address and
timestamp may reveal sensitive information about the
user.
Various approaches (e.g., (Elovici et al., 2006;
Saint-Jean et al., 2007; Domingo-Ferrer et al., 2009;
Lindell and Waisbard, 2010; Romero-Tris et al., 2011;
Kim and Kim, 2012)) have been proposed to address
this problem. In these systems, the main measure of
efficiency is the round complexity, and it is impor-
tant to construct constant-round PWS systems while
guaranteeing privacy. In some cases, PWS schemes
without strong privacy guarantees may suffice, and
we know how to construct such protocols, e.g., us-
ing a proxy. However, in this work, we focus on
cryptography-based PWS (CB-PWS) systems, where
strong privacy is another important design goal.
To our knowledge, known CB-PWS construc-
tions require O(n) rounds, where n is the number of
users (Castellà-Roca et al., 2009; Lindell and Waisbard, 2010; Romero-Tris et al., 2011), or significantly
restrict the length of messages to be encrypted and
hence do not lead to practical solutions to the prob-
lem (Kim and Kim, 2012). In addition, the latter
requires web search engines to implement and run
the protocols. Search engines, however, do not have
any incentives to implement costly protocols that they
cannot profit from. Unfortunately, there are no known
constructions of practical constant-round CB-PWSs.
We briefly survey what is known in this regard
about a private web search. We then summarize our
contributions and provide a high-level overview of
our construction.
1.1 Literature Review
Similar to (Balsa et al., 2012), we also believe that it
is convenient to classify existing PWS solutions into
three categories. One category is based on a proxy
server, called a proxy-based PWS solution; the second
uses various randomizing techniques and is thus
called an obfuscation-based PWS solution; and the
third is a cryptography-based PWS solution. In the
following, we describe each category in some detail,
beginning with proxy-based PWS schemes.
Kang B., Goh S. and Kim M.
Private Web Search with Constant Round Efficiency.
DOI: 10.5220/0005225602050212
In Proceedings of the 1st International Conference on Information Systems Security and Privacy (ICISSP-2015), pages 205-212
ISBN: 978-989-758-081-9
Copyright © 2015 SCITEPRESS (Science and Technology Publications, Lda.)
Proxy-based PWS: The first approach is through
the use of an anonymous proxy (e.g., (Anonymizer,
2014; Scroogle, 2014; Saint-Jean et al., 2007)). Users
can expect that anonymizers will prohibit the creation
of user profiles through query unlinkability. There are
several options in this category, from simple mech-
anisms achieving a low level of anonymity in web
searches to more reliable but more complicated systems
based on onion routing (Reed et al., 1998), such
as the Tor network (Dingledine et al., 2004). However,
the effectiveness of simple solutions is clearly
limited. In addition, as highlighted by Castellà-Roca
et al. (2009), Tor is not easy to install and configure.
Further, it is well known that the
HTTP requests over Tor can become very slow (Saint-
Jean et al., 2007). For example, it takes 10 seconds on
average to submit a query to Google even when using
paths of length 2 (the default length is 3).
Obfuscation-based PWS: Another approach to
providing privacy during web search is based
on a query obfuscation technique (e.g., (Elovici
et al., 2006; Domingo-Ferrer et al., 2009; Rebollo-Monedero
and Forné, 2010; TrackMeNot, 2014)).
Roughly speaking, a class of solutions using query
obfuscation involves blending the real queries into a
stream of fake queries so that web search engines can-
not create a correct profile. From a privacy point of
view, these obfuscation-based solutions have a criti-
cal drawback: automated queries have features that
are different from the actual queries entered by a user,
such as randomness. The authors in (Peddinti and
Saxena, 2010) demonstrated a concrete classifier that
can distinguish real queries from fake queries generated
by TrackMeNot (TrackMeNot, 2014), with a mean
misclassification rate of only approximately 0.02%.
Cryptography-based PWS: The last class of solu-
tions involves using cryptographic algorithms, such
as public-key encryption and shuffle. One of the main
advantages of CB-PWSs over other approaches is that
they provide strong privacy guarantees. In addition,
they are not affected by the misclassification issue and
are generally faster than anonymizer-based solutions.
To our knowledge, known solutions can be found
in (Castellà-Roca et al., 2009; Lindell and Waisbard,
2010; Romero-Tris et al., 2011; Kim and Kim, 2012).
Without loss of generality, we may consider Romero-Tris
et al.'s scheme as a variant of Castellà-Roca et al.'s
scheme that tolerates malicious adversaries. Thus, the
difference between the two schemes does not affect the
round complexity.
The authors in the above references utilize the ba-
sic idea that upon joining a small-sized group, each
user encrypts his search query and sends it to other
members. Then, according to a predefined order, one
user provides a shuffled list of encrypted queries to his
neighbor. Finally, the last user broadcasts its shuffled
version. After group decryption, each user obtains a
set of queries but cannot know who submitted
which query. As a result, web search engines cannot
build user profiles.
Further, we would like to note that the only approach
that comes close to achieving our requirements
in the restricted setting is the work by Kim and
Kim (2012). The authors proposed a
PWS scheme based on the notion of decomposable
encryption. However, this approach significantly re-
stricts the length of plaintexts (e.g., to 3 or 4 bits) to
be encrypted and hence does not lead to practical so-
lutions to the problem. Later, we provide a detailed
evaluation and analysis of existing CB-PWS solutions
(see Section 4).
1.2 Our Contributions
Our main contribution is the first practical protocol
that requires only O(n) modular exponentiations and a
constant number of rounds at the user side, where n
is the number of users (i.e., the group size). According
to our analysis of existing CB-PWS schemes (i.e.,
(Castellà-Roca et al., 2009; Lindell and Waisbard,
2010; Romero-Tris et al., 2011)), existing solutions
require O(n) computation complexity and O(n)
rounds under the same conditions.
Assume that a user has words or sentences for a
query and is about to submit the query to a specific
search engine. Using our protocol, the user first dis-
perses his query term into n pieces using Shamir’s
secret sharing scheme. This can be very efficiently
performed in a finite field (see Section 4.1). Then,
each user encrypts the n shares into n ciphertexts under
a public-key cryptosystem, which requires at most O(n)
modular exponentiations, and broadcasts the
encrypted shares to the other users. Finally, all
users send a list of re-masked and shuffled ciphertexts
to a group manager. (In CB-PWS solutions, a small
group of users is created and maintained by a specific
entity called a group manager. We will explain the
entity later.)
ICISSP2015-1stInternationalConferenceonInformationSystemsSecurityandPrivacy
206
Our key technical contribution is distributing a
query term instead of a private key among n users.
Roughly speaking, known CB-PWS schemes demand
that each user computes only a single ciphertext of
the query term, but a collection of all n ciphertexts
should be shuffled in a relay manner among all users.
This step is essential to provide unlinkability between
query terms and users, but it leads to O(n) round com-
plexity. Our scheme does not require users to partici-
pate in sequential shuffling.
1.3 Our Setting
We work in a setting consisting of the following three
semi-honest entities:
Users. The users are the individuals who submit
query terms to the search engine and who wish
to prevent the search engine from building their
profiles. We use u to denote a user.
The group manager. The role of the group manager,
denoted by G, is to group users so that they
execute the protocol outlined above.
We assume that the group manager has fairly pow-
erful computing resources and storage capacity
compared to the users.
The search engine. The web search service
provider, denoted by W, is the entity that provides
a list of best-matching web pages, usually along
with a short summary and/or, sometimes, parts of
the document. A typical example is Google. Note
that the search engine has no incentives to protect
users’ privacy.
We consider that an adversary is not allowed
to break current computationally secure encryption
schemes. We assume that there are at least two honest
users. However, in contrast to (Castellà-Roca et al.,
2009), we allow collusions between two entities of
the protocol.
1.4 A High-level Overview of Our
Solution
In what follows, we provide a high-level description
of our construction and the techniques used therein.
To obtain our construction, we build on the notion of
secret sharing. The basic idea of Shamir's $(t,n)$-secret
sharing is that a user can use a polynomial $f(X)$ of
degree $t-1$ to split a secret $q$ into $n$ shares: $(v_1, \ldots, v_n)$.
Then, any collection of $t$ of the $n$ distributed
shares allows one to recover the secret using the
Lagrange interpolation formula.
The starting point of the design of our solution is
Shamir's secret sharing scheme: To submit a query
term $q$, the user chooses a random polynomial $f(X)$
of degree $n-1$ whose constant term is $q$ and evaluates
$v_i = f(u_i)$ at each user's label $u_i$. Then, each user
computes encryptions $\bar{v}_i = E_{pk}(v_i)$ for all $1 \le i \le n$.
Here, $E_{pk}(\cdot)$ is a public-key encryption algorithm, and
$v_i$ is assumed to be in the message space of the
encryption algorithm. Next, each user broadcasts all
encrypted shares to other users.
After all users obtain a list of encrypted shares,
they perform re-masking and permutation of the list
and send it to a group manager without changing the
original plaintexts of the ciphertext list. Using the
homomorphic property of the underlying public-key
encryption, we can efficiently and successfully re-
encrypt ciphertexts. For the details of the homomor-
phism, see Section 2.2.
At first glance, our scheme may seem to be the
same as existing CB-PWS schemes. However, we
emphasize that in the above step of our scheme, each
user has a list of ciphertexts that are different from
all other users’ lists, and thus, shuffling ciphertexts
does not demand any interaction between neighbors.
In contrast, every user in existing solutions obtains
the same list of ciphertexts. This requires every
user to join in the sequential shuffles so that the
protocol can achieve unlinkability. We consider this
the underlying reason for the high O(n) round
complexity. However, our solution results in all users
having different lists of ciphertexts so that the users
do not need to perform shuffles sequentially.
After decrypting all received lists of ciphertexts, the
group manager uses Lagrange interpolation to recover
the users’ query terms. The group manager then sub-
mits the recovered queries to the search engine and
broadcasts the search results to the group users.
Outline of the Paper: This work is organized as
follows. Section 2 introduces cryptographic build-
ing blocks: secret sharing and public-key encryption.
Section 3 provides a detailed description of our con-
struction. In Section 4, we provide the construction’s
performance and a security analysis.
2 BACKGROUND
In this section, we review the concepts and notation
of cryptographic building blocks. We begin with a
review of secret sharing techniques and then recall
public-key cryptography and its security definitions.
Notation: For $n \in \mathbb{N}$, $[n]$ denotes the set $\{1, \ldots, n\}$.
If $A$ is a probabilistic polynomial-time (PPT) machine,
we use $a \leftarrow A$ to denote making $A$ produce an
output according to its internal randomness. In particular,
if $U$ is a set, then $r \xleftarrow{\$} U$ is used to denote
sampling from the uniform distribution on $U$.
We denote by $\lambda$ a security parameter. A function
$g : \mathbb{N} \to \mathbb{R}$ is called negligible if for every positive
polynomial $\mu(\cdot)$, there is an integer $N$ such that $g(n) < 1/\mu(n)$
for all $n > N$. We use standard asymptotic $O$
notation to denote the growth of positive functions.
2.1 Secret Sharing
A secret sharing scheme is a method of distributing
a secret, usually a key, among a group of users such
that a cooperative effort is required to determine the
secret. The goal of the scheme is to divide the secret
into $n$ shares such that any subset of $t$ shares can be
used together to solve for the value of the secret,
whereas any subset of $t-1$ or fewer shares does not
allow the secret to be reconstructed. This is called a
$(t,n)$-threshold scheme, meaning that the secret is
dispersed into $n$ pieces, with any $t$ pieces being able
to recreate the original secret.
In this work, we use Shamir's secret sharing
scheme (Shamir, 1979). Shamir's scheme is based
on polynomial interpolation: given $t$ points $(x, y)$ on the
Cartesian plane with distinct $x$-coordinates, a unique
polynomial $f(X)$ of degree at most $t-1$ is guaranteed to
exist such that $f(x) = y$ for each of the given points.
In the scheme, this polynomial $f(X)$ is of degree $t-1$,
and its coefficient of degree 0 is equal to a given secret $q$.
Overall, the full equation for $f(X)$ is given as follows,
with $q = a_0$:
$$f(X) = a_0 + a_1 X + \cdots + a_{t-1} X^{t-1}.$$
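As a minimal sketch of the $(t,n)$-threshold idea described above, the following Python fragment splits a secret with a random polynomial over a prime field and recovers it by Lagrange interpolation at $X = 0$. The modulus `P`, the function names `split` and `reconstruct`, and the use of the evaluation points $1, \ldots, n$ are assumptions made for illustration and are not taken from our construction.

```python
import secrets

P = 2**127 - 1  # a Mersenne prime, assumed here as the field modulus

def split(secret, t, n, p=P):
    """Return n shares (x, f(x)) of a random degree-(t-1) polynomial with f(0) = secret."""
    coeffs = [secret % p] + [secrets.randbelow(p) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for a in reversed(coeffs):      # Horner's rule, highest coefficient first
            y = (y * x + a) % p
        shares.append((x, y))
    return shares

def reconstruct(shares, p=P):
    """Recover f(0) from t shares by Lagrange interpolation (needs Python 3.8+)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % p
                den = den * (xi - xj) % p
        secret = (secret + yi * num * pow(den, -1, p)) % p
    return secret

shares = split(12345, t=3, n=5)
print(reconstruct(shares[:3]) == 12345)   # any 3 of the 5 shares suffice
```

Any other subset of three shares, for instance `shares[2:]`, yields the same secret, whereas two shares alone do not determine it.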
2.2 Public-key Encryption
A public-key encryption scheme $\mathcal{E} = (KG, E, D)$ consists
of the following algorithms:
KG is a randomized algorithm that takes a security
parameter $\lambda$ as input and outputs a secret key $sk$
and a public key $pk$; $pk$ defines a plaintext space
$M_{pk}$ and a ciphertext space $C_{pk}$.
E is a randomized algorithm that takes $pk$ and a
plaintext $m \in M_{pk}$ as input and outputs a ciphertext
$c \in C_{pk}$. Note, this process is usually randomized
using a randomization value $r \in R_{pk}$:
$$c = E_{pk}(m; r).$$
D takes $sk$ and $c \in C_{pk}$ as input and outputs the
plaintext $m$.
We say that an encryption scheme is correct if, for
any key pair $(pk, sk) \leftarrow KG(1^{\lambda})$ and any $m \in M_{pk}$,
$m \leftarrow D_{sk}(E_{pk}(m))$.
We say that a public-key cryptosystem $\mathcal{E} = (KG, E, D)$
is homomorphic for the binary operations
$(\oplus, \otimes)$ if for all $(pk, sk) \leftarrow KG(1^{\lambda})$:
Given the message domain $M_{pk}$, $(M_{pk}, \oplus)$ forms
a group.
Given the ciphertext range $C_{pk}$, $(C_{pk}, \otimes)$ forms a
group.
For all $c_1, c_2 \in C_{pk}$, $D_{sk}(c_1 \otimes c_2) = D_{sk}(c_1) \oplus D_{sk}(c_2)$.
As a consequence, a cryptosystem's homomorphic
property allows it to perform reencryption: given
a ciphertext $c$, anyone can create a different ciphertext
$\tilde{c}$ that encodes the same plaintext as $c$. Thus, given a
homomorphic encryption scheme $\mathcal{E}$, we can define
the reencryption algorithm as follows:
$$RE_{pk}(c; r) = c \otimes E_{pk}(m_0; r)$$
where $m_0$ is an identity message such that, for all
$m \in M_{pk}$, $m \oplus m_0 = m$. Further, if $D_{sk}(c) = m$, then
$D_{sk}(RE_{pk}(c)) = m$, too.
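To make the reencryption property concrete, the following is a minimal sketch using textbook ElGamal, which is multiplicatively homomorphic, so the identity message is 1 and the ciphertext operation is componentwise multiplication. The toy parameters `P`, `Q`, `G` and all function names are assumptions for illustration only; real deployments use far larger parameters.

```python
import secrets

# Toy parameters (assumed): P = 2Q + 1 with P, Q prime; G = 4 has order Q modulo P.
P, Q, G = 2039, 1019, 4

def keygen():
    sk = secrets.randbelow(Q - 1) + 1            # sk in [1, Q-1]
    return pow(G, sk, P), sk                     # (pk, sk)

def encrypt(pk, m):
    r = secrets.randbelow(Q - 1) + 1             # fresh randomness in [1, Q-1]
    return (pow(G, r, P), m * pow(pk, r, P) % P)

def decrypt(sk, c):
    c1, c2 = c
    return c2 * pow(c1, -sk, P) % P              # negative exponent: Python 3.8+

def multiply(c, d):
    """Ciphertext operation: componentwise product, which multiplies the plaintexts."""
    return (c[0] * d[0] % P, c[1] * d[1] % P)

def reencrypt(pk, c):
    """RE_pk(c; r): multiply c by a fresh encryption of the identity message 1."""
    return multiply(c, encrypt(pk, 1))

pk, sk = keygen()
c = encrypt(pk, 9)
c_new = reencrypt(pk, c)
print(c_new != c, decrypt(sk, c_new) == 9)       # different ciphertext, same plaintext
```

Because the fresh randomness is nonzero modulo `Q`, the first component of the ciphertext always changes, while decryption still returns the original plaintext.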
Remark 1. Herein, we will simply assume that we
have some secret sharing scheme with the key $K$ and
a public-key cryptosystem with the key $pk$. The keys
may or may not overlap, i.e., elements of one
key may also be included in the other key. For instance,
we could imagine that the cryptosystem was an ElGamal
scheme working on a group $G$ of order $q$ defined by
primes $p, q$ with $q \mid p-1$ and a generator $g$ and that the
secret sharing scheme used the same message space as
the encryption scheme, i.e., $M_K = M_{pk}$.
3 OUR CONSTRUCTION FOR
PRIVATE WEB SEARCHING
We now describe our new technique for private web
search in the semi-honest model. Our scheme enjoys
a computational efficiency that is similar to existing
CB-PWS schemes but does not require rounds in pro-
portion to the number of users.
3.1 The Proposed Scheme
Our scheme is logically divided into three phases:
Setup, Mixing query, and Submitting query. We pro-
vide an abstract description of our proposal in Fig-
ure 1.
ICISSP2015-1stInternationalConferenceonInformationSystemsSecurityandPrivacy
208
Protocol: Private Web Search
Inputs: For all $i \in [n]$, the input of $u_i$ is a query term $q_i$.
Auxiliary inputs: a security parameter $1^{\lambda}$
The protocol actions:
1. Setup. This step consists of two major tasks as follows:
Group setup: A group of n users is created, and the group information is given to the users by
the group manager.
Parameter setup: The group manager generates the parameters for a public-key encryption
scheme and a key for Shamir's secret sharing scheme and makes them publicly accessible to
the users.
2. Mixing query. In this step, the users perform the following:
(a) Splitting the query term $q_i$ into n shares using Shamir's secret sharing scheme.
(b) Encrypting the shares into a list of ciphertexts under the group manager's public key.
(c) Broadcasting the list of ciphertexts.
(d) Re-encrypting and mixing the list of ciphertexts.
(e) Sending the list of ciphertexts to the group manager.
3. Submitting query. In this step, the group manager recovers all query terms by applying the
Lagrange interpolation technique to the decrypted lists. Then, the group manager submits all
query terms to the search engine.
Figure 1: A High-level Description of Our Protocol.
3.1.1 Setup
Let E = (KG,E,D) be a semantically secure public-
key cryptosystem with the homomorphic property,
and let K be a public key specifying a list of parame-
ters for Shamir’s secret sharing scheme. It also speci-
fies how to efficiently perform polynomial evaluation
and interpolation (see Section 2.1). The Setup phase
is composed of two primary activities:
1. When the group manager G receives n requests for
private query, it responds to all n users by saying
that the group size is n. Then, the group manager
constructs a group $\{u_1, u_2, \ldots, u_n\}$ and publishes
the group information. Specifically, the group
information may include the group name, a list of
participating users, and each user's label. We call
this a group setup.
2. The group manager chooses a pair of keys $(pk, sk)$
by invoking $(pk, sk) \xleftarrow{\$} KG(1^{\lambda})$ and selects a
proper key $K$ that is compatible with the message space. It
then publishes all system parameters $(pk, K)$ for
the protocol. As a result, the users have $M_K = M_{pk}$.
We call this sub-step parameter setup.
3.1.2 Mixing Query
After obtaining the system parameters from G as a
response to its query request, each user $u_i$, $i \in [n]$, performs
the following:
1. Upon receiving the parameters, $u_i$ chooses $n-1$
random coefficients $r_{i,j} \in M_K$ and determines
$$R_i(X) = r_{i,n-1} X^{n-1} + \cdots + r_{i,1} X + q_i$$
where $q_i \in M_K$ is $u_i$'s query term.
2. Each user computes shares $v_{i,j}$ of his query $q_i$ for
every $j \in [n]$ by
$$R_i(j) = \sum_{k=1}^{n-1} r_{i,k} j^{k} + q_i.$$
We define $v_{i,j} := R_i(j)$ for all $i, j \in [n]$.
3. For each $j \in [n]$, $u_i$ computes $\bar{v}_{i,j} = E_{pk}(v_{i,j})$ and
broadcasts a list of ciphertexts $\langle i, j, \bar{v}_{i,j} \rangle_{j \in [n] \setminus \{i\}}$ to
all other users.
4. Assume that each user has the following list of
encrypted shares
$$\begin{pmatrix}
\perp & \cdots & \langle 1, i, \bar{v}_{1,i} \rangle & \cdots & \langle 1, n, \bar{v}_{1,n} \rangle \\
\langle 2, 1, \bar{v}_{2,1} \rangle & \cdots & \langle 2, i, \bar{v}_{2,i} \rangle & \cdots & \langle 2, n, \bar{v}_{2,n} \rangle \\
\vdots & & \vdots & & \vdots \\
\langle n, 1, \bar{v}_{n,1} \rangle & \cdots & \langle n, i, \bar{v}_{n,i} \rangle & \cdots & \perp
\end{pmatrix}$$
where $\perp$ indicates that a correct encrypted share
is unknown to the corresponding cell.
In the above list, because the $i$-th column is complete,
the user $u_i$ sets $\bar{V}_i = (\bar{v}_{1,i}, \ldots, \bar{v}_{n,i})$. Then,
the user computes a new version of the list
$\tilde{V}_i = (\tilde{v}_{1,i}, \ldots, \tilde{v}_{n,i})$, where
$\tilde{v}_{\ell,i} = RE_{pk}(\bar{v}_{\pi_i(\ell),i})$ for all
$\ell \in [n]$ and for a random permutation $\pi_i$ over $[n]$.
5. Each user sends $\tilde{V}_i$ to the group manager.
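The five steps above can be sketched end to end as follows. This is only a toy simulation under stated assumptions: textbook ElGamal over a tiny prime field stands in for the homomorphic cryptosystem, shares are computed modulo the same prime `P` so that $M_K = M_{pk}$, and a polynomial is resampled whenever a share equals 0 because 0 cannot be encrypted under ElGamal. The names `enc`, `dec`, `reenc`, `shares_of`, and `mixing_query` are hypothetical, and network broadcasts are replaced by shared lists.

```python
import random
import secrets

P, Q, G = 2039, 1019, 4          # toy ElGamal parameters (assumed), P = 2Q + 1

def keygen():
    sk = secrets.randbelow(Q - 1) + 1
    return pow(G, sk, P), sk

def enc(pk, m):
    r = secrets.randbelow(Q - 1) + 1
    return (pow(G, r, P), m * pow(pk, r, P) % P)

def dec(sk, c):
    return c[1] * pow(c[0], -sk, P) % P          # Python 3.8+

def reenc(pk, c):
    one = enc(pk, 1)                             # fresh encryption of the identity
    return (c[0] * one[0] % P, c[1] * one[1] % P)

def shares_of(q, n):
    """Steps 1-2: evaluate a random degree-(n-1) polynomial R with R(0) = q at 1..n."""
    while True:
        coeffs = [q] + [secrets.randbelow(P) for _ in range(n - 1)]
        v = [sum(a * pow(j, k, P) for k, a in enumerate(coeffs)) % P
             for j in range(1, n + 1)]
        if all(v):                               # toy workaround: resample on a zero share
            return v

def mixing_query(queries, pk):
    n = len(queries)
    # Step 3: row i holds user i's encrypted shares; entry j is destined for user j.
    v_bar = [[enc(pk, s) for s in shares_of(q, n)] for q in queries]
    lists = []
    for i in range(n):
        # Step 4: user i takes column i, re-encrypts it, and applies a secret permutation.
        column = [reenc(pk, v_bar[l][i]) for l in range(n)]
        random.SystemRandom().shuffle(column)
        lists.append(column)
    return lists                                 # Step 5: one list per user goes to G

pk, sk = keygen()
lists = mixing_query([11, 22, 33], pk)
M = [[dec(sk, c) for c in row] for row in lists]  # what the group manager decrypts
print(len(M), len(M[0]))
```

Note that no user waits for another user's shuffle here: each column is permuted independently, which is what removes the sequential relay of earlier schemes.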
3.1.3 Submitting Query
The group manager performs the following steps:
1. The group manager constructs the following $n \times n$
matrix $M$ by decrypting all of the ciphertexts in the $n$
vectors:
$$M = \begin{pmatrix} \tilde{V}_1 \\ \tilde{V}_2 \\ \vdots \\ \tilde{V}_n \end{pmatrix}
= \begin{pmatrix}
v_{\pi_1(1),1} & \cdots & v_{\pi_1(n),1} \\
v_{\pi_2(1),2} & \cdots & v_{\pi_2(n),2} \\
\vdots & \ddots & \vdots \\
v_{\pi_n(1),n} & \cdots & v_{\pi_n(n),n}
\end{pmatrix}$$
2. Because G has no order information for each row,
it sequentially recovers each query term by applying
the Lagrange interpolation formula to the
elements in $M$.
3. The group manager submits a set of query terms to
the search engine W. The group manager broadcasts
the output from W.
Remark 2. We have some options regarding the
recovery of query terms. The simplest option is for the
group manager to send all query terms reconstructed
using the Lagrange interpolation formula in the
Submitting Query step. This technique clearly ensures
that all of the n original query terms are correctly
determined. However, its major disadvantage is that it
incurs high computation complexity.
The alternative, albeit a controversial one, is to use
a smart engine that can check if each reconstructed
query term is in a dictionary maintained by the
engine.
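One possible reading of the recovery step and of Remark 2 is sketched below: since row $i$ of $M$ contains the shares evaluated at point $i$ in an unknown order, the group manager interpolates every combination that takes one entry from each row and then, optionally, keeps only the candidates found in a dictionary. This interpretation, the toy field modulus `P`, the sample polynomials, and the names `interpolate_at_zero` and `recover` are our assumptions for illustration. The number of combinations grows as $n^n$, which is consistent with the high computation cost noted in the remark.

```python
import itertools

P = 2039  # toy prime field (assumed)

def interpolate_at_zero(ys, p=P):
    """Lagrange interpolation at X = 0 for the points (1, ys[0]), ..., (n, ys[n-1])."""
    n, total = len(ys), 0
    for i in range(n):
        num = den = 1
        for j in range(n):
            if i != j:
                num = num * -(j + 1) % p         # factor (0 - x_j) with x_j = j + 1
                den = den * (i - j) % p          # factor (x_i - x_j)
        total = (total + ys[i] * num * pow(den, -1, p)) % p
    return total

def recover(M, dictionary, p=P):
    """Try one entry per row in all n^n ways and keep candidates found in the dictionary."""
    candidates = {interpolate_at_zero(combo, p) for combo in itertools.product(*M)}
    return candidates & dictionary

# Shares of 5 + 2X + 3X^2, 7 + X + X^2 and 100 + 4X + 6X^2 at X = 1, 2, 3,
# with each row shuffled independently.
M = [[9, 110, 10],
     [21, 13, 132],
     [166, 38, 19]]
print(sorted(recover(M, {5, 7, 100, 42})))       # [5, 7, 100]
```

In this toy instance the 27 combinations also produce spurious candidates such as 2 or 26; a dictionary that happened to contain one of them would yield a false positive, which illustrates why the dictionary-based option is delicate.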
4 ANALYSIS
This section analyzes the performance of our con-
struction in terms of the efficiency requirements.
Next, we analyze the security of our protocol by ex-
amining all behaviors of the protocol.
4.1 Performance Analysis
Our protocol is compared to other CB-PWS solutions
in terms of three efficiency measures: computation,
communication and rounds. For this purpose, we
first analyze the performance of our proposal. Then,
the schemes proposed by Castellà-Roca et al. (2009)
and Lindell and Waisbard (2010) are compared to
our proposal. We do not know how to provide a fair
comparison between the scheme proposed by (Kim
and Kim, 2012) and our scheme because our scheme
allows query terms that are n-times longer than the
maximum length of query terms allowed in (Kim and
Kim, 2012).
Parameter Selection: Before conducting compar-
isons, we first need to determine the system parame-
ters, the group size (n) and the key size.
We believe that the time users must wait to form
the group determines the group size n. Quickly cre-
ating the group makes it possible to reduce the query
delay. However, the larger the group, the more privacy
the protocol achieves. Therefore, one way that
the members can obtain strong privacy is a scheme
whereby a user joins a different small-sized group
every time the user submits a query. According
to (Castellà-Roca et al., 2009), n = 3 is the most
realistic group size in practice. The authors in
(Castellà-Roca et al., 2009) stated that, with an overwhelming
probability, a group of n = 3 users can be created in a
hundredth of a second.
Regarding the key length, we take a 1024-bit key
length, as in other solutions. Thus, a 1024-bit key can
encrypt up to 128 bytes at a time; therefore, a public-
key cryptosystem that uses a 1024-bit key length can
address queries of approximately 64 characters.
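The arithmetic behind these figures can be sketched as follows. The text does not state the character encoding, so the value of 2 bytes per character is an assumption that happens to reproduce the figure of approximately 64 characters; `max_query_chars` and `encode_query` are hypothetical helper names, and UTF-16 is chosen only because it uses 2 bytes for common characters.

```python
def max_query_chars(key_bits=1024, bytes_per_char=2):
    """Characters that fit into one plaintext block of key_bits bits."""
    return (key_bits // 8) // bytes_per_char

def encode_query(query, key_bits=1024):
    """Map a query string to an integer plaintext (hypothetical helper)."""
    data = query.encode("utf-16-be")             # 2 bytes per common character
    # Using strictly fewer bytes than the block size keeps the integer
    # below any modulus of key_bits bits.
    if len(data) >= key_bits // 8:
        raise ValueError("query does not fit into a single block")
    return int.from_bytes(data, "big")

print(max_query_chars())                          # 64
print(encode_query("private web search").bit_length() <= 1016)
```

Longer queries would have to be split over several blocks or shortened, which this sketch does not attempt.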
Computation Complexity: Now, we analyze the
computation cost for running the protocol. In gen-
eral, because it is widely accepted that modular ex-
ponentiations dominate the total computation cost of
a system, we also focus on the number of modular
exponentiations that every user must perform during
execution of each protocol. For a fair comparison, we
assume that our construction also employs an ElGa-
mal encryption scheme.
For this purpose, we denote by ME($\ell$) a modular
exponentiation modulo an $\ell$-bit integer.
The scheme in (Lindell and Waisbard, 2010) extensively
uses a double encryption by combining ElGamal
encryption and Cramer and Shoup's cryptosystem
(Cramer and Shoup, 1998). Thus, some modular
exponentiations in (Lindell and Waisbard, 2010)
should be carried out modulo a 2048-bit integer
rather than a 1024-bit integer, as in (Castellà-Roca
et al., 2009) and in our scheme. Our experimental
implementation without any optimization shows that
a 1024-bit modular exponentiation is approximately
10 times faster than a 2048-bit modular exponentiation.
Specifically, a 1024-bit modular exponentiation
takes 18 msec on average, whereas a 2048-bit modular
exponentiation takes 191 msec on average.
Table 1: Complexity Comparison at the User Side.
                          Our scheme      Castellà-Roca et al.'s scheme   Lindell and Waisbard's scheme
Computation complexity    3n · ME(1024)   (3n + 3) · ME(1024)             (n + 3) · ME(1024) + 11n · ME(2048)
Communication complexity  4n              3n − 2                          4n − 2
Round complexity          5               n + 6                           n + 6
The first row of Table 1 compares the schemes in
terms of computation complexity. We
remark that the evaluation of a polynomial of degree
less than n over a finite field at n points can be performed
using at most O(M(n) log n) field operations, where
M(n) is the number of bit operations for multiplying
two n-bit integers. Similarly, Lagrange interpolation
can be computed with asymptotically the same
cost.
We observe that our scheme obtains the lowest
computation cost from the user’s point of view. Of
course, our scheme and that in (Castellà-Roca et al.,
2009) do not provide any mechanism to protect the
honest users against malicious adversaries. How-
ever, even if we use zero-knowledge proofs to achieve
active security, we conjecture that our scheme still
has O(n) computation complexity, as in (Lindell and
Waisbard, 2010).
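To give a rough feeling for these counts, the following sketch plugs the measured averages (18 msec and 191 msec) into the user-side formulas of Table 1. It is only a back-of-the-envelope estimate under the assumption that modular exponentiations are the sole cost; the function and key names are ours.

```python
ME_1024_MS = 18    # measured average for a 1024-bit modular exponentiation
ME_2048_MS = 191   # measured average for a 2048-bit modular exponentiation

def user_cost_ms(n):
    """User-side cost in msec from the computation row of Table 1."""
    return {
        "ours": 3 * n * ME_1024_MS,
        "castella_roca": (3 * n + 3) * ME_1024_MS,
        "lindell_waisbard": (n + 3) * ME_1024_MS + 11 * n * ME_2048_MS,
    }

def rounds(n):
    """Round complexity row of Table 1."""
    return {"ours": 5, "castella_roca": n + 6, "lindell_waisbard": n + 6}

print(user_cost_ms(3))
# {'ours': 162, 'castella_roca': 216, 'lindell_waisbard': 6411}
print(rounds(3))
# {'ours': 5, 'castella_roca': 9, 'lindell_waisbard': 9}
```

For n = 3, the estimate is 162 msec for our scheme against 216 msec and 6411 msec for the other two, while the round count drops from 9 to 5.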
Remark 3. In our scheme, the group manager has
to perform somewhat heavy computations, while the
group manager in other existing CB-PWS protocols
only plays a role in maintaining a group of users.
However, in practice, because n is small (e.g., n = 3 or
4), we believe that these computations may not con-
siderably affect the whole performance.
Communication and Round Complexity: First, it
is clear that our protocol only incurs a constant num-
ber of rounds. Next, we compare the communication
complexity by counting the number of messages that
every user should send in each step of the protocols.
Table 1 summarizes the comparison results.
Our proposal consists of five rounds: (1) obtain-
ing the system parameters, (2) applying Shamir’s se-
cret sharing and encryption to a query term and broad-
casting a resulting list, (3) shuffling the resulting set
and sending it to the group manager, (4) recovering a
set of query terms and submitting to a search engine,
and, finally, (5) broadcasting the search results to the
users. All CB-PWS schemes have O(n) communica-
tion complexity. In particular, our scheme only re-
quires 5 rounds, while other CB-PWS proposals have
O(n) round complexity.
4.2 Security Analysis
Our construction achieves the following privacy re-
quirements of the users when they submit query terms
to a search engine:
Unlinkability Among Users: Let $J \subset [n]$ be a set of
semi-honest users such that $|J| = \eta < n$. For notational
convenience, suppose that $\hat{u} \in J$ has the index $\alpha \in [n]$
and that it follows all steps of the protocol. At the
end of the Mixing Query step, $\hat{u}$ receives a list of
encrypted shares from other users: $(\bar{v}_{1,\alpha}, \bar{v}_{2,\alpha}, \ldots, \bar{v}_{n,\alpha})$.
If $\hat{u}$ were able to decrypt these ciphertexts, it
would be able to fill in some parts of the matrix $M$.
Nevertheless, as long as $M$ is not completed, $\hat{u}$ cannot
know the queries of honest users. For example,
let $u_1, u_2$ be the only honest users. Denoting by $\ast$ a
decrypted share and by ? an unknown share, $\hat{u}$ would be
able to obtain at most the following matrix:
$$\begin{pmatrix}
? & \ast & \cdots & \ast \\
\ast & ? & \cdots & \ast \\
\ast & \ast & \cdots & \ast \\
\vdots & \vdots & & \vdots \\
\ast & \ast & \cdots & \ast
\end{pmatrix}$$
Hence, $\hat{u}$ is unable to obtain $q_1$ and $q_2$, and thus, our
solution preserves the users' privacy.
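The argument rests on the standard property of Shamir's scheme that a polynomial of degree $n-1$ is not determined by $n-1$ of its values. The following sketch illustrates this property on a tiny field and is not a substitute for a proof: given two of the three shares, every candidate secret is consistent with exactly one value of the missing share. The modulus `P`, the sample shares, and the function name `lagrange_eval` are assumptions for illustration.

```python
P = 11  # tiny prime field (assumed) so that all secrets can be enumerated

def lagrange_eval(points, x, p=P):
    """Evaluate at x the unique polynomial of degree < len(points) through the points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p   # Python 3.8+
    return total

# n = 3: the colluders see the shares at X = 2 and X = 3 but not the one at X = 1.
known = [(2, 7), (3, 4)]

# For each candidate secret s, compute the share at X = 1 that would make it consistent.
implied = {s: lagrange_eval([(0, s)] + known, 1) for s in range(P)}

# Every secret is possible, each with a different missing share.
print(len(implied) == P, len(set(implied.values())) == P)
```

Since each of the `P` candidate secrets is matched by exactly one possible missing share, the known shares alone give no information about which secret was shared.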
Unlinkability between the Group Manager and the
Users: Our protocol tolerates collusion between two
entities (i.e., between $\hat{u}$ and G). In the middle of the
Submitting Query phase, a compromised group manager
and semi-honest users have the valid matrix, but
they still do not know the secret permutations of honest
users. Therefore, even if the group manager
were compromised, the attacker could only randomly
guess that a “link” is correct with a probability
that is only negligibly greater than $\frac{1}{n-\eta}$.
Unlinkability between the Search Engine and the
Users: The search engine participates in the
execution of the protocol only at
the end of the last phase. For reasons similar to those
above, it cannot link a certain query to an honest user;
therefore, it cannot build profiles for honest users.
5 CONCLUDING REMARKS AND
FURTHER RESEARCH
In this work, we presented a constant-round CB-PWS
protocol for protecting users’ privacy. Our solution
can be easily deployed in current systems because it
does not require any changes on the service provider
side. However, the following work remains for further
research:
We will try to provide a more rigorous security
proof using standard techniques, such as simula-
tion or game-playing proof. For this purpose, we
first need to precisely define the notion of privacy
in this setting.
We should improve the performance of the group
manager side, especially when the size of the
group is large.
ACKNOWLEDGEMENTS
This research was supported by the MSIP(Ministry
of Science, ICT and Future Planning), Korea, under
the Specialized Co-operation between industry and
academic support program (NIPA-2014-H0808-14-
1003) supervised by the NIPA(National IT Industry
Promotion Agency). Myungsun Kim was supported
by Basic Science Research Program through the Na-
tional Research Foundation of Korea (NRF) funded
by the Ministry of Education (2014R1A1A2058377).
REFERENCES
Anonymizer (2014). Anonymizer. http://www.anonymizer.com.
Balsa, E., Troncoso, C., and Díaz, C. (2012). OB-PWS: Obfuscation-based private web search. In IEEE Symposium on Security and Privacy, pages 491–505.
Castellà-Roca, J., Viejo, A., and Herrera-Joancomartí, J. (2009). Preserving user's privacy in web search engines. Computer Communications, 32(13-14):1541–1551.
Cramer, R. and Shoup, V. (1998). A practical public key cryptosystem provably secure against adaptive chosen ciphertext attack. In Krawczyk, H., editor, Advances in Cryptology-Crypto, LNCS 1462, pages 13–25.
Dingledine, R., Mathewson, N., and Syverson, P. (2004). Tor: The second-generation onion router. In Blaze, M., editor, USENIX Security Symposium, pages 303–320.
Domingo-Ferrer, J., Solanas, A., and Castellà-Roca, J. (2009). h(k)-private information retrieval from privacy-uncooperative queryable databases. Online Information Review, 33(4):720–744.
Elovici, Y., Shapira, B., and Meshiach, A. (2006). Cluster-analysis attack against a private web solution (PRAW). Online Information Review, 30(6):624–643.
Kim, M. and Kim, J. (2012). Privacy-preserving web search. In ICUFN, pages 480–481.
Lindell, Y. and Waisbard, E. (2010). Private web search with malicious adversaries. In Atallah, M. and Hopper, N., editors, Privacy Enhancing Technologies, LNCS 6205, pages 220–235.
Peddinti, S. T. and Saxena, N. (2010). On the privacy of web search based on query obfuscation: A case study of TrackMeNot. In Atallah, M. and Hopper, N., editors, Privacy Enhancing Technologies, LNCS 6205, pages 19–37.
Rebollo-Monedero, D. and Forné, J. (2010). Optimized query forgery for private information retrieval. IEEE Transactions on Information Theory, 56(9):4631–4642.
Reed, M., Syverson, P., and Goldschlag, D. (1998). Anonymous connections and onion routing. IEEE Journal on Selected Areas in Communications, 16(4):482–494.
Romero-Tris, C., Castellà-Roca, J., and Viejo, A. (2011). Multi-party private web search with untrusted partners. In Rajarajan, M., Piper, F., Wang, H., and Kesidis, G., editors, SecureComm, pages 261–280.
Saint-Jean, F., Johnson, A., Boneh, D., and Feigenbaum, J. (2007). Private web search. In Ning, P. and Yu, T., editors, WPES, pages 84–90.
Scroogle (2014). Scroogle, http://scroogle.org.
Shamir, A. (1979). How to share a secret. Communications of the ACM, 22(11):612–613.
TrackMeNot (2014). TrackMeNot, http://mrl.nyu.edu/dhowe/trackmenot.
ICISSP2015-1stInternationalConferenceonInformationSystemsSecurityandPrivacy
212