Private Web Search with Constant Round Efﬁciency

Bolam Kang, Sung Cheol Goh and Myungsun Kim

Department of Information Security, The University of Suwon, Hwaseong, 445-743 South Korea

Keywords:

Private web search (PWS), Secret sharing, Public-key encryption, Round efﬁciency.

Abstract:

Web searches are increasingly becoming essential activities because they are often the most effective and

convenient way of ﬁnding information. However, a web search can be a threat to the privacy of users be-

cause their queries may reveal sensitive information. Private web search (PWS) solutions allow users to ﬁnd

information on the Internet while preserving their privacy. According to their underlying technology, exist-

ing PWS solutions can be divided into three types: Proxy-based solutions, Obfuscation-based solutions, and

Cryptography-based solutions. Among them, cryptography-based PWS (CB-PWS) systems are particularly

interesting because they provide strong privacy guarantees.

In this paper, we present a constant-round CB-PWS protocol that preserves computational efﬁciency com-

pared to known CB-PWS systems. To prove these arguments, we ﬁrst analyze the efﬁciency of our protocol.

According to our analysis, our protocol simply requires 3n modular exponentiations for n users. In particu-

lar, our protocol is a 5-round protocol that requires O(n) communication complexity. In addition, evaluating

the security of our protocol shows that our construction is comparable to similar solutions in terms of user

privacy.

1 INTRODUCTION

A private web search (PWS) prevents web search ser-

vice providers (e.g., Google and Yahoo) from build-

ing user proﬁles while still allowing users to enjoy the

search functionality when performing web searches.

User proﬁling is usually deﬁned as the process of

implicitly learning a user proﬁle from search engine

queries submitted by the user. Then, to perform user

proﬁling, web search service providers use a user pro-

ﬁle to classify a given user into predeﬁned user seg-

ments (e.g., by demographics or tastes) or to cap-

ture the online behavior of the user, including the

user's private interests and preferences. This raises

privacy concerns because sensitive information, such

as a user’s name and location, can be inferred from

search engine queries. Aside from the query terms,

other information such as the source IP address and

timestamp may reveal sensitive information about the

user.

Various approaches (e.g., (Elovici et al., 2006;

Saint-Jean et al., 2007; Domingo-Ferrer et al., 2009;

Lindell and Waisbard, 2010; Romero-Tris et al., 2011;

Kim and Kim, 2012)) have been proposed to address

this problem. In these systems, the main measure of

efﬁciency is the round complexity, and it is impor-

tant to construct constant-round PWS systems while

guaranteeing privacy. In some cases, PWS schemes

without strong privacy guarantees may sufﬁce, and

we know how to construct such protocols, e.g., us-

ing a proxy. However, in this work, we focus on

cryptography-based PWS (CB-PWS) systems, where

strong privacy is another important design goal.

To our knowledge, known CB-PWS construc-

tions require O(n) rounds, where n is the number of

users (Castellà-Roca et al., 2009; Lindell and Waisbard, 2010; Romero-Tris et al., 2011), or significantly

restrict the length of messages to be encrypted and

hence do not lead to practical solutions to the prob-

lem (Kim and Kim, 2012). In addition, the latter

requires web search engines to implement and run

the protocols. Search engines, however, do not have

any incentives to implement costly protocols that they

cannot proﬁt from. Unfortunately, there are no known

constructions of practical constant-round CB-PWSs.

We brieﬂy survey what is known in this regard

about a private web search. We then summarize our

contributions and provide a high-level overview of

our construction.

1.1 Literature Review

Similar to (Balsa et al., 2012), we also believe that it

is convenient to classify existing PWS solutions into


Kang B., Goh S. and Kim M..

Private Web Search with Constant Round Efﬁciency.

DOI: 10.5220/0005225602050212

In Proceedings of the 1st International Conference on Information Systems Security and Privacy (ICISSP-2015), pages 205-212

ISBN: 978-989-758-081-9

© 2015 SCITEPRESS (Science and Technology Publications, Lda.)

three categories. One category is based on a proxy

server, called a proxy-based PWS solution; the second uses various randomizing techniques and is thus

called an obfuscation-based PWS solution; and the

third is a cryptography-based PWS solution. In the

following, we provide somewhat detailed descriptions

of each system. We begin with a proxy-based PWS

scheme.

Proxy-based PWS: The ﬁrst approach is through

the use of an anonymous proxy (e.g., (Anonymizer,

2014; Scroogle, 2014; Saint-Jean et al., 2007)). Users

can expect that anonymizers will prohibit the creation

of user proﬁles through query unlinkability. There are

several options in this category, from simple mech-

anisms achieving a low level of anonymity in web

searches to more reliable but more complicated systems based on onion routing (Reed et al., 1998), such

as the Tor network (Dingledine et al., 2004). How-

ever, the effectiveness of simple solutions is clearly

limited. In addition, as highlighted by (Castellà-Roca et al., 2009), Tor cannot be installed and configured

with relative ease. Further, it is well known that the

HTTP requests over Tor can become very slow (Saint-

Jean et al., 2007). For example, it takes 10 seconds on

average to submit a query to Google even when using

paths of length 2 (the default length is 3).

Obfuscation-based PWS: Another approach to

providing privacy during web search is based

on a query obfuscation technique (e.g., (Elovici

et al., 2006; Domingo-Ferrer et al., 2009; Rebollo-Monedero and Forné, 2010; TrackMeNot, 2014)).

Roughly speaking, a class of solutions using query

obfuscation involves blending the real queries into a

stream of fake queries so that web search engines can-

not create a correct proﬁle. From a privacy point of

view, these obfuscation-based solutions have a criti-

cal drawback: automated queries have features that

are different from the actual queries entered by a user,

such as randomness. The authors in (Peddinti and

Saxena, 2010) demonstrated a concrete classiﬁer that

can distinguish real queries from fake queries generated by TrackMeNot (TrackMeNot, 2014), with a mean

misclassiﬁcation rate of only approximately 0.02%.

Cryptography-based PWS: The last class of solu-

tions involves using cryptographic algorithms, such

as public-key encryption and shufﬂe. One of the main

advantages of CB-PWSs over other approaches is that

they provide strong privacy guarantees. In addition,

they are not affected by the misclassiﬁcation issue and

are generally faster than anonymizer-based solutions.

To our knowledge, known solutions can be found

in (Castellà-Roca et al., 2009; Lindell and Waisbard, 2010; Romero-Tris et al., 2011; Kim and Kim, 2012). Without loss of generality, we may consider Romero-Tris et al.'s scheme as a variant of Castellà-Roca et al.'s scheme that tolerates malicious users. Thus, the difference between the two schemes does not affect the round complexity.

The authors in the above references utilize the ba-

sic idea that upon joining a small-sized group, each

user encrypts his search query and sends it to other

members. Then, according to a predeﬁned order, one

user provides a shufﬂed list of encrypted queries to his

neighbor. Finally, the last user broadcasts its shufﬂed

version. After group decryption, each user obtains a

set of queries, but he cannot know who submitted

which query. As a result, web search engines cannot

build user proﬁles.

Further, we would like to note that the only approach that comes close to achieving our requirements in the restricted setting is the work by Kim and Kim (Kim and Kim, 2012). The authors proposed a

PWS scheme based on the notion of decomposable

encryption. However, this approach signiﬁcantly re-

stricts the length of plaintexts (e.g., to 3 or 4 bits) to

be encrypted and hence does not lead to practical so-

lutions to the problem. Later, we provide a detailed

evaluation and analysis of existing CB-PWS solutions

(see Section 4).

1.2 Our Contributions

Our main contribution is the first practical protocol that requires only O(n) modular exponentiations and a constant number of rounds at the user side, where n is the number of users (i.e., the group size). According to our analysis of existing CB-PWS schemes (i.e., (Castellà-Roca et al., 2009; Lindell and Waisbard, 2010; Romero-Tris et al., 2011)), existing solutions require O(n) computation complexity and O(n) rounds under the same conditions.

Assume that a user has words or sentences for a

query and is about to submit the query to a speciﬁc

search engine. Using our protocol, the user ﬁrst dis-

perses his query term into n pieces using Shamir’s

secret sharing scheme. This can be very efﬁciently

performed in a ﬁnite ﬁeld (see Section 4.1). Then, en-

crypting each share into n ciphertexts under a public-

key cryptosystem, which requires at most O(n) mod-

ular exponentiations, each user broadcasts the en-

crypted shares to the corresponding users. Finally, all

users send a list of re-masked and shufﬂed ciphertexts

to a group manager. (In CB-PWS solutions, a small

group of users is created and maintained by a speciﬁc

entity called a group manager. We will explain the

entity later.)

ICISSP2015-1stInternationalConferenceonInformationSystemsSecurityandPrivacy

206

Our key technical contribution is distributing a

query term instead of a private key among n users.

Roughly speaking, known CB-PWS schemes demand

that each user computes only a single ciphertext of

the query term, but a collection of all n ciphertexts

should be shufﬂed in a relay manner among all users.

This step is essential to provide unlinkability between

query terms and users, but it leads to O(n) round com-

plexity. Our scheme does not require users to partici-

pate in sequential shufﬂing.

1.3 Our Setting

We work in a setting consisting of the following three

semi-honest entities:

• Users. The users are the individuals who submit

query terms to the search engine and who wish

to prevent the search engine from building their

proﬁles. We use u to denote a user.

• The group manager. The role of the group man-

ager, denoted by G, is to group users so that they

execute the protocol introduced above.

We assume that the group manager has fairly pow-

erful computing resources and storage capacity

compared to the users.

• The search engine. The web search service

provider, denoted by W , is the entity that provides

a list of best-matching web pages, usually along

with a short summary and/or, sometimes, parts of

the document. A typical example is Google. Note

that the search engine has no incentives to protect

users’ privacy.

We consider that an adversary is not allowed

to break current computationally secure encryption

schemes. We assume that there are at least two honest

users. However, in contrast to (Castellà-Roca et al., 2009), we allow collusions between two entities of

the protocol.

1.4 A High-level Overview of Our

Solution

In what follows, we provide a high-level description

of our construction and the techniques used therein.

To obtain our construction, we build on the notion of

secret sharing. The basic idea of Shamir’s (t,n)-secret

sharing is that a user can use a polynomial $f(X)$ of degree $t-1$ to split a secret $q$ into $n$ shares $(v_1,\ldots,v_n)$. Then, any collection of at least $t$ of the $n$ distributed shares allows one to recover the secret using the Lagrange interpolation formula.

The starting point of the design of our solution is

Shamir’s secret sharing scheme: To submit a query

term q, the user chooses a random polynomial f (X)

of degree $n-1$ whose constant term is $q$ and evaluates $v_i = f(u_i)$ at each user's label $u_i$. Then, each user computes encryptions $\bar{v}_i = E_{pk}(v_i)$ for all $1 \le i \le n$. Here, $E_{pk}(\cdot)$ is a public-key encryption algorithm, and $v_i$ is assumed to be in the message space of the encryption algorithm. Next, each user broadcasts all encrypted shares to other users.

After all users obtain a list of encrypted shares,

they perform re-masking and permutation of the list

and send it to a group manager without changing the

original plaintexts of the ciphertext list. Using the

homomorphic property of the underlying public-key

encryption, we can efﬁciently and successfully re-

encrypt ciphertexts. For the details of the homomor-

phism, see Section 2.2.

At ﬁrst glance, our scheme may seem to be the

same as existing CB-PWS schemes. However, we

emphasize that in the above step of our scheme, each

user has a list of ciphertexts that are different from

all other users’ lists, and thus, shufﬂing ciphertexts

does not demand any interaction between neighbors.

In contrast, every user in existing solutions obtains the same list of ciphertexts. This requires every user to join in the sequential shuffles so that the protocol can achieve unlinkability, and it is precisely this step that incurs the high round complexity of O(n). Our solution, however, results in all users having different lists of ciphertexts, so the users do not need to perform shuffles sequentially.

After decrypting all received lists of ciphertexts, the

group manager uses Lagrange interpolation to recover

the users’ query terms. The group manager then sub-

mits the recovered queries to the search engine and

broadcasts the search results to the group users.

Outline of the Paper: This work is organized as

follows. Section 2 introduces cryptographic build-

ing blocks: secret sharing and public-key encryption.

Section 3 provides a detailed description of our con-

struction. In Section 4, we provide the construction’s

performance and a security analysis.

2 BACKGROUND

In this section, we review the concepts and notation

of cryptographic building blocks. We begin with a

review of secret sharing techniques and then recall

public-key cryptography and its security deﬁnitions.

Notation: For n ∈ N, [n] denotes the set {1,...,n}.

If A is a probabilistic polynomial-time (PPT) ma-

chine, we use a ← A to denote making A produce an

PrivateWebSearchwithConstantRoundEfficiency

207

output according to its internal randomness. In particular, if $U$ is a set, then $r \leftarrow U$ is used to denote sampling from the uniform distribution on $U$.

We denote by λ a security parameter. A function

g : N → R is called negligible if for every positive

polynomial µ(·), there is an integer N such that g(n) <

1/µ(n) for all n > N. We use standard asymptotic O

notation to denote the growth of positive functions.

2.1 Secret Sharing

A secret sharing scheme is a method of distributing

a secret, usually a key, among a group of users, re-

quiring a cooperative effort to determine the key, so

that the plaintext can subsequently be decrypted. The

ultimate goal of the scheme is to divide the secret being hidden into n shares such that any subset of t shares can be used together to solve for the value of the secret, whereas any subset of t − 1 or fewer shares is insufficient to reconstruct it. This is defined as a (t,n)-threshold scheme, meaning that the secret is dispersed into n overall pieces, with any t pieces being able to recreate the original secret.

In this work, we use Shamir’s secret sharing

scheme (Shamir, 1979). Shamir’s scheme is based

on polynomial interpolation: given t points on the Cartesian plane with distinct x-coordinates, a unique polynomial $f(X)$ of degree at most $t-1$ is guaranteed to exist such that $f(x) = y$ for each of the given points $(x, y)$. The constant term of this polynomial (i.e., the coefficient of degree 0) is set to the given secret $q$. Overall, the full equation for $f(X)$ is given as follows, with $q = a_0$:

$$f(X) = a_0 + a_1 X + \cdots + a_{t-1} X^{t-1}$$
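The sharing and reconstruction just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the prime modulus `P`, the function names `make_shares` and `recover`, and the choice of evaluation points 1,...,n are assumptions made for the example.

```python
import secrets

# Hypothetical field modulus for illustration (the Mersenne prime 2^127 - 1).
P = 2**127 - 1

def make_shares(secret, t, n, p=P):
    """Split `secret` into n shares; any t of them determine the polynomial."""
    # a_0 is the secret, a_1..a_{t-1} are uniformly random field elements.
    coeffs = [secret % p] + [secrets.randbelow(p) for _ in range(t - 1)]

    def f(x):
        # Horner evaluation of a_0 + a_1 x + ... + a_{t-1} x^{t-1} mod p.
        acc = 0
        for a in reversed(coeffs):
            acc = (acc * x + a) % p
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]

def recover(shares, p=P):
    """Lagrange interpolation at X = 0 from (x, y) points with distinct x."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for k, (xk, _) in enumerate(shares):
            if k != j:
                num = (num * (-xk)) % p
                den = (den * (xj - xk)) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+).
        secret = (secret + yj * num * pow(den, -1, p)) % p
    return secret
```

For instance, encoding a short query as an integer smaller than `P`, splitting it with `make_shares(q, 3, 5)`, and passing any three of the resulting points to `recover` returns `q` again.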

2.2 Public-key Encryption

A public-key encryption scheme E = (KG,E,D) con-

sists of the following algorithms:

• KG is a randomized algorithm that takes a security

parameter λ as input and outputs a secret key sk

and a public key pk; pk defines a plaintext space $M_{pk}$ and a ciphertext space $C_{pk}$.

• E is a randomized algorithm that takes pk and a plaintext $m \in M_{pk}$ as input and outputs a ciphertext $c \in C_{pk}$. Note, this process is usually randomized using a randomization value $r \in R_{pk}$: $c = E_{pk}(m; r)$.

• D takes sk and $c \in C_{pk}$ as input and outputs the plaintext m.

We say that an encryption scheme is correct if, for any key-pair $(pk, sk) \leftarrow KG(1^{\lambda})$ and any $m \in M_{pk}$, $m \leftarrow D_{sk}(E_{pk}(m))$.

We say that a public-key cryptosystem E = (KG,E,D) is homomorphic for the binary operations (⊕,⊗) if for all $(pk, sk) \leftarrow KG(1^{\lambda})$:

• Given the message domain $M_{pk}$, $(M_{pk}, \oplus)$ forms a group.

• Given the ciphertext range $C_{pk}$, $(C_{pk}, \otimes)$ forms a group.

• For all $c_1, c_2 \in C_{pk}$, $D_{sk}(c_1 \otimes c_2) = D_{sk}(c_1) \oplus D_{sk}(c_2)$.

As a consequence, a cryptosystem’s homomor-

phic property allows it to perform reencryption: given

a ciphertext c, anyone can create a different ciphertext

˜c that encodes the same plaintext as c. Thus, given a

homomorphic encryption scheme E, we can deﬁne

the reencryption algorithm as follows:

$$RE_{pk}(c; r) = c \otimes E_{pk}(m_0; r)$$

where $m_0$ is an identity message such that $\forall m \in M_{pk}$, $m \oplus m_0 = m$. Further, $D_{sk}(RE_{pk}(c)) = m$, too.
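As a concrete instance of the re-encryption algorithm, the following Python sketch uses multiplicative ElGamal, where ⊗ is component-wise multiplication of ciphertexts, ⊕ is multiplication modulo p, and the identity message is 1. The toy parameters (p = 467, q = 233, g = 4) and all function names are assumptions chosen for readability; they are far too small for real use and are not taken from the paper.

```python
import secrets

# Toy parameters (assumption): p = 2q + 1 with p, q prime; g = 4 has order q.
P_MOD, Q_ORD, G = 467, 233, 4

def keygen():
    """Return (pk, sk) with pk = g^sk mod p."""
    sk = secrets.randbelow(Q_ORD - 1) + 1
    return pow(G, sk, P_MOD), sk

def encrypt(pk, m, r=None):
    """E_pk(m; r) = (g^r, m * pk^r); r is sampled if not supplied."""
    r = secrets.randbelow(Q_ORD - 1) + 1 if r is None else r
    return (pow(G, r, P_MOD), m * pow(pk, r, P_MOD) % P_MOD)

def decrypt(sk, c):
    """D_sk(c1, c2) = c2 * c1^(-sk) mod p (negative exponent needs Python 3.8+)."""
    c1, c2 = c
    return c2 * pow(c1, -sk, P_MOD) % P_MOD

def otimes(c, d):
    """Ciphertext operation: component-wise multiplication modulo p."""
    return (c[0] * d[0] % P_MOD, c[1] * d[1] % P_MOD)

def reencrypt(pk, c, r=None):
    """RE_pk(c; r) = c (x) E_pk(1; r): a fresh-looking ciphertext of the same plaintext."""
    return otimes(c, encrypt(pk, 1, r))
```

Re-encrypting changes both ciphertext components, yet decryption still yields the original message, which is what allows each user to re-mask a list without knowing the secret key.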

Remark 1. Herein, we will simply assume that we

have some secret sharing scheme with the key K and

a public-key cryptosystem with the key pk. The keys

may or may not overlap, whereby elements from one

key are also included in another key. For instance,

we could imagine that the cryptosystem was an ElGa-

mal scheme working on a group G of order q from

primes p,q|p − 1 and a generator g and that the se-

cret sharing scheme used the same message space as

the encryption scheme, i.e., $M_K = M_{pk}$.

3 OUR CONSTRUCTION FOR

PRIVATE WEB SEARCHING

We now describe our new technique for private web

search in the semi-honest model. Our scheme enjoys

a computational efﬁciency that is similar to existing

CB-PWS schemes but does not require rounds in pro-

portion to the number of users.

3.1 The Proposed Scheme

Our scheme is logically divided into three phases:

Setup, Mixing query, and Submitting query. We pro-

vide an abstract description of our proposal in Fig-

ure 1.

ICISSP2015-1stInternationalConferenceonInformationSystemsSecurityandPrivacy

208

Protocol: Private Web Search

– Inputs: For all $i \in [n]$, the input of $u_i$ is a query term $q_i$.

– Auxiliary inputs: a security parameter $1^{\lambda}$.

– The protocol actions:

1. Setup. This step consists of two major tasks as follows:

• Group setup: A group of n users is created, and the group information is given to the users by

the group manager.

• Parameter setup. The group manager generates the parameters for a public-key encryption

scheme and a key for Shamir’s secret sharing scheme and makes them publicly accessible to

the users.

2. Mixing query. In this step, the users perform the following:

(a) Splitting the query term $q_i$ into n shares using Shamir's secret sharing scheme.

(b) Encrypting the shares into a list of ciphertexts under the group manager’s public key.

(d) Re-encrypting and mixing the list of ciphertexts.

(e) Sending the list of ciphertexts to the group manager.

3. Submitting query. In this step, the group manager recovers all query terms by applying the

Lagrange interpolation technique to the decrypted lists. Then, the group manager submits all

query terms to the search engine.

Figure 1: A High-level Description of Our Protocol.

3.1.1 Setup

Let E = (KG,E,D) be a semantically secure public-

key cryptosystem with the homomorphic property,

and let K be a public key specifying a list of parame-

ters for Shamir’s secret sharing scheme. It also speci-

ﬁes how to efﬁciently perform polynomial evaluation

and interpolation (see Section 2.1). The Setup phase

is composed of two primary activities:

1. When the group manager G receives n requests for

private query, it responds to all n users by saying

that the group size is n. Then, the group manager

constructs a group $\{u_1,\ldots,u_n\}$ and publishes

the group information. Speciﬁcally, the group in-

formation may include the group name, a list of

participating users, and each user’s label. We call

this a group setup.

2. The group manager chooses a pair of keys (pk,sk) by invoking $(pk, sk) \leftarrow KG(1^{\lambda})$ and selects a proper key K with the message compatibility. It then publishes all system parameters (pk,K) for the protocol. As a result, the users have $M_K = M_{pk}$. We call this sub-step parameter setup.

3.1.2 Mixing Query

After obtaining the system parameters from G as a

response to its query request, each user $u_i$, $i \in [n]$, performs the following:

1. Upon receiving the parameters, $u_i$ chooses $n-1$ random coefficients $r_{i,j} \in M_{pk}$ and determines

$$R_i(X) = r_{i,n-1} X^{n-1} + \cdots + r_{i,1} X + q_i$$

where $q_i \in M_{pk}$ is $u_i$'s query term.

2. Each user computes shares $v_{i,j}$ of his query $q_i$ for every $j \in [n]$ by

$$R_i(j) = \sum_{k=1}^{n-1} r_{i,k}\, j^{k} + q_i$$

We define $v_{i,j} := R_i(j)$ for all $i, j \in [n]$.

3. For each $j \in [n]$, $u_i$ computes $\bar{v}_{i,j} = E_{pk}(v_{i,j})$ and broadcasts a list of ciphertexts $\langle i, j, \bar{v}_{i,j} \rangle_{j \in [n] \setminus \{i\}}$ to all other users.

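4. Assume that each user $u_i$ has the following list of encrypted shares

$$\begin{pmatrix}
\bot & \cdots & \langle 1, i, \bar{v}_{1,i} \rangle & \cdots & \langle 1, n, \bar{v}_{1,n} \rangle \\
\langle 2, 1, \bar{v}_{2,1} \rangle & \cdots & \langle 2, i, \bar{v}_{2,i} \rangle & \cdots & \langle 2, n, \bar{v}_{2,n} \rangle \\
\vdots & & \vdots & & \vdots \\
\langle n, 1, \bar{v}_{n,1} \rangle & \cdots & \langle n, i, \bar{v}_{n,i} \rangle & \cdots & \bot
\end{pmatrix}$$

where ⊥ indicates that a correct encrypted share is unknown to the corresponding cell.

In the above list, because the i-th column is complete, the user $u_i$ collects the vector $(\bar{v}_{1,i}, \ldots, \bar{v}_{n,i})$. Then, the user computes a new version of the list, $(\tilde{v}_{1,i}, \ldots, \tilde{v}_{n,i})$, where $\tilde{v}_{\ell,i} = RE_{pk}(\bar{v}_{\pi_i(\ell),i})$ for all $\ell \in [n]$ and for a random permutation $\pi_i$ over $[n]$.

5. Each user sends the new list $(\tilde{v}_{1,i}, \ldots, \tilde{v}_{n,i})$ to the group manager.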
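The data flow of the Mixing Query steps can be sketched as follows. To keep the sketch short, encryption and re-encryption are omitted and a plain shuffle stands in for "re-encrypt and permute"; the prime `P`, the function names, and the use of `random.Random` (for reproducibility; a real deployment would need a cryptographic source of randomness) are assumptions of this illustration, not part of the paper.

```python
import random

# Hypothetical prime field standing in for the message space (2^61 - 1 is prime).
P = 2**61 - 1

def shares_of(query, n, rng):
    """Steps 1-2: v_{i,j} = R_i(j) for j = 1..n, where deg R_i = n-1 and R_i(0) = query."""
    coeffs = [query] + [rng.randrange(P) for _ in range(n - 1)]
    return [sum(c * pow(j, k, P) for k, c in enumerate(coeffs)) % P
            for j in range(1, n + 1)]

def mix(queries, rng):
    """Steps 3-5: user j ends up with column j of the share matrix and shuffles it."""
    n = len(queries)
    v = [shares_of(q, n, rng) for q in queries]   # v[i][j]: share of user i+1 at label j+1
    columns = []
    for j in range(n):
        col = [v[i][j] for i in range(n)]          # the complete column held by user j+1
        rng.shuffle(col)                           # stands in for re-encrypt + permute
        columns.append(col)
    return v, columns
```

Each returned column contains exactly one share per query, all evaluated at the same label, in an order known only to the user who shuffled it; this is why no sequential shuffling among users is needed.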

3.1.3 Submitting Query

The group manager performs the following steps:

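1. The group manager constructs the following $n \times n$ matrix M by decrypting all of the ciphertexts in the n vectors:

$$M = \begin{pmatrix}
v_{\pi_1(1),1} & \cdots & v_{\pi_1(n),1} \\
v_{\pi_2(1),2} & \cdots & v_{\pi_2(n),2} \\
\vdots & \ddots & \vdots \\
v_{\pi_n(1),n} & \cdots & v_{\pi_n(n),n}
\end{pmatrix}$$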

2. Because G has no order information for each row,

it sequentially recovers each query term by apply-

ing the Lagrange interpolation formula to each el-

ement in M.

3. The group manager submits a set of query terms to

the search engine W. The group manager broad-

casts the output from W .
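For reference, the interpolation used in step 2 can be written explicitly. The formula below is the standard Lagrange interpolation evaluated at X = 0, stated under the assumption that the user labels are the integers 1,...,n (as in the Mixing Query step) and that all arithmetic is carried out in the field underlying the message space:

```latex
q_i \;=\; R_i(0) \;=\; \sum_{j=1}^{n} v_{i,j} \prod_{\substack{k=1 \\ k \neq j}}^{n} \frac{k}{k - j}
```

Because the rows of M are permuted independently, the group manager does not know in advance which n entries (one per row) belong to the same index i.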

Remark 2. We have some options regarding the recovery of query terms. The simplest option is for the group manager to send all query terms reconstructed using the Lagrange interpolation formula in the Submitting Query step. This technique clearly ensures that all of the n original query terms are correctly determined. However, its major disadvantage is that it incurs high computation complexity.

The alternative, albeit a controversial one, is to use a smart engine that can check if each reconstructed query term is in a dictionary maintained by the engine.
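One way to read the two options in this remark is as an exhaustive search: the group manager picks one share from each permuted row, interpolates, and optionally keeps only candidates found in a dictionary. The Python sketch below illustrates that reading under stated assumptions (shares already decrypted, labels 1,...,n, a hypothetical prime `P`, and hypothetical function names); it is not the authors' algorithm, and its n^n combinations illustrate why the remark speaks of high computation complexity.

```python
import itertools

# Hypothetical prime field modulus for illustration.
P = 2**61 - 1

def interpolate_at_zero(ys):
    """Lagrange interpolation at X = 0 for the points (1, ys[0]), ..., (n, ys[n-1])."""
    n, total = len(ys), 0
    for j in range(1, n + 1):
        num = den = 1
        for k in range(1, n + 1):
            if k != j:
                num = num * k % P
                den = den * (k - j) % P
        total = (total + ys[j - 1] * num * pow(den, -1, P)) % P
    return total

def recover_queries(columns, dictionary):
    """Try every choice of one share per label; keep candidates found in the dictionary."""
    found = set()
    for combo in itertools.product(*columns):   # n^n combinations for n lists of n shares
        candidate = interpolate_at_zero(list(combo))
        if candidate in dictionary:
            found.add(candidate)
    return found
```

Here `columns[j]` is the shuffled list of decrypted shares evaluated at label j+1. With the small groups considered in the paper (n = 3 or 4) the search space is only 27 or 256 combinations.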

4 ANALYSIS

This section analyzes the performance of our con-

struction in terms of the efﬁciency requirements.

Next, we analyze the security of our protocol by ex-

amining all behaviors of the protocol.

4.1 Performance Analysis

Our protocol is compared to other CB-PWS solutions

in terms of three efﬁciency measures: computation,

communication and rounds. For this purpose, we

ﬁrst analyze the performance of our proposal. Then,

the schemes proposed by (Castellà-Roca et al., 2009)

and (Lindell and Waisbard, 2010) are compared to

our proposal. We do not know how to provide a fair

comparison between the scheme proposed by (Kim

and Kim, 2012) and our scheme because our scheme

allows query terms that are n-times longer than the

maximum length of query terms allowed in (Kim and

Kim, 2012).

Parameter Selection: Before conducting compar-

isons, we ﬁrst need to determine the system parame-

ters, the group size (n) and the key size.

We believe that the time users must wait to form

the group determines the group size n. Quickly cre-

ating the group makes it possible to reduce the query

delay. However, the larger the group, the more privacy the protocol achieves. Therefore, one way that

the members can obtain strong privacy is a scheme

whereby a user joins a different small-sized group

every time the user submits a query. According

to (Castellà-Roca et al., 2009), n = 3 is the most realistic group size in practice. The authors in (Castellà-Roca et al., 2009) stated that, with an overwhelming

probability, a group of n = 3 users can be created in a

hundredth of a second.

Regarding the key length, we take a 1024-bit key

length, as in other solutions. Thus, a 1024-bit key can

encrypt up to 128 bytes at a time; therefore, a public-

key cryptosystem that uses a 1024-bit key length can

address queries of approximately 64 characters.

Computation Complexity: Now, we analyze the

computation cost for running the protocol. In gen-

eral, because it is widely accepted that modular ex-

ponentiations dominate the total computation cost of

a system, we also focus on the number of modular

exponentiations that every user must perform during

execution of each protocol. For a fair comparison, we

assume that our construction also employs an ElGa-

mal encryption scheme.

For this purpose, we denote by ME(ℓ) a modular exponentiation modulo an ℓ-bit integer.

The scheme in (Lindell and Waisbard, 2010) exten-

sively uses a double encryption by combining ElGa-

mal encryption and Cramer and Shoup’s cryptosys-

tem (Cramer and Shoup, 1998). Thus, some modu-

lar exponentiations in (Lindell and Waisbard, 2010)

should be carried out modulo a 2048-bit integer rather than a 1024-bit integer, as in (Castellà-Roca et al., 2009) and in our scheme. Our experimental

implementation without any optimization shows that

a 1024-bit modular exponentiation is approximately

10 times faster than a 2048-bit modular exponentiation. Specifically, a 1024-bit modular exponentiation takes 18 msec on average, whereas a 2048-bit modular exponentiation takes 191 msec on average.

Table 1: Complexity Comparison at the User Side.

| | Our scheme | Castellà-Roca et al.'s scheme | Lindell and Waisbard's scheme |
|---|---|---|---|
| Computation complexity | 3n · ME(1024) | (3n + 3) · ME(1024) | (n + 3) · ME(1024) + 11n · ME(2048) |
| Communication complexity | 4n | 3n − 2 | 4n − 2 |
| Round complexity | 5 | n + 6 | n + 6 |
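To make the computation row of Table 1 concrete, the reported average timings can be plugged into the three expressions. The short Python sketch below does this arithmetic; the function name and dictionary keys are hypothetical, and the estimate assumes, as the text does, that modular exponentiations dominate the user-side cost.

```python
ME_1024_MS = 18    # average time of a 1024-bit modular exponentiation reported in the text
ME_2048_MS = 191   # average time of a 2048-bit modular exponentiation reported in the text

def user_cost_ms(n):
    """Estimated per-user exponentiation time (msec) for each scheme, per Table 1."""
    return {
        "ours": 3 * n * ME_1024_MS,
        "castella_roca": (3 * n + 3) * ME_1024_MS,
        "lindell_waisbard": (n + 3) * ME_1024_MS + 11 * n * ME_2048_MS,
    }
```

For the group size n = 3 discussed above, this gives roughly 162 msec for our scheme, 216 msec for Castellà-Roca et al.'s scheme, and 6411 msec for Lindell and Waisbard's scheme.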

We provide a comparison of schemes in the ﬁrst

row of Table 1 using the computation complexity. We

remark that the evaluation of a polynomial of degree less than n over a finite field at n points can be performed using at most O(M(n) log n) field operations, where M(n) is the number of field operations for multiplying two polynomials of degree less than n. Similarly, Lagrange interpolation can be computed with asymptotically the same cost.

We observe that our scheme obtains the lowest

computation cost from the user’s point of view. Of

course, our scheme and that in (Castellà-Roca et al.,

2009) do not provide any mechanism to protect the

honest users against malicious adversaries. How-

ever, even if we use zero-knowledge proofs to achieve

active security, we conjecture that our scheme still

has O(n) computation complexity, as in (Lindell and

Waisbard, 2010).

Remark 3. In our scheme, the group manager has

to perform somewhat heavy computations, while the

group manager in other existing CB-PWS protocols

only plays a role in maintaining a group of users.

However, in practice, because n is small (e.g., n = 3 or

4), we believe that these computations may not con-

siderably affect the whole performance.

Communication and Round Complexity: First, it

is clear that our protocol only incurs a constant num-

ber of rounds. Next, we compare the communication

complexity by counting the number of messages that

every user should send in each step of the protocols.

Table 1 summarizes the comparison results.

Our proposal consists of ﬁve rounds: (1) obtain-

ing the system parameters, (2) applying Shamir’s se-

cret sharing and encryption to a query term and broad-

casting a resulting list, (3) shufﬂing the resulting set

and sending it to the group manager, (4) recovering a

set of query terms and submitting to a search engine,

and, ﬁnally, (5) broadcasting the search results to the

users. All CB-PWS schemes have O(n) communica-

tion complexity. In particular, our scheme only re-

quires 5 rounds, while other CB-PWS proposals have

O(n) round complexity.

4.2 Security Analysis

Our construction achieves the following privacy re-

quirements of the users when they submit query terms

to a search engine:

Unlinkability Among Users: Let J ⊂ [n] be a set of

semi-honest users such that |J| = η < n. For notational convenience, suppose that $\hat{u} \in J$ is of the index $\alpha \in [n]$ but that it follows all steps of the protocol. At the end of the Mixing Query step, $\hat{u}$ receives a list of encrypted shares from other users: $(\bar{v}_{1,\alpha}, \bar{v}_{2,\alpha}, \ldots, \bar{v}_{n,\alpha})$. Even if $\hat{u}$ were able to decrypt these ciphertexts, it would only be able to fill in some parts of the matrix M. As long as M is not completed, $\hat{u}$ cannot know the queries of honest users. For example, let $u_1$ and $u_2$ be the only honest users. Denoting by $\checkmark$ a decrypted share, $\hat{u}$ would be able to obtain at most the following matrix:

$$\begin{pmatrix}
? & \checkmark & \checkmark & \cdots & \checkmark \\
\checkmark & ? & \checkmark & \cdots & \checkmark \\
\checkmark & \checkmark & \checkmark & \cdots & \checkmark
\end{pmatrix}$$

Hence, $\hat{u}$ is unable to obtain $q_1$ and $q_2$, and thus, our solution preserves the users' privacy.

Unlinkability between the Group Manager and the

Users: Our protocol allows two entities (i.e., $\hat{u}$ and G) to collude. In the middle of the Submitting Query phase, a compromised group manager and semi-honest users have the valid matrix, but they still do not know the secret permutations of honest users. Therefore, even if the group manager were compromised, the attacker could only randomly guess that a “link” is correct with a probability that is only negligibly greater than $\frac{1}{n-\eta}$.

Unlinkability between the Search Engine and the

Users: In our protocol, the search engine participates in the execution only at the end of the last phase. For reasons similar to those

above, it cannot link a certain query to an honest user;

therefore, it cannot build proﬁles for honest users.

5 CONCLUDING REMARKS AND

FURTHER RESEARCH

In this work, we presented a constant-round CB-PWS

protocol for protecting users’ privacy. Our solution

can be easily deployed in current systems because it

does not require any changes on the service provider

side. However, the following work remains for further

research:

• We will try to provide a more rigorous security

proof using standard techniques, such as simula-

tion or game-playing proof. For this purpose, we

ﬁrst need to precisely deﬁne the notion of privacy

in this setting.

• We should improve the performance of the group

manager side, especially when the size of the

group is large.

ACKNOWLEDGEMENTS

This research was supported by the MSIP(Ministry

of Science, ICT and Future Planning), Korea, under

the Specialized Co-operation between industry and

academic support program (NIPA-2014-H0808-14-

1003) supervised by the NIPA(National IT Industry

Promotion Agency). Myungsun Kim was supported

by Basic Science Research Program through the Na-

tional Research Foundation of Korea (NRF) funded

by the Ministry of Education (2014R1A1A2058377).

REFERENCES

Anonymizer (2014). Anonymizer. http://www.anonymizer.com.

Balsa, E., Troncoso, C., and Díaz, C. (2012). OB-PWS: Obfuscation-based private web search. In IEEE Symposium on Security and Privacy, pages 491–505.

Castellà-Roca, J., Viejo, A., and Herrera-Joancomartí, J. (2009). Preserving user's privacy in web search engines. Computer Communications, 32(13-14):1541–1551.

Cramer, R. and Shoup, V. (1998). A practical public key

cryptosystem provably secure against adaptive chosen

ciphertext attack. In Krawczyk, H., editor, Advances

in Cryptology-Crypto, LNCS 1462, pages 13–25.

Dingledine, R., Mathewson, N., and Syverson, P. (2004).

Tor: The second-generation onion router. In Blaze,

M., editor, USENIX Security Symposium, pages 303–

320.

Domingo-Ferrer, J., Solanas, A., and Castellà-Roca, J. (2009). h(k)-private information retrieval from privacy-uncooperative queryable databases. Online Information Review, 33(4):720–744.

Elovici, Y., Shapira, B., and Meshiach, A. (2006).

Cluster-analysis attack against a private web solution

(PRAW). Online Information Review, 30(6):624–643.

Kim, M. and Kim, J. (2012). Privacy-preserving web

search. In ICUFN, pages 480–481.

Lindell, Y. and Waisbard, E. (2010). Private web search

with malicious adversaries. In Atallah, M. and Hop-

per, N., editors, Privacy Enhancing Technologies,

LNCS 6205, pages 220–235.

Peddinti, S. T. and Saxena, N. (2010). On the privacy of web

search based on query obfuscation: A case study of

TrackMeNot. In Atallah, M. and Hopper, N., editors,

Privacy Enhancing Technologies, LNCS 6205, pages

19–37.

Rebollo-Monedero, D. and Forné, J. (2010). Optimized query forgery for private information retrieval. IEEE Transactions on Information Theory, 56(9):4631–4642.

Reed, M., Syverson, P., and Goldschlag, D. (1998). Anony-

mous connections and onion routing. IEEE Journal on

Selected Areas in Communications, 16(4):482–494.

Romero-Tris, C., Castellà-Roca, J., and Viejo, A. (2011). Multi-party private web search with untrusted partners. In Rajarajan, M., Piper, F., Wang, H., and Kesidis, G., editors, SecureComm, pages 261–280.

Saint-Jean, F., Johnson, A., Boneh, D., and Feigenbaum, J.

(2007). Private web search. In Ning, P. and Yu, T.,

editors, WPES, pages 84–90.

Scroogle (2014). Scroogle, http://scroogle.org.

Shamir, A. (1979). How to share a secret. Communications

of the ACM, 22(11):612–613.

TrackMeNot (2014). TrackMeNot, http://mrl.nyu.edu/dhowe/trackmenot.

ICISSP2015-1stInternationalConferenceonInformationSystemsSecurityandPrivacy

212