three categories. The first is based on a proxy
server and is called a proxy-based PWS solution; the
second uses various randomizing techniques and is thus
called an obfuscation-based PWS solution; and the
third is a cryptography-based PWS solution. In the
following, we describe each category in some detail,
beginning with proxy-based PWS.
Proxy-based PWS: The first approach is through
the use of an anonymous proxy (e.g., (Anonymizer,
2014; Scroogle, 2014; Saint-Jean et al., 2007)). Users
can expect that anonymizers will prevent the creation
of user profiles through query unlinkability. There are
several options in this category, from simple mechanisms
that achieve a low level of anonymity in web
searches to more reliable but more complicated systems
based on onion routing (Reed et al., 1998), such
as the Tor network (Dingledine et al., 2004). However,
the effectiveness of the simple solutions is clearly
limited. In addition, as highlighted by Castellà-Roca
et al. (2009), Tor is not easy to install and configure.
Further, it is well known that
HTTP requests over Tor can become very slow (Saint-Jean
et al., 2007). For example, it takes 10 seconds on
average to submit a query to Google even when using
paths of length 2 (the default length is 3).
Obfuscation-based PWS: Another approach to
providing privacy during web search is based
on a query obfuscation technique (e.g., (Elovici
et al., 2006; Domingo-Ferrer et al., 2009; Rebollo-Monedero
and Forné, 2010; TrackMeNot, 2014)).
Roughly speaking, solutions in this class blend the
real queries into a stream of fake queries so that web
search engines cannot create a correct profile. From a
privacy point of view, these obfuscation-based solutions
have a critical drawback: automated queries have features,
such as their randomness, that differ from those of the
actual queries entered by a user. Peddinti and
Saxena (2010) demonstrated a concrete classifier that
can distinguish real queries from fake queries generated
by TrackMeNot (TrackMeNot, 2014), with a mean
misclassification rate of only approximately 0.02%.
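As a rough illustration of the obfuscation idea (not TrackMeNot's actual algorithm), the following sketch hides one real query at a random position in a stream of decoy queries. The names `DECOY_VOCAB`, `make_decoy`, and `blend_queries`, as well as the decoy count, are hypothetical and chosen only for this example.

```python
import random

# Hypothetical decoy vocabulary; a real tool would draw on a richer source.
DECOY_VOCAB = ["weather", "recipes", "football", "stocks", "movies", "travel"]

def make_decoy(rng, words=2):
    """Build one fake query from randomly chosen vocabulary words."""
    return " ".join(rng.sample(DECOY_VOCAB, words))

def blend_queries(real_query, n_decoys, rng):
    """Return the real query hidden at a random position among decoys."""
    stream = [make_decoy(rng) for _ in range(n_decoys)]
    stream.insert(rng.randrange(n_decoys + 1), real_query)
    return stream

rng = random.Random(0)  # seeded only to make the example repeatable
stream = blend_queries("symptoms of flu", 4, rng)
```

The decoys here are uniformly random word pairs, which is exactly the kind of statistical regularity that a classifier can learn to separate from human-entered queries.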
Cryptography-based PWS: The last class of solutions
uses cryptographic algorithms, such
as public-key encryption and shuffling. One of the main
advantages of cryptography-based PWS (CB-PWS) solutions
over other approaches is that
they provide strong privacy guarantees. In addition,
they are not affected by the query-classification issue
described above and
are generally faster than anonymizer-based solutions.
To our knowledge, the known solutions are those
of (Castellà-Roca et al., 2009; Lindell and Waisbard,
2010; Romero-Tris et al., 2011; Kim and Kim, 2012).
Without loss of generality, we may consider Romero-Tris
et al.’s scheme as a malicious-model variant of Castellà-Roca
et al.’s scheme. Thus, the difference between
the two schemes does not affect the round complexity.
The schemes in the above references share the basic
idea that, upon joining a small group, each
user encrypts his search query and sends it to the other
members. Then, according to a predefined order, each
user in turn passes a shuffled list of encrypted queries to his
neighbor. Finally, the last user broadcasts his shuffled
version. After group decryption, each user obtains the
set of queries but cannot know who submitted
which query. As a result, web search engines cannot
build user profiles.
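The encrypt, shuffle, and decrypt flow described above can be sketched with a toy ElGamal instance. This is a single-process simulation under stated assumptions, not the construction of any cited scheme: the modulus `P`, the base `G`, and all function names are hypothetical, the parameters are not a vetted group choice, and one process holds every secret key only so that the example can run end to end.

```python
import secrets

P = 2**521 - 1  # a Mersenne prime; toy parameter, not a vetted group choice
G = 3           # assumed base element, chosen for illustration only

def keygen():
    """Return one user's (secret key, public key) pair."""
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)

def encrypt(m, y):
    """ElGamal encryption of integer m under the group public key y."""
    r = secrets.randbelow(P - 2) + 1
    return pow(G, r, P), m * pow(y, r, P) % P

def remask(ct, y):
    """Re-randomize a ciphertext without changing its plaintext."""
    c1, c2 = ct
    r = secrets.randbelow(P - 2) + 1
    return c1 * pow(G, r, P) % P, c2 * pow(y, r, P) % P

def shuffle_round(cts, y):
    """One user's turn: re-mask every ciphertext, then permute the list."""
    out = [remask(ct, y) for ct in cts]
    secrets.SystemRandom().shuffle(out)
    return out

def group_decrypt(ct, secret_keys):
    """Every user strips his own key share; the plaintext remains."""
    c1, c2 = ct
    for x in secret_keys:
        c2 = c2 * pow(pow(c1, x, P), P - 2, P) % P  # divide by c1^x
    return c2

def run(queries):
    keys = [keygen() for _ in queries]       # one key pair per user
    y = 1
    for _, pub in keys:                      # group key = product of public keys
        y = y * pub % P
    cts = [encrypt(int.from_bytes(q.encode(), "big"), y) for q in queries]
    for _ in keys:                           # n sequential shuffle rounds
        cts = shuffle_round(cts, y)
    sks = [x for x, _ in keys]
    out = []
    for ct in cts:
        m = group_decrypt(ct, sks)
        out.append(m.to_bytes((m.bit_length() + 7) // 8, "big").decode())
    return out

result = run(["flu symptoms", "cheap flights", "python tutorial"])
```

The loop over `keys` in `run` makes the cost visible: every user must wait for his predecessor, so the number of rounds grows linearly with the group size.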
Further, we note that the only approach
that comes close to achieving our requirements
in the restricted setting is the work of
Kim and Kim (2012). The authors proposed a
PWS scheme based on the notion of decomposable
encryption. However, this approach significantly restricts
the length of the plaintexts to be encrypted (e.g., to 3 or 4 bits)
and hence does not lead to practical solutions
to the problem. Later, we provide a detailed
evaluation and analysis of existing CB-PWS solutions
(see Section 4).
1.2 Our Contributions
Our main contribution is the first practical protocol
that requires only O(n) modular exponentiations and a
constant number of rounds at the user side, where n
is the number of users (i.e., the group size). According
to our analysis of existing CB-PWS schemes (i.e.,
(Castellà-Roca et al., 2009; Lindell and Waisbard,
2010; Romero-Tris et al., 2011)), existing solutions
require O(n) computational complexity and O(n)
rounds under the same conditions.
Assume that a user has a query consisting of words or sentences
and is about to submit it to a specific
search engine. Using our protocol, the user first splits
his query into n shares using Shamir’s
secret sharing scheme. This can be performed very efficiently
in a finite field (see Section 4.1). Then, each user
encrypts his shares into n ciphertexts under a public-key
cryptosystem, which requires at most O(n) modular
exponentiations, and sends the encrypted shares to the
corresponding users. Finally, all
users send a list of re-masked and shuffled ciphertexts
to a group manager. (In CB-PWS solutions, a small
group of users is created and maintained by a specific
entity called a group manager. We explain this
entity later.)
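The first step, splitting a query with Shamir's secret sharing, can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the field modulus `PRIME`, the function names, and the choice of threshold `t = n` are assumptions, since the text above does not fix them, and the query must encode to an integer smaller than `PRIME`.

```python
import secrets

PRIME = 2**127 - 1  # field modulus (a Mersenne prime); assumed for illustration

def split(secret, n, t):
    """Split `secret` into n shares; any t of them reconstruct it."""
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation at the point x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # Division in the field is multiplication by the modular inverse.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

query = int.from_bytes(b"flu remedies", "big")  # 12 bytes, fits in the field
shares = split(query, n=5, t=5)                 # assumed threshold t = n
recovered = reconstruct(shares)
```

Splitting costs only field additions and multiplications, with no modular exponentiations, which is consistent with the claim that this step is very efficient.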