three categories. One category is based on a proxy
server, called a proxy-based PWS solution; the second
uses various randomizing techniques and is thus
called an obfuscation-based PWS solution; and the
third is a cryptography-based PWS solution. In the
following, we describe each category in turn,
beginning with proxy-based PWS schemes.
Proxy-based PWS: The first approach is through
the use of an anonymous proxy (e.g., (Anonymizer,
2014; Scroogle, 2014; Saint-Jean et al., 2007)). Users
can expect anonymizers to prevent the creation
of user profiles by making queries unlinkable. There are
several options in this category, from simple mech-
anisms achieving a low level of anonymity in web
searches to more reliable but more complicated systems
based on onion routing (Reed et al., 1998), such
as the Tor network (Dingledine et al., 2004). How-
ever, the effectiveness of simple solutions is clearly
limited. In addition, as highlighted by (Castellà-Roca
et al., 2009), Tor is not easy to install and
configure. Further, it is well known that
HTTP requests over Tor can become very slow (Saint-
Jean et al., 2007). For example, it takes 10 seconds on
average to submit a query to Google even when using
paths of length 2 (the default length is 3).
Obfuscation-based PWS: Another approach to
providing privacy during web search is based
on a query obfuscation technique (e.g., (Elovici
et al., 2006; Domingo-Ferrer et al., 2009; Rebollo-Monedero
and Forné, 2010; TracMeNot, 2014)).
Roughly speaking, solutions of this class blend
the real queries into a stream of fake queries so that
web search engines cannot build an accurate profile.
From a privacy point of view, these obfuscation-based
solutions have a critical drawback: automated queries
have features, such as randomness, that differ from
those of queries actually entered by a user.
The authors in (Peddinti and
Saxena, 2010) demonstrated a concrete classifier that
can distinguish real queries from fake queries gener-
ated by TrackMeNot (TracMeNot, 2014), with a mean
misclassification rate of only approximately 0.02%.
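To make the blending idea concrete, the following minimal Python sketch hides one real query among k decoys. It is an illustration only and does not reproduce the mechanism of TrackMeNot or of any cited system; the function name obfuscate, the decoy pool, and the parameter k are hypothetical.

```python
import random


def obfuscate(real_query, decoy_pool, k, rng=None):
    """Return a shuffled batch holding the real query and k decoys.

    decoy_pool is a hypothetical list of fake queries; a deployed tool
    would need decoys that are statistically close to real user queries.
    """
    rng = rng or random.Random()
    decoys = rng.sample(decoy_pool, k)   # k distinct fake queries
    batch = decoys + [real_query]
    rng.shuffle(batch)                   # hide the position of the real query
    return batch


if __name__ == "__main__":
    pool = ["news", "weather", "sports", "recipes", "movies"]
    for q in obfuscate("private web search", pool, 3):
        print(q)
```

Uniformly sampled decoys of this kind carry exactly the sort of distinguishing features that the classifier described above exploits, which is why the sketch should not be read as a privacy-preserving design.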
Cryptography-based PWS: The last class of solutions
uses cryptographic techniques, such as public-key
encryption and shuffling. One of the main
advantages of CB-PWSs over other approaches is that
they provide strong privacy guarantees. In addition,
they are not affected by the misclassification issue and
are generally faster than anonymizer-based solutions.
To our knowledge, known solutions can be found
in (Castellà-Roca et al., 2009; Lindell and Waisbard,
2010; Romero-Tris et al., 2011; Kim and Kim, 2012).
Without loss of generality, we may consider Romero-Tris
et al.’s scheme as a variant of Castellà-Roca et al.’s
scheme that is secure against malicious users. Thus, the
difference between the two schemes does not affect the
round complexity.
The authors in the above references utilize the ba-
sic idea that upon joining a small-sized group, each
user encrypts his search query and sends it to other
members. Then, following a predefined order, each
user passes a shuffled list of encrypted queries to his
neighbor. Finally, the last user broadcasts his shuffled
version. After group decryption, each user obtains the
set of queries but cannot tell who submitted
which query. As a result, web search engines cannot
build user profiles.
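The encrypt, shuffle, and group-decrypt flow described above can be sketched as follows. This is a toy, single-process simulation under stated assumptions: it uses textbook ElGamal with a joint public key as a stand-in for the public-key cryptosystem, a small Mersenne prime modulus that is not a secure parameter choice, and hypothetical helper names (keygen, encrypt, remask, group_decrypt, run_round). It is not the construction of any of the cited schemes.

```python
import random
import secrets

P = 2**127 - 1   # toy prime modulus (assumption; not a secure group choice)
G = 3            # toy base element (assumption)


def keygen():
    """Return one user's (secret key, public key share)."""
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)


def encrypt(m, y):
    """ElGamal-encrypt integer m under the joint public key y."""
    r = secrets.randbelow(P - 2) + 1
    return pow(G, r, P), (m * pow(y, r, P)) % P


def remask(ct, y):
    """Re-randomize a ciphertext without changing its plaintext."""
    r = secrets.randbelow(P - 2) + 1
    c1, c2 = ct
    return (c1 * pow(G, r, P)) % P, (c2 * pow(y, r, P)) % P


def group_decrypt(ct, secret_keys):
    """Strip every user's key share; c1^(P-1-x) equals c1^(-x) mod P."""
    c1, c2 = ct
    for x in secret_keys:
        c2 = (c2 * pow(c1, P - 1 - x, P)) % P
    return c2


def run_round(queries):
    """Simulate one round with one query per user; return the mixed queries."""
    keys = [keygen() for _ in queries]
    y = 1
    for _, pk in keys:                      # joint key = product of shares
        y = (y * pk) % P
    cts = [encrypt(int.from_bytes(q.encode(), "big"), y) for q in queries]
    for _ in keys:                          # each user in turn re-masks and shuffles
        cts = [remask(ct, y) for ct in cts]
        random.SystemRandom().shuffle(cts)
    sks = [sk for sk, _ in keys]
    out = []
    for ct in cts:
        m = group_decrypt(ct, sks)
        out.append(m.to_bytes((m.bit_length() + 7) // 8, "big").decode())
    return out


if __name__ == "__main__":
    print(run_round(["tor", "vpn", "onion"]))
```

Because the users shuffle one after another, the number of sequential steps in this sketch grows with the group size, which illustrates the round complexity discussed in the surrounding text. Queries must encode to an integer smaller than P, so the toy handles only short strings.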
Further, we note that the only approach
that comes close to achieving our requirements
in the restricted setting is the work by Kim and
Kim (Kim and Kim, 2012). The authors proposed a
PWS scheme based on the notion of decomposable
encryption. However, this approach significantly re-
stricts the length of plaintexts (e.g., to 3 or 4 bits) to
be encrypted and hence does not lead to practical so-
lutions to the problem. Later, we provide a detailed
evaluation and analysis of existing CB-PWS solutions
(see Section 4).
1.2 Our Contributions
Our main contribution is the first practical protocol
that requires only O(n) modular exponentiations and a
constant number of rounds at the user side, where n
is the number of users (i.e., the group size). Accord-
ing to our analysis of existing CB-PWS schemes (i.e.,
(Castellà-Roca et al., 2009; Lindell and Waisbard,
2010; Romero-Tris et al., 2011)), existing solutions
require O(n) computational complexity and O(n)
rounds under the same conditions.
Assume that a user has a query consisting of words
or sentences and is about to submit it to a specific
search engine. Using our protocol, the user first
disperses his query term into n pieces using Shamir’s
secret sharing scheme. This can be performed very
efficiently in a finite field (see Section 4.1). Then,
each user encrypts the shares into n ciphertexts under
a public-key cryptosystem, which requires at most O(n)
modular exponentiations, and sends the encrypted
shares to the corresponding users. Finally, all
users send a list of re-masked and shuffled ciphertexts
to a group manager. (In CB-PWS solutions, a small
group of users is created and maintained by a specific
entity called a group manager, which we explain
later.)
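The first step, dispersing a query into n pieces with Shamir’s secret sharing, can be sketched as follows. The sketch is only an illustration under stated assumptions: the prime field modulus, the threshold parameter t, and the function names split and reconstruct are hypothetical choices and are not taken from the protocol description, which is given later in the paper.

```python
import secrets

P = 2**127 - 1   # prime field modulus (assumption; the secret must be < P)


def split(secret, n, t):
    """Split an integer secret into n shares; any t of them reconstruct it."""
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):          # Horner evaluation mod P
            y = (y * x + c) % P
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % P
                den = (den * (xi - xj)) % P
        # pow(den, P - 2, P) is the modular inverse of den (Fermat).
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret


if __name__ == "__main__":
    query = int.from_bytes(b"private search", "big")
    shares = split(query, n=5, t=3)
    recovered = reconstruct(shares[:3])
    print(recovered.to_bytes(14, "big"))
```

Splitting costs only field additions and multiplications, with no modular exponentiations, which is consistent with the remark that this step is very efficient in a finite field.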
ICISSP 2015 - 1st International Conference on Information Systems Security and Privacy