Secure Keyword Search over Data Archives in the Cloud

Performance and Security Aspects of Searchable Encryption

Christian Neuhaus, Frank Feinbube, Daniel Janusz and Andreas Polze

Operating Systems and Middleware Group, Hasso Plattner Institut, Potsdam, Germany

DBIS Group, Humboldt-Universität zu Berlin, Berlin, Germany

Keywords:

Keyword Search, Searchable Encryption, Cloud Computing, Performance, Security, Data Conﬁdentiality.

Abstract:

Encryption can protect the conﬁdentiality of data stored in the cloud, but also prevents search. To solve this

problem, searchable encryption schemes have been proposed that allow keyword search over document collec-

tions. To investigate the practical value of such schemes and the tradeoff between security, functionality and

performance, we integrate a prototypical implementation of a searchable encryption scheme into a document-

oriented database. We give an overview of the performance benchmarking results of the approach and analyze

the threats to data conﬁdentiality and corresponding countermeasures.

1 INTRODUCTION

Data sharing is essential to companies and govern-

ment services alike. A striking example is healthcare,

where doctor’s ofﬁces, hospitals, and administrative

institutions rely on exchange of information to offer

the best level of care while optimizing cost efficiency

at the same time. For scenarios like these, moving

to the cloud solves many problems: The scalability of

the cloud makes resources simple to provision and ex-

tend and centralization of data improves the availabil-

ity and helps to avoid information silos. Most impor-

tantly, cloud computing helps to reduce IT expenses

– an effect most welcome in healthcare. However,

concerns about data conﬁdentiality still prevent the

use of cloud in many domains. Traditional encryp-

tion is of little help: It effectively protects the pri-

vacy of data but also prevents important operations

such as search. While efﬁcient encryption schemes

that enable generic operations on encrypted data are

still elusive, searching over encrypted data is possi-

ble: searchable encryption schemes enable keyword

search without disclosing these keywords to the cloud

operator. The query performance of such schemes

cannot match unencrypted operation, but may well

be suitable for areas of application such as electronic

health records, where data has to be retrieved from a

cloud-hosted archive.

In this paper, we investigate the trade-off between

performance and security when using searchable en-

cryption schemes for data archives in the cloud. We

make the following contributions:

1) We report on an architecture for integrating Goh's Z-IDX searchable encryption scheme (Goh et al., 2003) into a database and present a practical implementation using MongoDB as an example.

2) We discuss the overhead introduced by encrypted

search and provide benchmark results on the performance of using Goh's scheme for encrypted search

with MongoDB. These benchmarks give a meaning-

ful account of the practical performance and usability

of searchable encryption in databases.

3) We give a qualitative assessment of the security im-

plications of using searchable encryption schemes for

cloud data archives using attack-defense-tree models.

This assessment is generic to searchable encryption

and not limited to Goh’s scheme. We also discuss

mitigation strategies to manage threats by statistical

inference attacks.

2 RELATED WORK

In this section, we review related work in the ﬁeld of

private database outsourcing and searchable encryp-

tion.

Private Database Outsourcing. Outsourcing pri-

vate data to a remote database inherently bears the

risk of exposure of conﬁdential information – through

eavesdropping, data theft or malfunctions. The key


Neuhaus C., Feinbube F., Janusz D. and Polze A..

Secure Keyword Search over Data Archives in the Cloud - Performance and Security Aspects of Searchable Encryption.

DOI: 10.5220/0005428704270438

In Proceedings of the 5th International Conference on Cloud Computing and Services Science (CLOSER-2015), pages 427-438

ISBN: 978-989-758-104-5

© 2015 SCITEPRESS (Science and Technology Publications, Lda.)

challenge is to protect private data from being ac-

cessed by potentially untrusted cloud providers. In

this paper, we focus on technologies that protect data

within a database. While encryption is the basic

mechanism to ensure data conﬁdentiality, providing

an efﬁcient database-as-a-service that can run on en-

crypted data is a challenging task. Several recent ap-

proaches try to offer solutions for outsourcing private

databases.

TrustedDB (Bajaj and Sion, 2011) and Cipherbase

(Arasu et al., 2013) offer SQL database functionali-

ties that support the full generality of a database sys-

tem while providing high data conﬁdentiality. Both

systems use a secure co-processor for performing op-

erations on the cloud server side. The drawbacks of

such approaches are at least twofold: On one hand all

clients have to trust the secure co-processor with their

private data. On the other hand it is not clear how

the co-processor scales up in the number of clients

connected and the amount of data processed. In

CryptDB (Popa et al., 2011), the authors apply a layered approach that makes use of several cryptographic

schemes, where values are only decrypted to a level

that is required to complete the query.

Another class of approaches aims at processing

encrypted data directly without any decryption. To

this day, there are no efﬁcient encryption schemes

that enable fully encrypted operation of a DBMS

(database management system) without loss of func-

tionality. An early approach for keyword search on

encrypted data was published by (Song et al., 2000).

An approach for securely processing exact match

queries on database cells was proposed by (Yang

et al., 2006). However, most DBMS rely on other

common operations such as range and aggregation

queries as well as updates, inserts and deletes. Existing approaches cannot efficiently process these types of queries on encrypted data. A common solution

is to reduce data conﬁdentiality to gain query efﬁ-

ciency, e.g., order preserving encryption (Agrawal

et al., 2004) may reveal the underlying data order.

Most methods can be attacked by statistical analy-

sis of the encrypted data or the access patterns. An-

other solution is to lose some query efﬁciency in or-

der to guarantee conﬁdentiality. While (fully) homo-

morphic encryption schemes as proposed by Rivest et

al. (Rivest et al., 1978) in fact allow the encrypted

computation of any circuit (and therefore computer

program), current constructions (see (Gentry, 2009;

Van Dijk et al., 2010)) are still too inefficient for practical application.

Traditional databases use indices for efﬁcient

record search. The existing methods have been

adapted to work on encrypted data (Shmueli et al.,

2005). Private indexing (Hore et al., 2004) enables an

untrusted server to evaluate obfuscated range queries

with limited information leakage. Wang et al. (Wang et al., 2011) propose a secure B+-Tree to efficiently

process any type of database query. Encrypted index-

based approaches do not rely on any trusted third par-

ties or trusted hardware. This seems to be a practical

and secure method to search in encrypted databases.

The next section discusses searchable encryption.

Searchable Encryption. Searchable encryption

schemes provide one or many cryptographic data

structures called search indices that allow encrypted

keyword search for exact keyword matches. A good

overview of searchable encryption schemes is given

in (Kamara and Lauter, 2010). In general, search-

able encryption schemes do not replace symmetric

encryption schemes but provide the search capability

through additional data structures – the index (see

ﬁgure 1). To provide keyword search on data, a list

Figure 1: Searchable Encryption: Conceptual View. (Diagram labels: Plaintext Keywords, Plaintext Data, Secret Key, Search Keyword, Create Index, Create Token, Token, Encrypted Index, Encrypted Data, other means of encryption, match.)

of keywords is extracted from the plaintext. This

keyword list is used to create a secure index using

a dedicated secret key for the searchable encryption

scheme. The data is encrypted separately (usually with symmetric block ciphers such as AES) and stored alongside the encrypted index in a remote location. To search over the uploaded data in the remote location, a search token is generated for a search keyword using the secret key. This token is sent to the remote server. The remote server can now

determine whether the token matches a search index

without being able to learn the keyword.

Searchable encryption schemes can be divided into Symmetric Searchable Encryption

(SSE) and Asymmetric Searchable Encryption (ASE)

schemes. SSE schemes use the same secret key both

for insertion and searching of data. In general, they

are more efﬁcient than ASE schemes and provide

stronger security guarantees. They were ﬁrst intro-

duced by (Song et al., 2000), where the authors pro-

vide a linear search capability over ciphertext – one

of the few schemes that does not make use of indices.

To speed up search, the scheme of Goh (Goh et al.,

CLOSER2015-5thInternationalConferenceonCloudComputingandServicesScience

428

2003) uses indices that are created separately for every searchable data item, which enables efficient update. Improved search time can be achieved by using an

inverted index (see e.g. (Curtmola et al., 2006)). A

scheme that enables both efﬁcient updates and opti-

mal search time (linear in the number of documents

that contain the keyword) is offered in a recent con-

struction by Kamara et al. (Kamara et al., 2012).

In contrast, ASE schemes use different keys

for insertion and searching of data, which provides

greater ﬂexibility. However, the constructions of ASE

schemes are generally less efﬁcient than those of SSE

schemes and provide weaker security guarantees. The

ﬁrst construction was given by Boneh et al. (Boneh

et al., 2004) and is based on elliptic curve cryptography. Improved constructions were introduced in (Abdalla et al., 2005). Unfortunately, ASE schemes are generally

susceptible to dictionary attacks against search tokens

(see (Byun et al., 2006)). This limits the application

of ASE schemes to use cases where keywords are either hard to guess or the keyword attack is tolerable.

3 THE Z-IDX SCHEME

For our implementation, we chose the Z-IDX search-

able encryption scheme by (Goh et al., 2003). As a

symmetric scheme, it is not susceptible to dictionary

attacks on search tokens like ASE schemes (see sec-

tion 2). This scheme offers several desirable proper-

ties:

• Maturity. While the ﬁeld of research in search-

able encryption schemes is rather young, Goh’s

scheme was one of the earliest proposed. In con-

trast to more recent constructions, the scheme

passed several years without the discovery of se-

curity ﬂaws.

• Per-document Indexing. The Z-IDX scheme

creates per-document indices. This property fa-

cilitates integration into existing DBMS.

• Standard Cryptographic Primitives. The cryp-

tographic mechanisms used by Z-IDX are widely

available in software libraries for most platforms.

In this section, we give an overview of Bloom filters and how they are used to construct Goh's Z-IDX

scheme.

3.1 Bloom Filters

The encrypted indices in Z-IDX make use of space-efficient probabilistic data structures called bloom filters (Bloom, 1970). For a set of elements $E = \{e_1, \dots, e_n\}$, the set membership information is encoded in a bit array of length $l$. A number of $r$ hash functions $h_1, \dots, h_r$ is selected that map every element of $E$ to a number in $[1; l]$. To store the set membership of an element $e_i$ in the filter, its hash value from every hash function $h_1, \dots, h_r$ is calculated. These hash values $h_1(e_i), \dots, h_r(e_i)$ are used as index positions in the filter bit array. At every referenced index position, the bit in the array is set to 1. To test the set membership for an element $e_i$, the procedure is similar: All hash values $h_1(e_i), \dots, h_r(e_i)$ are calculated and used as index positions in the filter bit array. If all positions in the array pointed to by the hash function values are set to 1, the element is assumed to be in the set.

Figure 2: Example of a Bloom Filter with a 32-bit array. (Diagram labels: Hash Functions, Bloom Filter Bit Array with positions 0 to 31.)

This design of bloom filters can produce false positives: If all corresponding array positions of an element $e_i$ were set to 1 by insertion of other elements, the bloom filter produces a false positive for $e_i$. On the other hand, false negatives do not occur. The false

positive rate of a bloom ﬁlter can be inﬂuenced by ad-

justing the size of the bit array and the number of hash

functions used.
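The insert and membership test described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: the class name `BloomFilter`, the use of salted SHA-256 to derive the r hash functions, and the 0-based index positions (instead of the range [1; l] used in the text) are assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Bit array of length l with r hash functions (illustrative only)."""

    def __init__(self, length, num_hashes):
        self.length = length
        self.num_hashes = num_hashes
        self.bits = bytearray(length)  # one byte per bit, for readability

    def _positions(self, element):
        # Assumption: derive r index positions by salting SHA-256
        # with the number of the hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{element}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.length

    def add(self, element):
        # Set the bit at every referenced index position to 1.
        for pos in self._positions(element):
            self.bits[pos] = 1

    def might_contain(self, element):
        # True if all referenced bits are 1 (may be a false positive).
        return all(self.bits[pos] for pos in self._positions(element))


bloom = BloomFilter(length=32, num_hashes=3)
for word in ("cloud", "archive"):
    bloom.add(word)
print(bloom.might_contain("cloud"))
```

An inserted element is always reported as present, while an element that was never inserted may still be reported as present if other insertions happened to set all of its positions.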

3.2 Goh's Secure Indexes

Based on bloom ﬁlters, Goh constructs a secure index

scheme called Z-IDX (Goh et al., 2003) that allows

encrypted keyword search. Like similar schemes, it

does not replace other means of encryption but pro-

vides additional data structures for its functionality

(see ﬁgure 1). The scheme builds upon the abstrac-

tion of documents, which are the units of granularity

for keyword search. Every document $d \in D$ can contain a number of keywords $w \in W$ and is identified by a unique ID $i \in I$. Authorized clients hold a secret key $K_{priv}$. The scheme is then defined by the following operations:

• Keygen($s$) outputs a secret key $K_{priv}$, where $s$ is a variable security parameter.

• Trapdoor($K_{priv}$, $w$) outputs a trapdoor $T_w$ for keyword $w$ using the secret key $K_{priv}$.

• BuildIndex($d$, $K_{priv}$) outputs an encrypted index for document $d$ using the secret key $K_{priv}$.

• SearchIndex($T_w$, $d$) takes a trapdoor for keyword $w$ and tests for a match in the index of document $d$. If $d$ contains $w$ it outputs 1 and 0 otherwise.

Additionally, a pseudorandom function $f : \{0,1\}^* \times \{0,1\}^s \to \{0,1\}^s$ is required. For a precise formal definition, e.g. with respect to bit string lengths, please see the original publication (Goh et al., 2003). We also omit the step of adding blinding bits to the filter. To set up the scheme, a security parameter $s$, a number of hash functions $r$ and an index size $m$ are chosen (for choice of $m$ and $r$, see section 5.1). Then, a secret key is generated by the Keygen operation, so that $K_{priv} = (k_1, \dots, k_r) \leftarrow \{0,1\}^{sr}$.

To create a search index for a document $d$ with a set of keywords $W_d = \{w_1, \dots, w_n\} \subset W$, the BuildIndex operation first creates an empty bloom filter with a bit array of length $m$. First, a trapdoor $T_w$ is calculated for every keyword $w$ using the Trapdoor operation, so that $T_w = (t_1, \dots, t_r) = (f(w, k_1), \dots, f(w, k_r))$. This results in a set of trapdoors $T_{w_1}, \dots, T_{w_n}$. Using this set of trapdoors and the id of the document $d$, the set of codewords $C_{w_1}, \dots, C_{w_n}$ is calculated. For every trapdoor $T_w$ the codeword $C_w$ is calculated so that $C_w = (c_1, \dots, c_r) = (f(id, t_1), \dots, f(id, t_r))$. Then, the filter of the document is populated by setting every bit position to 1 that is referenced by the codewords: For every codeword $C_w$, the bits at positions $c_1, \dots, c_r$ are set to 1 (see figure 2).

To query a collection of documents for a keyword $w$, the trapdoor $T_w$ is calculated using the Trapdoor operation and sent to the server. To test whether a document contains the keyword, the server calculates the codeword $C_w$ using the trapdoor $T_w$ and the document id. Using the codeword $C_w$, the server tests whether all bits at positions $c_1, \dots, c_r$ are set to 1. If so, the document is sent back to the client as a match. This process is applied to all documents in the collection. In the Z-IDX scheme, a separate index data structure is created per document. This accounts for a search time that is linear in the number of documents, but facilitates the administration of secure indices, as they can be created and stored alongside the documents. This

makes the addition or removal of documents a simple

operation.

From a more technical perspective, the above

steps can be described and implemented using a keyed

hash function such as HMAC-SHA1 (Krawczyk et al.,

1997), which is also used in our implementation (see

section 4). In a ﬁrst step, a keyword w is hashed with

all elements of the secret key $k_1, \dots, k_r$ to obtain the

trapdoor vector. The elements of the trapdoor vec-

tor are each hashed again together with the document

identiﬁer id to obtain the codeword vector. Each of

the codeword vector elements is used as an index po-

sition to set a bit in the bloom ﬁlter bit array to 1.
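The two hashing steps described above can be sketched with Python's standard `hmac` module. This is a simplified illustration under stated assumptions, not the authors' C++ implementation: the function names, the 20-byte key elements, the reduction of each HMAC output to a bit position via modulo `m`, and the one-byte-per-bit array are choices made for the example, and the blinding step is omitted as in the text.

```python
import hashlib
import hmac
import os


def prf(key, message):
    """Keyed hash f, instantiated with HMAC-SHA1 as named in the text."""
    return hmac.new(key, message, hashlib.sha1).digest()


def keygen(r, s_bytes=20):
    """Secret key K_priv = (k_1, ..., k_r); the element length is an assumption."""
    return [os.urandom(s_bytes) for _ in range(r)]


def trapdoor(key, word):
    """T_w = (f(w, k_1), ..., f(w, k_r))."""
    return [prf(k, word.encode()) for k in key]


def codeword(doc_id, trap, m):
    """C_w = (f(id, t_1), ..., f(id, t_r)), reduced to positions in [0, m)."""
    return [int.from_bytes(prf(t, doc_id.encode()), "big") % m for t in trap]


def build_index(doc_id, words, key, m):
    """Populate the per-document filter with the codewords of all keywords."""
    bits = bytearray(m)  # one byte per bit, for readability
    for w in words:
        for pos in codeword(doc_id, trapdoor(key, w), m):
            bits[pos] = 1
    return bits


def search_index(trap, doc_id, index):
    """Server-side test: needs only the trapdoor, never the keyword or key."""
    return all(index[pos] for pos in codeword(doc_id, trap, len(index)))


key = keygen(r=10)
index = build_index("doc-1", ["enron", "invoice"], key, m=1024)
print(search_index(trapdoor(key, "enron"), "doc-1", index))
```

Because the document id enters the second hashing step, the same keyword sets different bit positions in different documents, so identical keywords are not recognizable by comparing filters.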

4 SEARCHABLE ENCRYPTION IN MongoDB

To evaluate the practical usability of searchable en-

cryption, we integrated the Z-IDX scheme into the

document-oriented database MongoDB. In this section, we explain why we chose MongoDB, present

the architecture of our prototype, introduce new com-

mands for secure keyword search and present imple-

mentation details.

4.1 Selection of a Database System

While the searchable encryption scheme Z-IDX can

be used standalone, its practical usability and perfor-

mance under realistic workloads can only be evalu-

ated if the scheme is used in conjunction with other

means of encryption and data handling. To do this, we

integrated Z-IDX into an existing DBMS. The choice

of a DBMS has to correspond to the basic properties

of the Z-IDX scheme – exact keyword matching as a

search mechanism and the notion of documents as the

basic units of granularity for searching.

To select a DBMS, we considered different

database paradigms: The most widespread type of

databases are relational databases – most of them

supporting the Structured Query Language (SQL).

This type of database has a long development history

and offers features such as transactional security, clus-

tering techniques and master-slave-conﬁgurations to

ensure availability. The SQL language allows detailed

queries, where speciﬁc data ﬁelds in the database

can be selected and returned according to complex criteria based on structure or data field values and logical combinations thereof. The expressive power of

the SQL language goes far beyond simple keyword

search. It is therefore difﬁcult to isolate queries that

can make use of searchable encryption. Additionally,

the ﬁne-grained selection of data ﬁelds does not cor-

respond well to the document-oriented approach of

searchable encryption.

Besides relational databases, other database types

have been developed under the umbrella term of

NoSQL databases. Key-value stores (e.g. Redis, Dynamo) take a very minimalistic approach: They omit

many of the features known from SQL databases in

favor of simplicity and performance. However, the

complexity of data structures is severely limited. This

makes storing documents and associated indices dif-

ﬁcult or impossible.

CLOSER2015-5thInternationalConferenceonCloudComputingandServicesScience

430

Document-oriented databases, however, are well-

suited to implement searchable encryption. As

the name suggests, data is organized in containers

called documents as opposed to tables in relational

databases. These documents are the units of granu-

larity for search operations and can contain complex

data structures without adhering to a schema deﬁni-

tion. As this approach corresponds well to the prop-

erties of searchable encryption schemes, we chose to

add searchable encryption features to the open-source

document-oriented database MongoDB.

Floratou et al. (Floratou et al., 2012) compare

MongoDB to Microsoft SQL Server. They show that

relational databases may have better query perfor-

mance. However, MongoDB is optimized for stor-

ing data records across multiple machines and offers

efﬁcient load balancing, which makes it more suit-

able for cloud-based applications. Furthermore, the

increasing use of NoSQL databases in real-world applications leads to an increasing demand for enhancing these databases with privacy technologies such as

searchable encryption.

4.2 Extended MongoDB Commands

As MongoDB is a document-oriented database, a doc-

ument is the primary unit of abstraction for organiza-

tion of data. A document does not adhere to a ﬁxed

schema and can store data in a JSON-like fashion

of ﬁeld-value pairs. Like in JSON, documents sup-

port a number of primitive data types (e.g. integer,

String) and data structures like arrays. All of these

data structures can be nested. In addition to standard

JSON, MongoDB can also store binary data in ﬁelds.

Documents in MongoDB are stored in collections, which, in turn, are stored in a database. The

prime commands for data handling in collections are

insert() and find(). They accept a document as a

parameter. To make searchable encryption explicitly

available, we introduced two additional commands:

• The insertSecure() command can be used to insert documents into a collection using searchable encryption. Using this command, every array of strings in the document is removed and its content used as keywords. The contained strings are inserted into a Z-IDX filter for encrypted search. Every other datatype remains untouched.

• The findSecure() command triggers encrypted

search over all documents of a collection. As a

parameter, it takes a keyword embedded in a doc-

ument, e.g.: findSecure({keyword: ’foo’})
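The way insertSecure() treats string arrays can be illustrated with a short Python sketch. This is only a model of the described behaviour, not the modified MongoDB driver: the function name `split_for_secure_insert` is hypothetical, only top-level fields are inspected, and leaving empty arrays untouched is an assumption.

```python
def split_for_secure_insert(document):
    """Separate string arrays (keywords) from the remaining fields."""
    keywords, remainder = [], {}
    for field, value in document.items():
        is_string_array = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(item, str) for item in value)
        )
        if is_string_array:
            # String arrays are removed; their content becomes keywords.
            keywords.extend(value)
        else:
            # Every other datatype remains untouched.
            remainder[field] = value
    return keywords, remainder


doc = {"subject": "Q3 report", "size": 2048, "tags": ["finance", "enron"]}
keywords, remainder = split_for_secure_insert(doc)
print(keywords)   # ['finance', 'enron']
print(remainder)  # {'subject': 'Q3 report', 'size': 2048}
```

The extracted keywords would then be inserted into the document's Z-IDX filter, while the remaining fields are stored with the document.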

4.3 Architecture and Implementation

To integrate searchable encryption into MongoDB,

we chose to add the extended functionality to the

server and the command line client. An overview

of the architecture of MongoDB server and client is

given in ﬁgure 3. In theory, it is possible to add

searchable encryption to MongoDB by modifying only

the client but not the server. However, this leads to

a disproportionately high increase in communication

overhead as per-document operations would have to

be carried out on the client, each requiring the transmission of the document's Z-IDX data structures.

Figure 3: Architecture of MongoDB Server and Client. (Diagram labels: User, Secret Key File, MongoDB Shell Client with JavaScript Shell, C++ Driver and Z-IDX Module; MongoDB Server with Z-IDX Module.)

The MongoDB command line client consists

of a JavaScript shell that uses a core driver written

in C++. The client connects to the server, which

is also written in C++. To provide searchable en-

cryption functionality, we implemented the Z-IDX

scheme (see section 3) and additional helper functions

in a separate module that is compiled both into the

server and the C++ driver of the client (Z-IDX Mod-

ule, see ﬁgure 3). As suggested by Goh, we apply data

compression (zlib) to the index data structures before

transmission over the network. As these data struc-

tures are very sparse, the compression works very ef-

fectively and the additional compute overhead is eas-

ily outweighed by reduced transmission times in most

settings.

To integrate the functionality, we made the follow-

ing modiﬁcations: The JavaScript shell is modiﬁed

to read the secret key information from a ﬁle, which

has to be passed as a parameter at startup. If a se-

cure search or insert request is identiﬁed, the request

is modiﬁed to include the secret key information. This

information is stored in a dedicated _zidx ﬁeld in the

query. After this, the request is passed to the client's

C++ driver. The C++ driver is modiﬁed to recognize

queries that contain Z-IDX key information injected

by the JavaScript shell. For inserts, a Z-IDX ﬁlter is

built and populated with the contained strings of ev-

ery string array in the document. Subsequently, the

string arrays and the key information are removed and

the command is passed on to the server. For a search

query, the C++ driver uses the key to compute trap-

doors for every search keyword. The trapdoors are

inserted, the key is removed and the query is passed

SecureKeywordSearchoverDataArchivesintheCloud-PerformanceandSecurityAspectsofSearchableEncryption

431

on to the server. The MongoDB server is modiﬁed

to process the search queries. For the trapdoors of a

search query, the server generates the corresponding

codewords using the document id. These codewords

are then checked against the bloom ﬁlters of a doc-

ument to test for a match. This architecture and im-

plementation makes searchable encryption available

without affecting non-encrypted use of the database,

as regular MongoDB commands are processed as ex-

pected.

5 PERFORMANCE EVALUATION

The use of encrypted search functionality introduces

an overhead in computation, storage and data transmission. Since speed and throughput are critical factors for databases, we present performance measurements of our approach in this section. The figures allow us to evaluate the practicability of searchable encryption in databases for real-life scenarios.

To assess the performance impact of our approach,

we ran insert and search queries in encrypted and unencrypted settings under various parameter settings

(dictionary size, false positive rate) and analyzed the

memory footprint of the additional data structures of

Z-IDX. To avoid synthetic test data, we chose the pub-

licly available Enron corpus – a collection of emails

which we use as documents. All benchmarks were

run on an Intel Core i5-3470 machine with 8GB main

memory, running Ubuntu 12.04 LTS.

5.1 Memory Footprint of Z-IDX Filters

As the encrypted ﬁlters are added to every docu-

ment, they add overhead to communication and stor-

age footprint. They are therefore a crucial factor that

inﬂuences the performance of a database using this

scheme.

The size of these data structures is determined by the desired false positive rate $f_p$ and the number of unique keywords to be represented by the filters $n$. From the false positive rate $f_p$, the number of hash functions $r$ is determined by calculating $r = -\log_2(f_p)$. From $r$, the number of bits $m$ in the filter can be determined by calculating $m = nr/\ln 2$. In practice, these data structures can become quite

large. This is especially unfavourable in settings with

large numbers n of distinct keywords and small doc-

ument sizes, as the ﬁlter sizes can easily exceed the

size of the original documents.
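As a worked example of the two formulas above, the following Python sketch computes the number of hash functions and the filter size. Rounding r and m up to whole numbers and the function name `filter_parameters` are assumptions made for the example. With n = 100000 and a false positive rate of 0.0001 read as a fraction, the result is 252472 bytes, which agrees with the uncompressed filter size reported later in this section.

```python
import math


def filter_parameters(false_positive_rate, num_keywords):
    """Return (r, m_bits, size_bytes) for r = -log2(fp) and m = n*r/ln 2."""
    r = math.ceil(-math.log2(false_positive_rate))      # hash functions
    m_bits = math.ceil(num_keywords * r / math.log(2))  # filter bits
    return r, m_bits, (m_bits + 7) // 8                 # whole bytes


r, m_bits, size_bytes = filter_parameters(0.0001, 100000)
print(r, m_bits, size_bytes)  # 14 2019774 252472
```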

To improve the efﬁciency of the scheme, data

compression can be used on the ﬁlters (as suggested

by Goh). While ﬁlter compression decreases storage

and communication overhead, it also introduces ad-

ditional steps of computation on the client and server

side: Filters have to be compressed upon document insertion and decompressed for every search operation.

This represents a tradeoff between data size and com-

putational overhead.

To investigate this issue, we ﬁrst tested the ef-

fectiveness of compression on indexes. In practice,

these filters are bit arrays that contain mostly 0’s and

sparsely distributed 1’s (depending on the number of

contained keywords). To determine the achievable

compression ratio, we used a set of 1000 documents

from the Enron corpus containing 127.5 keywords on

average. Assuming a set of 100000 distinct keywords

and a false positive rate of 0.0001 (0.01%) leads to an uncompressed filter size of 252472 bytes. We implemented the compression of filters using the free zlib

compression library. Using the zlib standard compres-

sion strategy, the average compression ratio achieved

is 0.02 with the given parameters. Using a run-length

encoding strategy that exploits the sparse property of

the ﬁlters, compression becomes even more effective

with an average compression ratio of 0.0154. This

means that using compression, filters can be considerably reduced in size (here: to 1.54% of their

original size, average size of compressed ﬁlters 3889

bytes).
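The effect of compression on a sparse filter can be reproduced approximately with Python's `zlib` binding. This is a rough sketch under stated assumptions: the filter content is synthetic (randomly placed bits rather than real Z-IDX codewords), the number of set bits is an estimate of about 128 keywords with 14 positions each, and zlib's `Z_RLE` strategy is assumed to correspond to the run-length encoding strategy mentioned above. The exact ratios will therefore differ from the measured values.

```python
import random
import zlib

FILTER_BYTES = 252472   # uncompressed filter size from the text
BITS_SET = 128 * 14     # assumption: ~128 keywords, 14 positions each

# Build a synthetic sparse filter with randomly placed 1 bits.
random.seed(0)
bits = bytearray(FILTER_BYTES)
for pos in random.sample(range(FILTER_BYTES * 8), BITS_SET):
    bits[pos // 8] |= 1 << (pos % 8)
raw = bytes(bits)


def deflate(data, strategy):
    """Compress data with zlib using the given strategy."""
    compressor = zlib.compressobj(level=9, strategy=strategy)
    return compressor.compress(data) + compressor.flush()


# Z_RLE is 3 in zlib.h; fall back to that value if the constant is missing.
Z_RLE = getattr(zlib, "Z_RLE", 3)
default_size = len(deflate(raw, zlib.Z_DEFAULT_STRATEGY))
rle_size = len(deflate(raw, Z_RLE))

print(f"default: {default_size / len(raw):.4f}")
print(f"RLE:     {rle_size / len(raw):.4f}")
```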

Our benchmarking results show that using ﬁlter

compression dramatically speeds up database opera-

tions even over fast network connections (100 Mbit/s

speed). This means that the overhead for data com-

pression is by far outweighed by the advantage in net-

work transmission speed due to smaller ﬁlters. There-

fore, we use RLE-based ﬁlter compression as a de-

fault in all subsequent measurements.

Figure 4: Relationship between number of document keywords and compressed filter size. (Axes: Number of Keywords; Compressed filter size (bytes).)

It can be observed that the size of compressed ﬁl-

ters is closely correlated with the number of repre-

sented keywords (see ﬁgure 4): Documents with few

http://zlib.net/

CLOSER2015-5thInternationalConferenceonCloudComputingandServicesScience

432

keywords have small compressed ﬁlters while more

keywords produce larger sizes. This means that a

trade-off of the Z-IDX scheme is mitigated: To ac-

commodate large sets of distinct keywords without

false-positives, large ﬁlter sizes are required. These

large filters take up large amounts of memory

– even for small documents with few or no key-

words at all. However, using compression, ﬁlter

sizes can be generously chosen as compressed ﬁlters

remain compact, depending on the number of key-

words in the document. In fact, using the settings

above, compressed filter sizes are 3889 bytes on average. When increasing the number of unique keywords from 100000 to a million (tenfold), the average size is only 6648 bytes (less than a twofold increase).

5.2 Query Performance

To assess the performance of the scheme, we evalu-

ated insert and search performance of our Z-IDX im-

plementation embedded in MongoDB. To obtain re-

alistic results, we tested our setup under two differ-

ent network proﬁles: The LAN proﬁle corresponds to

the typical properties of a wired local network (2ms

ping, 100 Mbit/s), the WAN proﬁle corresponds to the

properties of a domestic internet connection in Ger-

many (20ms ping, 10 Mbit/s). For reference, the same

benchmarks were also conducted with a Localhost

proﬁle, where the network delays are essentially non-

existent. The LAN and WAN proﬁles were generated

by using network link conditioning on the machine's loopback network device, using the Linux tc command.

All benchmarks were conducted using a false posi-

tive rate of 0.001 and a maximum dictionary size of

10000.

Insert Query Performance. To assess the perfor-

mance of insert queries, we inserted a collection of

10000 documents from the Enron corpus in batches of

100. We ran every insert query 100 times and took the

mean as our measurement value. The results for these

queries in the Localhost, LAN and WAN proﬁles for

encrypted and unencrypted operation are shown in

ﬁgure 5. The longer duration of encrypted operation

is explained by the additional steps required on the

client: Before submission of a document, a Z-IDX ﬁl-

ter has to be created using the document’s keywords

and the document content has to be encrypted. The Z-

IDX ﬁlter introduces data which slightly increases the

time of data transmission. On the server, no additional

steps have to be executed on insert. Our experiments

show that the performance penalty for encryption in

insert queries is indeed moderate: In the Localhost-

and LAN-settings, the insert time is about doubled

Figure 5: Benchmark: Insert of 10000 documents. (Bar chart: Unencrypted vs. Encrypted for the Localhost, LAN and WAN profiles.)

compared to the unencrypted setting. In the WAN

setting, where network performance has a larger effect, the durations of encrypted and unencrypted insert queries are nearly the same.

Search Query Performance. To determine the performance of search queries, we issued a search query

with a randomly chosen keyword on the same doc-

ument collection as used in the insert queries. We

ran every search query 100 times and took the mean

as our measurement value. The results for these

queries in the Localhost, LAN and WAN proﬁles for

encrypted and unencrypted operation are shown in

figure 6.

Figure 6: Benchmark: Query over 10000 documents (query time in milliseconds for unencrypted and encrypted operation under the Localhost, LAN and WAN profiles).

The duration of encrypted search queries is increased significantly compared to unencrypted

operation, due to a fundamental difference between the search implementations: Searching in an unencrypted database is usually carried out using an inverted index, where the matching documents for a given keyword can be looked up with constant complexity (O(1)). In encrypted operation using the Z-IDX scheme, search complexity is linear in the number of documents in the collection (O(n), n = number of documents). As a result, the unencrypted search time is very small (0.13 ms in the Localhost setting, 2.32 ms LAN, 20.37 ms WAN) when compared to encrypted operation and mainly determined by the network latency. In contrast, encrypted searches took around

SecureKeywordSearchoverDataArchivesintheCloud-PerformanceandSecurityAspectsofSearchableEncryption

433

half a second (≈ 530 ms), with little variation depend-

ing on network performance, as only little data had to

be transmitted.
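The difference in search complexity can be made concrete with a small sketch. It uses plain Python sets as stand-ins for the encrypted per-document filters, so it only illustrates the access pattern (one lookup versus one test per document), not the cryptography; the sample documents and function names are hypothetical.

```python
from collections import defaultdict

documents = {
    1: {"cloud", "search"},
    2: {"encryption", "search"},
    3: {"cloud", "encryption"},
}

# Unencrypted case: one inverted index for the whole collection.
inverted_index = defaultdict(set)
for doc_id, keywords in documents.items():
    for keyword in keywords:
        inverted_index[keyword].add(doc_id)

def search_inverted(keyword):
    """Single hash lookup, independent of the number of documents."""
    return set(inverted_index.get(keyword, set()))

def search_per_document(keyword):
    """Z-IDX-style access pattern: every document's own structure is tested,
    so the effort grows linearly with the collection size."""
    return {doc_id for doc_id, keywords in documents.items() if keyword in keywords}

print(search_inverted("cloud"))
print(search_per_document("cloud"))
```

Both functions return the same result set; only the amount of work per query differs.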

5.3 Implications for Practical Use

Our measurements have shown that the performance

penalty for using the Z-IDX searchable encryption

scheme in a database is very unevenly distributed:

While the performance penalty for insert queries is almost negligible under realistic conditions (WAN profile), the penalty for search queries is tremendous by comparison. At the same time, the query performance varies greatly depending on collection size (linear effort) and filter parameters: A search query on a collection of 10000 documents in our experiments took between 219 ms ($f = 0.01$, $n = 1000$) and 4612 ms ($f = 0.0001$, $n = 100000$).
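To give a feeling for why the filter parameters matter, the following sketch applies the textbook Bloom filter sizing formulas (Bloom, 1970). It assumes that $f$ denotes the false positive rate and $n$ the maximum dictionary size, as in the benchmark settings, and that Z-IDX filters are dimensioned like standard Bloom filters; the prototype may size its filters differently.

```python
import math

def bloom_parameters(max_keywords: int, false_positive_rate: float) -> tuple[int, int]:
    """Return (filter bits, number of hash functions) for a standard Bloom filter."""
    bits = math.ceil(-max_keywords * math.log(false_positive_rate) / math.log(2) ** 2)
    hashes = max(1, round(bits / max_keywords * math.log(2)))
    return bits, hashes

# (false positive rate, maximum dictionary size) pairs mentioned in the text.
for rate, size in [(0.01, 1000), (0.001, 10000), (0.0001, 100000)]:
    bits, hashes = bloom_parameters(size, rate)
    print(f"f={rate}, n={size}: {bits} bits (~{bits / 8 / 1024:.1f} KiB), "
          f"{hashes} hash functions")
```

Under these assumptions, a lower false positive rate increases the number of hash evaluations per document and query, and a larger dictionary increases the filter size, which is consistent with the spread of the measured query times.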

6 SECURITY

The motivation for using searchable encryption

schemes such as Z-IDX is to protect the conﬁden-

tiality of information that is stored on untrusted in-

frastructures (e.g. cloud providers). In this section,

we give a qualitative evaluation of the security im-

plications when searchable encryption schemes are

used to search over encrypted data stored on a remote

server. This security evaluation is generally applica-

ble to searchable encryption schemes that correspond

to the abstract model given in section 6 and there-

fore not speciﬁc to Goh’s Z-IDX scheme (Goh et al.,

2003), unless explicitly noted otherwise.

The security of computer systems is constituted by

the attributes of conﬁdentiality, integrity and avail-

ability (as deﬁned in the ITSEC criteria (ITSEC,

1991), see also (Avizienis et al., 2004)). As the

purpose of searchable encryption is to protect the

searched keywords from being disclosed to unautho-

rized parties, we focus our evaluation on the property

of data conﬁdentiality of search keywords.

Abstract System Model. For the security evaluation, we assume a setup as shown in figure 7 (see also (Islam et al., 2012)). A server holds a set of $n$ documents $Doc_1, \ldots, Doc_n$. It also holds an encrypted data structure which contains a mapping for every keyword $w \in W$ to all documents containing $w$. To query the encrypted index, the client generates a trapdoor $T_w$ and sends it to the server over the network. Using this trapdoor, the server can determine all documents that contain keyword $w$ and sends them back to the client over the network. The mapping between keywords and trapdoors $w \mapsto T_w$ is deterministic, i.e. under the same encryption key there exists exactly one trapdoor for every keyword $w$. These properties apply to most symmetric searchable encryption schemes.
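The deterministic mapping from keywords to trapdoors can be illustrated with a keyed hash function. This is a minimal sketch and not the trapdoor construction of any particular scheme: the function name trapdoor, the demo keys and the sample keywords are assumptions made for illustration.

```python
import hashlib
import hmac

def trapdoor(key: bytes, keyword: str) -> bytes:
    """Derive a trapdoor as a keyed hash of the keyword (illustrative only)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha1).digest()

key = b"demo-key-not-for-production"
t1 = trapdoor(key, "diabetes")
t2 = trapdoor(key, "diabetes")
t3 = trapdoor(key, "asthma")
t4 = trapdoor(b"another-demo-key", "diabetes")

print(t1 == t2)  # same key, same keyword
print(t1 == t3)  # same key, different keyword
print(t1 == t4)  # different key, same keyword
```

Because repeated queries for the same keyword yield identical trapdoors, an observer can count and correlate them without knowing the keyword itself.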

Figure 7: Encrypted Search on a remote system: Abstract Model (the trusted client sends a trapdoor over the network link to the server holding the encrypted index and receives the matching documents; network link and server are honest-but-curious).

Attacker Model. Attacks to learn the plaintext of

keywords and their association with encrypted docu-

ments can generally be undertaken in any part of the

architecture. Attacks on the client are the most

dangerous, as clients hold the cryptographic key and

handle unencrypted information. We assume autho-

rized users on these clients to be trustworthy. For

the operator of the network link and the server we

assume an honest-but-curious attacker model (see e.g. (Lindell et al., 2008)): These operators will generally

execute programs and transmit information correctly

and faithfully, but can record arbitrary information

and perform additional calculations on it. Under this

adversarial model, data conﬁdentiality is challenged

while integrity and availability are not affected.

6.1 Threats to Keyword Conﬁdentiality

To illustrate the threats to the conﬁdentiality of key-

words in the system we use the ADTree model

(Attack-Defense-Trees, see (Kordy et al., 2012; Bag-

nato et al., 2012)), which builds upon the concept of

attack trees (Schneier, 1999). Attack trees are used

to model the threats to a speciﬁc security property

of a system and their logical interdependencies. In-

dividual threats are represented as leaves of the tree

and are connected by AND and OR operators to the

root of the tree, which represents a speciﬁc security

property. An attack on the system that corresponds to a specific threat is indicated in the model by assigning a boolean TRUE value to the corresponding node in the tree. If a combination of attacks results in a propagation of a TRUE value to the root node, the security property is considered to be breached. By evaluating the

attack tree, sets of possible attacks can be derived.

The ADTree model extends attack trees by introduc-

ing and explicitly modeling countermeasures, which

can be employed to mitigate or prevent attacks. In figure 8, an ADTree shows threats to keyword confidentiality in searchable encryption schemes and the corresponding countermeasures.

Figure 8: Attack-Defense-Tree: Threats for Confidentiality of Keywords. The root goal "Disclose Keywords" has three attack branches: Attacks on Client (Data Theft by Authorized Users; Attacks by Unauthorized Users, countered by Client Security: Physical Security, OS Security Mechanisms, Encryption of local Storage), Network Data Sniffing (countered by End-to-end Encryption) and Attacks on Server (Break Encryption Scheme; Statistical Inference, countered by Introduce Noise).

Attacks to learn keywords

can be undertaken on the client, on the network and

the server which holds the encrypted index. In the

following sections, we discuss the relevance and im-

plications of the shown threats and their countermea-

sures.
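The evaluation of such a tree can be sketched in a few lines. The following example is a simplification under stated assumptions: a countermeasure is modeled as a flag that fully blocks the node it is attached to (the ADTree formalism of (Kordy et al., 2012) is richer), only a part of the tree is modeled, and the class name Node and its fields are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    op: str = "LEAF"                 # "LEAF", "OR" or "AND"
    children: list = field(default_factory=list)
    attempted: bool = False          # only meaningful for leaves
    countered: bool = False          # True if a countermeasure is in place

    def succeeds(self) -> bool:
        """Propagate TRUE values from the leaves towards this node."""
        if self.countered:
            return False
        if self.op == "LEAF":
            return self.attempted
        results = [child.succeeds() for child in self.children]
        return all(results) if self.op == "AND" else any(results)

sniffing = Node("Network data sniffing", attempted=True)
inference = Node("Statistical inference", attempted=True)
break_scheme = Node("Break encryption scheme")
server = Node("Attacks on server", "OR", [break_scheme, inference])
root = Node("Disclose keywords", "OR", [sniffing, server])

print(root.succeeds())        # both attempted attacks are unmitigated
sniffing.countered = True     # e.g. end-to-end encryption
inference.countered = True    # e.g. noise insertion
print(root.succeeds())        # every attempted attack is now countered
```

The first evaluation reports a breach of the root property; once both attempted attacks are countered, no TRUE value reaches the root.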

6.2 Attacks on the Client

Attacks on the client are potentially severe as the

client handles plaintext data and holds the crypto-

graphic key for the searchable encryption scheme. By

obtaining the key, an attacker can uncover document-

keyword associations by generating valid queries and

launching a dictionary attack against either the server

or against intercepted trapdoors. Theft of data or keys by authorized users cannot be prevented.

However, in our attacker model, we assume the au-

thorized users to be trustworthy. To protect the assets

of the client systems against unauthorized users, dif-

ferent methods can be employed: Physical security

measures can prevent unauthorized users from get-

ting physical access to client machines. The security

mechanisms of the client's operating system can ensure that only authorized users can log onto the machines directly or via network. Finally, data on the client's mass storage can be protected by hard disk encryption.

6.3 Network Data Snifﬁng

Interception of data exchanged by searchable encryp-

tion protocols could threaten the conﬁdentiality of

keywords as statistical properties of the trapdoor-keyword associations can be exploited (for more detail, see section 6.4). If general security flaws of the

underlying scheme become known, these could also

be exploited. Data snifﬁng on the network can how-

ever easily be prevented by encryption of network

trafﬁc between client and server (e.g. by using Trans-

port Layer Security).

6.4 Attacks on the Server

In general, threats that originate from network data

snifﬁng also exist on the server, as the entire commu-

nication of the scheme is observable. However, as the

searchable encryption scheme has to be processed on

the server (i.e. matching of trapdoors to documents),

an additional layer of encryption is not an option. In

addition, the server also has direct access to the en-

crypted index, which could make attacks targeting

this data structure very efﬁcient. As the server can

also monitor the program execution, side-channel at-

tacks are theoretically possible (e.g. timing attacks).

In the following, we discuss the implications of these

threats.

Attacks on the Encryption Scheme. The confidentiality of the keywords depends on the trust in the cho-

sen underlying searchable encryption scheme. In the

ﬁrst place, it is desirable to use algorithms that are

openly published and examined by cryptographic ex-

perts. In general, searchable encryption schemes are

an active ﬁeld of research, with many constructions

from the recent past (see section 2) that need more

evaluation before they can be considered mature.

The Z-IDX scheme by Goh is among the oldest searchable encryption schemes, and no general attacks on the scheme have been published. The construction of the scheme is based on keyed hash functions, which are well-examined and proven cryptographic tools (HMAC SHA-1 (Krawczyk et al., 1997)). The scheme fulfills three

security properties suggested by (Song et al., 2000):

It supports hidden queries as the generated trapdoors

do not reveal the keyword. Valid trapdoors cannot be

generated without possession of the secret key (con-

trolled searching). Both properties are ensured by us-

ing a keyed hash function. Finally, the scheme ful-

ﬁlls the property of query isolation which means that

the server learns nothing more about a query than the set of matching documents. This security property is formalized as the IND-CKA (Semantic Security Against Adaptive Chosen Keyword Attack) property: An adversary is given two documents $D_0$ and $D_1$ and an index which encodes the keywords of one of these documents. If the adversary cannot determine which document's keywords are encoded in the index with a probability significantly better than $\frac{1}{2}$, the index is considered IND-CKA-secure. To the best of our

knowledge, no attacks that break IND-CKA-security

of the Z-IDX scheme have been published to date.
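The role of the keyed hash function can be illustrated with a toy index that is loosely modelled on the per-document Bloom filter construction of (Goh et al., 2003). It is a sketch under assumptions, not the prototype's code: the parameters R and BITS are toy values, the keys are fixed demo values, and the blinding step of the original scheme is omitted.

```python
import hashlib
import hmac

R = 3        # number of keyed hash functions (toy value)
BITS = 256   # filter length in bits (toy value)

def prf(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha1).digest()

def trapdoor(master_keys, keyword):
    """One keyed hash per subkey; cannot be computed without the keys."""
    return [prf(k, keyword.encode()) for k in master_keys]

def positions(trap, doc_id):
    """Document-specific codewords, mapped to bit positions of the filter."""
    return [int.from_bytes(prf(x, doc_id.encode()), "big") % BITS for x in trap]

def build_index(master_keys, doc_id, keywords):
    """Client side: set the filter bits for every keyword of one document."""
    bits = 0
    for keyword in keywords:
        for pos in positions(trapdoor(master_keys, keyword), doc_id):
            bits |= 1 << pos
    return bits

def search(indexes, trap):
    """Server side: test the trapdoor against every document's filter."""
    return [doc_id for doc_id, bits in indexes.items()
            if all(bits >> pos & 1 for pos in positions(trap, doc_id))]

keys = [bytes([i]) * 20 for i in range(R)]   # demo keys, not secret
indexes = {
    "doc1": build_index(keys, "doc1", ["cloud", "search"]),
    "doc2": build_index(keys, "doc2", ["privacy"]),
}
print(search(indexes, trapdoor(keys, "cloud")))
```

The server only ever sees filters and trapdoors, never keywords (hidden queries), and it cannot produce a valid trapdoor on its own (controlled searching). As with any Bloom filter, a search may return false positives but never misses a document that contains the keyword.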

Statistical Inference. Attacks using statistical inference are possible against all searchable encryption schemes that follow the basic model outlined in

SecureKeywordSearchoverDataArchivesintheCloud-PerformanceandSecurityAspectsofSearchableEncryption

435

section 6. The threat of these attacks is not based on weaknesses in the cryptographic constructions of searchable encryption schemes but is a direct consequence of the basic characteristics of such schemes. Under the same secret key $K_{priv}$, a keyword $w$ is always mapped to the same trapdoor $T_w$. This allows the server to observe tuples $(w, \{D_1, \ldots, D_n\})$, i.e. combinations of encrypted queries and the set of matching documents, which leak statistical information: The server can learn the frequency of certain queries as they occur over time and learn about the occurrence and frequency of distinct keywords in the document collection. While statistical information does not directly reveal keywords, it can be exploited to infer the semantics or plaintext of keywords using background knowledge about the data exchanged in the system. When handling medical data, for example, very accurate assumptions about the prevalence of a specific medical condition among a population can be made using public sources of information. If this condition is expressed using a keyword and no other keyword in the document set possesses the same frequency, it is easy to infer the meaning of this keyword. While the given example might be trivial, statistical attacks can pose a serious threat to the confidentiality of keywords. We review two practical attacks that have been published:

Search Pattern Leakage in Searchable Encryp-

tion: Attacks and New Constructions. (Liu et al.,

2013) propose an attack based on the frequency of

search patterns. The salient feature of the approach is that the frequency $f_q$ at which a keyword $q$ occurs is sampled over time, resulting in a frequency vector $V_q = \{V_q^1, \ldots, V_q^p\}$ for a specific keyword. Background knowledge for a dictionary of keywords $D = \{w_1, \ldots, w_m\}$ is drawn from external sources (the authors propose Google Trends) and represented as frequency vectors $\mathcal{V} = \{V_{w_1}, \ldots, V_{w_m}\}$. To infer the plaintext of a keyword, a distance measuring function $Dist(V, V_q)$ is used to determine the vector $V \in \mathcal{V}$ with the smallest distance to $V_q$; the corresponding keyword is then assumed to be $q$. The attack is amended

by an active approach, where the background knowl-

edge is adapted to a speciﬁc scenario (e.g. health-

care) to improve accuracy. To test the accuracy of

their attack, they use frequency vectors obtained from

Google Trends for the 52 weeks of the year 2011 and

add varying levels of Gaussian noise to simulate user

queries. They show that under certain circumstances

(e.g. keyword dictionary size of 1000, limited level

of noise) it is easy to guess the keyword with a very

high accuracy. They also present mitigation strate-

gies, which are based on inserting random keywords

along with every query, but do not consider the actual

document matching on the server.
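The matching step of this attack can be sketched as a nearest-neighbour search over frequency vectors. The example below assumes the Euclidean distance as distance function (the description above does not fix a particular one) and uses made-up frequency values; the function names are hypothetical.

```python
import math

def distance(u, v):
    """Euclidean distance between two equally long frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def guess_keyword(observed, background):
    """Return the dictionary keyword whose vector is closest to `observed`."""
    return min(background, key=lambda w: distance(background[w], observed))

# Hypothetical background knowledge: query frequency per time slot.
background = {
    "flu":      [9, 7, 3, 1],
    "hayfever": [1, 3, 8, 9],
    "diabetes": [5, 5, 5, 5],
}
observed = [8, 6, 4, 2]   # noisy frequency trace of one encrypted query

print(guess_keyword(observed, background))
```

The more distinctive the temporal profile of a keyword is, the more noise is needed before the nearest vector belongs to a different keyword.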

Access Pattern Disclosure on Searchable Encryp-

tion: Ramiﬁcation, Attack and Mitigation. (Is-

lam et al., 2012) propose a statistical attack which

is based on the frequency at which keywords appear

in the document set. As background knowledge, in-

formation about the probability of two keywords oc-

curring in the same document is assumed. This in-

formation can be obtained by scanning public doc-

ument sources for a dictionary of keywords $k_1, \ldots, k_m$. It is represented by an $m \times m$ matrix $M$, where $M_{i,j}$ contains the probability of keywords $k_i$ and $k_j$ occurring in the same document. The attacker then tries to find an order of encrypted queries $q_1, \ldots, q_m$ whose result sets produce another matrix which is similar to $M$. The sequence that produces the matrix most similar to $M$ is considered the result of the attack and reveals keywords by aligning the vectors of queries and keywords so that $q_i$ corresponds to $k_i$. The prob-

lem can be formalized by expressing the closeness

between matrices as an arithmetic distance. The au-

thors use simulated annealing to determine a keyword

sequence that minimizes this distance. The quality

of the attack is the percentage of keywords that are

guessed correctly. This percentage is improved if the

background knowledge also includes a set of known

query-trapdoor associations – this is however not re-

quired. With 15% known queries of 150 observed

queries, their attack was able to infer close to 100% of

a set of 500 keywords correctly. To counteract the pre-

sented attack, they also suggest the insertion of noise

to hide statistical properties of the query-document

associations. Encrypted index structures are consid-

ered $(\alpha, 0)$-secure if for every keyword there are $\alpha - 1$ other keywords that appear in the same set of documents, limiting an attacker's probability of correctly inferring a keyword to $\frac{1}{\alpha}$ at best.
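The idea of aligning observed result sets with a background co-occurrence matrix can be shown on a toy instance. Note that this sketch replaces the simulated annealing used by the authors with an exhaustive search over all permutations, which is only feasible for very small dictionaries; the matrix values, document identifiers and function names are made up for illustration.

```python
from itertools import permutations

def cooccurrence(result_sets, n_docs):
    """Fraction of all documents shared by each pair of result sets."""
    return [[len(a & b) / n_docs for b in result_sets] for a in result_sets]

def distance(observed, background, assignment):
    """Sum of squared differences if query i is mapped to keyword assignment[i]."""
    m = len(assignment)
    return sum((observed[i][j] - background[assignment[i]][assignment[j]]) ** 2
               for i in range(m) for j in range(m))

def best_assignment(observed, background):
    """Exhaustive search (stand-in for simulated annealing)."""
    return min(permutations(range(len(background)), len(observed)),
               key=lambda p: distance(observed, background, p))

keywords = ["asthma", "diabetes", "insulin"]
# Hypothetical background matrix M: M[i][j] = P(k_i and k_j in the same document).
M = [[0.5, 0.1, 0.0],
     [0.1, 0.4, 0.3],
     [0.0, 0.3, 0.3]]

# Result sets (document ids) observed by the server for three opaque queries.
results = [{1, 2, 3, 4}, {4, 5, 6, 7, 8}, {1, 2, 3}]
observed = cooccurrence(results, n_docs=10)

mapping = best_assignment(observed, M)
print([keywords[i] for i in mapping])
```

In this toy instance the observed matrix matches the background matrix exactly under one permutation, so the three queries are attributed to "diabetes", "asthma" and "insulin" in that order.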

6.5 Implications for Practical Use

The threat model in section 6.1 shows that attacks on

the conﬁdentiality are possible in every part of the

system. However, as shown in the previous sections,

attacks by unauthorized users on the client and the

network can be effectively mitigated by access control

and encryption. The most relevant threat is the pos-

sibility of inferring keywords by exploiting statistical

properties that can be observed by monitoring queries.

The threat posed by statistical inference attacks de-

pends strongly on the set of keywords and their dis-

tribution in the document set. Statistical inference at-

tacks are only a minor concern if the individual keywords exhibit very similar statistical properties, e.g.

serial numbers that are evenly distributed across doc-

uments. However, attributes with statistical properties that could be available to an attacker as background knowledge (e.g. medical diagnoses) need to be treated with great caution and might require noise insertion.

7 CONCLUSION

In this paper, we evaluated the practical usability of

searchable encryption for data archives in the cloud,

illustrated by embedding an implementation of Goh’s

searchable encryption scheme into MongoDB. We

found that the use of compression on the additional

data structures keeps the data size at tolerable lev-

els and relative to the number of embedded search

keywords. Performance benchmarks revealed that, under typical network parameters, the additional overhead for insert operations is negligible compared to unencrypted operation. Search queries, however, incur a considerable performance penalty in encrypted operation, as search effort is linear in the number of documents in Goh's scheme. However,

the measured durations of encrypted queries could

be acceptable for interactive use where the added se-

curity is required. To evaluate the security proper-

ties of searchable encryption, we presented threats

to keyword conﬁdentiality as an attack-defense-tree

model, which applies to most searchable encryption

schemes. The most relevant threat comes from in-

ference attacks, which are possible if the keywords

exhibit strong statistical properties which can be ex-

tracted using background knowledge. In such cases,

noise insertion techniques can be used to mitigate

such attacks.

Further research could investigate the performance of

more recent constructions of searchable encryption

schemes with constant search complexity (e.g. (Ka-

mara et al., 2012)) and schemes that provide extended

search capabilities, such as range queries (see e.g.

(Boneh and Waters, 2007; Wang et al., 2011)).

ACKNOWLEDGEMENTS

The authors would like to thank Martin Kreichgauer

for providing the prototypical implementation of the

Z-IDX scheme and the MongoDB integration.

REFERENCES

Abdalla, M., Bellare, M., Catalano, D., Kiltz, E., Kohno,

T., Lange, T., Malone-Lee, J., Neven, G., Paillier, P.,

and Shi, H. (2005). Searchable encryption revisited:

Consistency properties, relation to anonymous IBE,

and extensions. In Advances in Cryptology–CRYPTO

2005, pages 205–222. Springer.

Agrawal, R., Kiernan, J., Srikant, R., and Xu, Y. (2004).

Order preserving encryption for numeric data. In Pro-

ceedings of SIGMOD ’04 International Conference on

Management of Data, pages 563–574. ACM.

Arasu, A., Blanas, S., Eguro, K., Joglekar, M., Kaushik, R.,

Kossmann, D., Ramamurthy, R., Upadhyaya, P., and

Venkatesan, R. (2013). Secure database-as-a-service

with cipherbase. In Proceedings of SIGMOD ’13

International Conference on Management of Data,

pages 1033–1036. ACM.

Avizienis, A., Laprie, J.-C., Randell, B., and Landwehr, C.

(2004). Basic concepts and taxonomy of dependable

and secure computing. Dependable and Secure Com-

puting, IEEE Transactions on, 1(1):11–33.

Bagnato, A., Kordy, B., Meland, P. H., and Schweitzer, P.

(2012). Attribute decoration of attack–defense trees.

International Journal of Secure Software Engineering

(IJSSE), 3(2):1–35.

Bajaj, S. and Sion, R. (2011). Trusteddb: A trusted hard-

ware based database with privacy and data conﬁden-

tiality. In Proceedings of SIGMOD ’11 International

Conference on Management of Data, pages 205–216.

ACM.

Bloom, B. H. (1970). Space/time trade-offs in hash coding

with allowable errors. Communications of the ACM,

13(7):422–426.

Boneh, D., Di Crescenzo, G., Ostrovsky, R., and Per-

siano, G. (2004). Public key encryption with keyword

search. In Advances in Cryptology-Eurocrypt 2004,

pages 506–522. Springer.

Boneh, D. and Waters, B. (2007). Conjunctive, subset, and

range queries on encrypted data. In Theory of cryp-

tography, pages 535–554. Springer.

Byun, J. W., Rhee, H. S., Park, H.-A., and Lee, D. H.

(2006). Off-line keyword guessing attacks on recent

keyword search schemes over encrypted data. In Se-

cure Data Management, pages 75–83. Springer.

Curtmola, R., Garay, J., Kamara, S., and Ostrovsky, R.

(2006). Searchable symmetric encryption: improved

deﬁnitions and efﬁcient constructions. In Proceedings

of the 13th ACM conference on Computer and com-

munications security, pages 79–88. ACM.

Floratou, A., Teletia, N., DeWitt, D. J., Patel, J. M., and

Zhang, D. (2012). Can the elephants handle the nosql

onslaught? Proc. VLDB Endow., pages 1712–1723.

Gentry, C. (2009). Fully homomorphic encryption using

ideal lattices. In Proceedings of the 41st annual ACM

symposium on Theory of Computing, pages 169–178.

ACM.

Goh, E.-J. et al. (2003). Secure indexes. IACR Cryptology

ePrint Archive, 2003:216.

SecureKeywordSearchoverDataArchivesintheCloud-PerformanceandSecurityAspectsofSearchableEncryption

437

Hore, B., Mehrotra, S., and Tsudik, G. (2004). A privacy-

preserving index for range queries. In Proceedings of

the 13th International Conference on Very Large Data

Bases, VLDB ’04, pages 720–731.

Islam, M., Kuzu, M., and Kantarcioglu, M. (2012). Access

pattern disclosure on searchable encryption: Ramiﬁ-

cation, attack and mitigation. In Network and Dis-

tributed System Security Symposium (NDSS’12).

ITSEC (1991). Information technology security evaluation

criteria (itsec): Preliminary harmonised criteria. Tech-

nical report, Commission of the European Communi-

ties.

Kamara, S. and Lauter, K. (2010). Cryptographic cloud

storage. Financial Cryptography and Data Security,

pages 136–149.

Kamara, S., Papamanthou, C., and Roeder, T. (2012). Dy-

namic searchable symmetric encryption. In Proceed-

ings of the 2012 ACM conference on Computer and

communications security, pages 965–976. ACM.

Kordy, B., Mauw, S., Radomirović, S., and Schweitzer, P. (2012). Attack-defense trees. Journal of Logic and Computation.

Krawczyk, H., Bellare, M., and Canetti, R. (1997). HMAC:

Keyed-Hashing for Message Authentication. RFC

2104 (Informational). Updated by RFC 6151.

Lindell, Y., Pinkas, B., and Smart, N. P. (2008). Imple-

menting two-party computation efﬁciently with secu-

rity against malicious adversaries. In Security and

Cryptography for Networks, pages 2–20. Springer.

Liu, C., Zhu, L., Wang, M., and Tan, Y.-a. (2013). Search

pattern leakage in searchable encryption: Attacks and

new constructions. Cryptology ePrint Archive, Report

2013/163.

Popa, R. A., Redﬁeld, C. M. S., Zeldovich, N., and Balakr-

ishnan, H. (2011). Cryptdb: Protecting conﬁdentiality

with encrypted query processing. In Proceedings of

the Twenty-Third ACM Symposium on Operating Sys-

tems Principles, SOSP ’11, pages 85–100. ACM.

Rivest, R. L., Adleman, L., and Dertouzos, M. L. (1978).

On data banks and privacy homomorphisms. Founda-

tions of secure computation, 32(4):169–178.

Schneier, B. (1999). Attack trees. Dr. Dobb’s journal,

24(12):21–29.

Shmueli, E., Waisenberg, R., Elovici, Y., and Gudes,

E. (2005). Designing secure indexes for encrypted

databases. In Proceedings of the 19th Annual IFIP

WG 11.3 Working Conference on Data and Applica-

tions Security, DBSec’05, pages 54–68.

Song, D. X., Wagner, D., and Perrig, A. (2000). Practical

techniques for searches on encrypted data. In Secu-

rity and Privacy, 2000. S&P 2000. Proceedings. 2000

IEEE Symposium on, pages 44–55. IEEE.

Van Dijk, M., Gentry, C., Halevi, S., and Vaikuntanathan,

V. (2010). Fully homomorphic encryption over the in-

tegers. Advances in Cryptology–EUROCRYPT 2010,

pages 24–43.

Wang, S., Agrawal, D., and El Abbadi, A. (2011). A com-

prehensive framework for secure query processing on

relational data in the cloud. In Proceedings of the

8th VLDB Workshop on Secure Data Management,

SDM’11, pages 52–69, Berlin, Heidelberg. Springer-

Verlag.

Yang, Z., Zhong, S., and Wright, R. N. (2006). Privacy-

preserving queries on encrypted data. In Proceedings

of the 11th European Conference on Research in Com-

puter Security, ESORICS’06, pages 479–495.

CLOSER2015-5thInternationalConferenceonCloudComputingandServicesScience

438