individual genome sequencing.
As mentioned in Subsection 1, as the cost to sequence
an entire human genome continues to fall, the potential
exists for rapid advances in wellness and health care
resulting from this new technology. Essential to
achieving those advances is the ability to outsource,
compare, and aggregate genomic datasets. However, as
acquiring and outsourcing genome sequencing information
becomes easier, questions and concerns about privacy,
security, and efficiency grow.
1.1 Contributions of this Paper
In this paper, we propose an architecture and its
application in a personalized medicine case scenario.
We test homomorphic encryption techniques to help
strengthen the privacy of genomic datasets at a
non-prohibitive performance cost. We experimentally
analyse our personalized medicine case scenario
architecture using the HElib library, which achieves
near-practical computation cost. The proposed solution
uses real genomic rules generated by the geneticists
in our team (authors 2 and 3). Our main contributions
are (i) to keep genomic datasets secure while still
enabling the cloud-based analyses needed to make a
meaningful diagnosis, (ii) to provide an acceptable
level of privacy in each step of handling genomic
datasets (collecting, analyzing, storing, or sharing
the genetic information), and (iii) to provide a
characterization of the various threat models that are
addressed at each step.
The paper is organized as follows. In Section 2,
we review the current literature related to the
challenges mentioned in Section 1. Section 3 presents
our architecture, including descriptions of its main
components. Section 4 gives a detailed description of
the homomorphic encryption method used in our
methodology. Section 5 presents an insider threat model
for our architecture. Section 6 reports the
experimental results and indicates that the homomorphic
encryption overhead is not prohibitive for this
application. Section 7 summarizes and presents
conclusions.
2 THE CURRENT SOLUTIONS
Privacy issues caused by forensic, medical, and other
uses of genomic datasets have been studied in the past
few years (Jiang et al., 2014), (Naveed et al., 2015).
Homomorphic encryption is quickly becoming more
relevant due to its great potential for
privacy-preserving computation on encrypted genomic
datasets.
This technique has a number of other advantages, al-
lowing for more flexible case scenarios, and requiring
less interaction, thereby reducing the communication
complexity. The cryptographic overhead consists of
the time to perform operations for each gate of the
circuit as well as other maintenance operations.
Unfortunately, the cryptographic overhead of fully
homomorphic encryption (FHE) is hard to characterize
simply, because many parameters affect its performance,
such as the multiplicative depth, the security
parameter, the plaintext size, the exact FHE scheme
used, and the performance of various operations in the
finite fields used. Lepoint and Naehrig (Lepoint and
Naehrig, 2014) and Halevi (Halevi, 2013)
provide performance measurements for various set-
tings of these parameters. A number of key optimiza-
tions and batch techniques (Halevi and Shoup, 2014),
(Zhou and Wornell, 2014) have been introduced to reduce
the overall computational complexity and increase the
efficiency of these homomorphic-encryption-based
schemes. Recently, homomorphic encryption techniques
(Ayday et al., 2014), (Ayday et al., 2013b), (Lauter
et al., 2015) have been used to encrypt genomic
datasets in such a way that storage can be outsourced
to an untrusted cloud and the datasets can be computed
on meaningfully in encrypted form, without requiring
access to decryption keys. These protocols
have some drawbacks, such as being computationally
intensive, leaking more information than necessary, and
scaling poorly, mainly due to the very large size of
genomic datasets. However, a number of optimization
techniques (Halevi and Shoup, 2014), (Zhou and Wornell,
2014) have been presented to overcome the limitations
of homomorphic-encryption-based solutions. Building
practical systems that compute on encrypted genomic
datasets is a challenging task. One reason is that
homomorphic encryption remains too slow for running
arbitrary functions or for enabling the complex systems
we have today. Another reason is that many systems take
advantage of fast search data structures (such as
database indexes), and a practical system must preserve
this performance over encrypted datasets.
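To make the idea of computing on encrypted genomic data
without access to decryption keys more concrete, the
following is a minimal sketch and not the scheme used in
this paper. It assumes a toy Paillier cryptosystem, which
is only additively homomorphic, rather than the fully
homomorphic scheme implemented in HElib. The primes are
far too small to be secure, and all function names
(keygen, encrypt, decrypt, add_encrypted, scale_encrypted)
and the sample genotype values are hypothetical and chosen
for illustration only. The sketch assumes Python 3.8 or
later for the modular inverse via pow.

```python
import math
import secrets


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier keys from two known Mersenne primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under the public modulus n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals 1 + m*n.
    return ((1 + m * n) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext; only the key holder can run this."""
    lam, mu = private_key
    ell = (pow(c, lam, n * n) - 1) // n
    return (ell * mu) % n


def add_encrypted(n, ciphertexts):
    """Server side: multiplying ciphertexts adds the hidden plaintexts."""
    n2 = n * n
    total = 1  # 1 is a (non-randomized) encryption of 0
    for c in ciphertexts:
        total = (total * c) % n2
    return total


def scale_encrypted(n, c, k):
    """Server side: raising a ciphertext to k multiplies the plaintext by k."""
    return pow(c, k, n * n)


# Data owner: encrypt risk-allele counts (0, 1 or 2) before outsourcing.
n, private_key = keygen()
genotypes = [0, 1, 2, 1, 0, 2]          # illustrative values, not real data
ciphertexts = [encrypt(n, g) for g in genotypes]

# Untrusted server: aggregate without ever seeing a plaintext or a key.
encrypted_total = add_encrypted(n, ciphertexts)

# Untrusted server: weighted score over one patient's encrypted genotypes,
# using public (plaintext) per-variant weights.
weights = [3, 1, 2]
patient = [encrypt(n, g) for g in [2, 0, 1]]
encrypted_score = add_encrypted(
    n, [scale_encrypted(n, c, w) for c, w in zip(patient, weights)]
)

# Key holder: decrypt only the aggregated results.
print(decrypt(n, private_key, encrypted_total))  # 6
print(decrypt(n, private_key, encrypted_score))  # 8
```

An additive scheme of this kind supports sums and
plaintext-weighted sums only. Multiplying two encrypted
values, as needed for circuits with multiplicative depth
greater than zero, requires a somewhat or fully
homomorphic scheme, which is where the overhead
parameters discussed above come into play.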
We present a cryptographic solution for storing and
outsourcing genomic datasets while maintaining
patient privacy. All encrypted genomic datasets
are stored in an untrusted cloud server. To allow
meaningful computation on the encrypted genomic
datasets, we use HElib (Halevi, 2013), (Halevi and
Shoup, 2014). Specifically, we take basic genomic
algorithms that are commonly used in genetic
association studies and show how they can be made to
work on encrypted genotype and phenotype datasets. We
also tackle the insider attack situation in an untrusted
ICISSP 2017 - 3rd International Conference on Information Systems Security and Privacy