Security for Distributed Deep Neural Networks: Towards Data
Confidentiality & Intellectual Property Protection
Laurent Gomez¹, Marcus Wilhelm², José Márquez³ and Patrick Duverger⁴
¹SAP Security Research, SAP Global Security, France
²Hasso Plattner Institute, University of Potsdam, Germany
³SAP Portfolio Strategy & Technology Adoption, SAP SE, Germany
⁴Logistic & IT Services, City of Antibes Juan-les-Pins, France
Keywords: Distributed Systems, Neural Networks, Intellectual Property, Data Protection & Privacy, Fully Homomorphic Encryption.
Abstract: Current developments in Enterprise Systems reflect a paradigm shift from the backend towards the edge of the landscape: data is distributed, applications are decentralized, and novel components are integrated seamlessly with the central systems. Distributively deployed AI capabilities will accelerate this transition. Several non-functional requirements arise along with these developments, security being at the center of the discussions. Bearing those requirements in mind, we propose an approach to holistically protect distributed Deep Neural Network (DNN) based/enhanced software assets, i.e. the confidentiality of their input & output data streams as well as the safeguarding of their Intellectual Property. Making use of Fully Homomorphic Encryption (FHE), our approach enables the protection of Distributed Neural Networks while processing encrypted data. In that respect, we evaluate the feasibility of this solution on a Convolutional Neural Network (CNN) for image classification deployed on distributed infrastructures.
1 INTRODUCTION
1.1 Motivation
Until now, backend (on-premise & cloud) deployments were considered the single source of truth & unique point of access with regard to Enterprise Systems (ES). Nevertheless, a paradigm shift has recently been observed: ES assets are deployed towards the edge sectors of the landscapes, by distributing data, decentralizing applications, de-abstracting technology, and integrating edge components seamlessly with the central backend systems.
Capitalizing on recent advances in High Performance Computing along with the rising amounts of publicly available labeled data, Deep Neural Networks (DNN), as an implementation of AI, have revolutionized, and will continue to revolutionize, virtually every current application domain, as well as enable novel ones such as autonomous, predictive, resilient, self-managed, adaptive, and evolving applications.
1.2 Problem Statement
Independent Software Vendors (ISVs) aim to protect both the data and the Intellectual Property of their AI-based software assets, deployed on potentially unsecure edge hardware & platforms (Goodfellow, 2018).
The deployment of data processing capabilities throughout Distributed Enterprise Systems raises several security challenges related to the protection of input & output data (Parliament and Council, 2016) as well as of software assets.
In the specific context of distributed intelligence,
DNN-based/enhanced software will represent key investments in infrastructure, skills and governance, as well as in the acquisition of data and talent. The software industry is therefore in direct need to safeguard these strategic investments by enforcing the protection of this new form of Intellectual Property.
1.3 State-of-the-Art
Security of Deep Neural Networks is a current re-
search topic taking advantage of two major crypto-
graphic approaches: variants of Fully Homomorphic
Encryption (FHE) (Gentry, 2009) and Secure Multi-Party Computation (SMC) (Cramer et al., 2015). While FHE techniques allow addition and multiplication on encrypted data, SMC enables arithmetic operations on data shared across multiple parties.
Several approaches can be found in the literature,
at different phases of the development and deploy-
ment of DNNs.
Secure Training. Secure DNN training has been
addressed using FHE (Graepel et al., 2012) and SMC
(Shokri and Shmatikov, 2015), disregarding protec-
tion once the trained model is to be productively de-
ployed. Other Machine Learning models such as lin-
ear and logistic regressions have also been trained in a
secure way in (Mohassel and Zhang, 2017). In those
approaches, confidentiality of training data is guaran-
teed, while runtime protection (i.e. input, model, out-
put) is out of scope.
Processing on Encrypted Data. At the processing phase, SMC has led to cooperative solutions where several devices work together to obtain federated inferences (Liu et al., 2017), without supporting deployment of the trained DNN to trusted decentralized systems.
DNN processing on FHE-encrypted data is covered in CryptoNets (Gilad-Bachrach et al., 2016). More recently, in (Boemer et al., 2018), the authors proposed a privacy-preserving framework for deep learning, making use of the SEAL (SEAL, 2018) FHE library. While disclosure of data at runtime is prevented in these solutions, protection of the DNN models remains out of scope.
Intellectual Property Protection of DNN Model.
In (Uchida et al., 2017), the authors tackle IP protection of DNN models through model watermarking. While infringement can be detected with this method, it cannot be prevented. Furthermore, runtime protection of input, model and output is out of scope.
To the best of our knowledge, no other publication
has holistically tackled the protection of both trained
DNN models and data, targeting distributed untrusted
systems.
1.4 Data & Intellectual Property
Protection for Deep Neural
Networks
In this paper we propose a novel approach
for the Intellectual Property Protection of DNN-
based/enhanced software while enabling data protec-
tion at processing time, making use of concepts such
as Fully Homomorphic Encryption (FHE).
Once trained, DNN model parameters (i.e.
weights, biases) are encrypted homomorphically. The
resulting (encrypted) DNN can be distributed across
untrusted landscapes, preserving its IP while mitigat-
ing the risk of reverse engineering. At runtime, the homomorphically encrypted DNN produces FHE-encrypted insights from encrypted input data. Confidentiality of the trained DNN as well as of the input and output data is therefore guaranteed.
In this paper, we evaluate the overall performance
(e.g. CPU, memory, disk usage) along with the accu-
racy of encrypted DNNs.
This paper is organized as follows: Section 2 de-
tails the fundamentals of our approach. Section 3 pro-
vides an overview of our solution. In Sections 3.3
and 4, we present the architecture and evaluation, con-
cluding with an outlook in Section 5.
2 FUNDAMENTALS
2.1 Deep Neural Network
DNNs are composed of $L$ transformation layers:
1. An input layer, the tensor of input data $X$.
2. $L - 1$ hidden layers, mathematical computations transforming $X$ sequentially.
3. An output layer, the tensor of output data $Y$.
We denote the output of layer $i$ as a tensor $A^{[i]}$, with $A^{[0]} = X$ and $A^{[L]} = Y$. Tensors can have different sizes and numbers of dimensions.
Each layer $A^{[i]}$ depends on the mathematical computations performed at the previous layer $A^{[i-1]}$. At each layer $A^{[i]}$, two types of function can be computed:
- Linear: involving polynomial operations.
- Non-linear: involving non-linear operations, so-called activation functions, such as max, exp, division, ReLU, or Sigmoid.
2.1.1 Linear Computation Layer
For the sake of clarity, we exemplify the inner linear
computation with a Fully Connected (FC) layer, as
depicted in Figure 1.
A Fully Connected layer, noted $A^{[i]}$, is composed of $n$ parallel neurons, performing an $\mathbb{R}^n \rightarrow \mathbb{R}^n$ transformation (see Figure 1). We define:

$a^{[i]} = \left[ a^{[i]}_0 \dots a^{[i]}_k \dots a^{[i]}_N \right]^T$ as the output of layer $A^{[i]}$;
Figure 1: Fully Connected layer with Activation Function.
$z^{[i]} = \left[ z^{[i]}_0 \dots z^{[i]}_k \dots z^{[i]}_N \right]^T$ as the linear output of layer $A^{[i]}$ ($z^{[i]} = a^{[i]}$ if there is no activation function);

$b^{[i]} = \left[ b^{[i]}_0 \dots b^{[i]}_k \dots b^{[i]}_N \right]^T$ as the bias for layer $A^{[i]}$;

$W^{[i]} = \left[ w^{[i]}_0 \dots w^{[i]}_k \dots w^{[i]}_N \right]^T$ as the weights for layer $A^{[i]}$.

Neuron $k$ performs a linear combination of the output of the previous layer $a^{[i-1]}$, multiplied by the weight vector $w^{[i]}_k$ and shifted by a bias scalar $b^{[i]}_k$, obtaining the linear output $z^{[i]}_k$:

$$z^{[i]}_k = \left( \sum_{l=0}^{M} w^{[i]}_k[l] \, a^{[i-1]}_l \right) + b^{[i]}_k = w^{[i]}_k \cdot a^{[i-1]} + b^{[i]}_k \quad (1)$$

Vectorizing the operations for all the neurons in layer $A^{[i]}$, we obtain the dense layer transformation:

$$z^{[i]} = W^{[i]} a^{[i-1]} + b^{[i]} \quad (2)$$

where $W^{[i]}$ and $b^{[i]}$ are the parameters of layer $A^{[i]}$.
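As a worked example of Equations (1) and (2), the following is a minimal NumPy sketch; the layer sizes and values are illustrative, not taken from the evaluated CNN:

```python
import numpy as np

# Toy dense layer: 4 inputs -> 3 neurons (sizes are illustrative).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))      # weights W[i], one row per neuron
b = rng.normal(size=(3,))        # bias b[i]
a_prev = rng.normal(size=(4,))   # output a[i-1] of the previous layer

# Equation (2): z[i] = W[i] a[i-1] + b[i]
z = W @ a_prev + b

# Equation (1), neuron by neuron, agrees with the vectorized form.
z_check = np.array([W[k] @ a_prev + b[k] for k in range(3)])
assert np.allclose(z, z_check)
```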
2.1.2 Activation Functions
Activation functions are the major source of non-linearity in DNNs. They are applied element-wise ($\mathbb{R} \rightarrow \mathbb{R}$, thus easily vectorized), and are generally located after linear transformations such as Fully Connected layers:

$$a^{[i]}_k = f_{act}\left( z^{[i]}_k \right) \quad (3)$$

Several activation functions have been proposed in the literature, but the Rectified Linear Unit (ReLU) is currently considered the most efficient activation function for Deep Learning. Several variants of ReLU exist, such as Leaky ReLU (Maas et al., 2013), ELU (Clevert et al., 2015) or its differentiable version Softplus.

$$\mathrm{ReLU}(z) = z^{+} = \max(0, z), \qquad \mathrm{Softplus}(z) = \log(e^{z} + 1) \quad (4)$$
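For illustration, the functions of Equation (4) applied element-wise, assuming NumPy:

```python
import numpy as np

def relu(z):
    # ReLU(z) = max(0, z), applied element-wise
    return np.maximum(0.0, z)

def softplus(z):
    # Softplus(z) = log(e^z + 1); log1p improves accuracy for small e^z
    return np.log1p(np.exp(z))

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))       # [0.  0.  0.  0.5 2. ]
print(softplus(z))   # smooth everywhere, close to ReLU for large |z|
```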
2.2 Homomorphic Encryption
While preserving data privacy, Homomorphic Encryption (HE) schemes allow certain computations on ciphertexts without revealing either the inputs or the internal states. Gentry (Gentry, 2009) first proposed
a Fully Homomorphic Encryption (FHE) scheme,
which theoretically could compute any kind of arith-
metic circuit, but is computationally intractable in
practice. FHE evolved into more efficient schemes
preserving addition and multiplication over encrypted
data, such as BGV (Brakerski et al., 2011), FV (Fan
and Vercauteren, 2012) or CKKS (Cheon et al., 2018),
allowing approximations of multiplicative inverse,
exponential and logistic function, or discrete Fourier
transformation. Similar to asymmetric encryption, a
public-private key pair (pub, priv) is generated.
Definition 1. An encryption scheme is called homomorphic over an operation $\diamond$ if it supports the following:

$$Enc_{pub}(m) = \langle m \rangle_{pub}, \quad \forall m \in M$$
$$\langle m_1 \diamond m_2 \rangle_{pub} = \langle m_1 \rangle_{pub} \diamond \langle m_2 \rangle_{pub}, \quad \forall m_1, m_2 \in M$$

where $Enc_{pub}$ is the encryption algorithm and $M$ is the set of all possible messages.

Definition 2. Decryption is performed as follows:

$$Enc_{pub}(m) = \langle m \rangle_{pub}, \quad \forall m \in M$$
$$Dec_{priv}(\langle m \rangle_{pub}) = m$$

where $Dec_{priv}$ is the decryption algorithm and $M$ is the set of all possible messages.
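These two definitions can be made concrete with the Paillier cryptosystem, which is additively (though not fully) homomorphic: multiplying ciphertexts corresponds to adding plaintexts. The toy Python sketch below, with insecurely small parameters chosen for readability, illustrates the homomorphic property only; it is not one of the lattice-based FHE schemes used in this paper.

```python
import math
import random

# Toy Paillier cryptosystem (Python 3.9+): additively homomorphic, NOT fully
# homomorphic, and NOT secure with such small parameters.
p, q = 293, 433                    # toy primes; real keys use ~1024-bit primes
n, n2, g = p * q, (p * q) ** 2, p * q + 1
lam = math.lcm(p - 1, q - 1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)

def enc(m):
    r = random.randrange(1, n)     # should be coprime with n; fine for a toy
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return ((pow(c, lam, n2) - 1) // n) * mu % n

m1, m2 = 17, 25
c1, c2 = enc(m1), enc(m2)
assert dec((c1 * c2) % n2) == m1 + m2   # <m1 + m2> = <m1> * <m2> (Definition 1)
assert dec(pow(c1, 5, n2)) == 5 * m1    # ciphertext ^ plaintext = scalar product
assert dec(enc(42)) == 42               # Dec_priv(<m>) = m (Definition 2)
```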
2.3 Challenges
Even though HE schemes seem theoretically promis-
ing, their usage comes with several drawbacks, par-
ticularly when applied to Deep Learning.
2.3.1 Noise Budget
In Gentry's lattice-based HE scheme (Gentry, 2009) and subsequent variants of it, ciphertexts contain a small term of random noise drawn from some probability distribution. To estimate the current magnitude of the noise, a noise budget can be calculated: it starts as a positive integer, decreases with each subsequent operation, and reaches 0 exactly when the ciphertext becomes indecipherable. The noise budget is more strongly affected by multiplications than by additions.
In order to cope with that challenge, the encryption parameters can be adjusted according to the required computation depth of the arithmetic circuit.
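The following deliberately simplified accounting model illustrates this behaviour; the numeric costs are illustrative placeholders, since real budget consumption depends on the scheme and its parameters:

```python
# Toy noise-budget accounting: additions are cheap, multiplications expensive.
budget = 120                       # initial budget, set by encryption parameters

def after_add(budget, cost=1):     # illustrative cost of a homomorphic addition
    return budget - cost

def after_mul(budget, cost=25):    # illustrative cost of a multiplication
    return budget - cost

# A circuit with multiplicative depth 3 and a few additions:
for _ in range(3):
    budget = after_mul(budget)
for _ in range(5):
    budget = after_add(budget)
print(budget)  # must stay > 0, otherwise the ciphertext is indecipherable
```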
2.3.2 FHE Libraries and APIs
Multiple FHE libraries are available (Halevi and Shoup, 2014), (PALISADE, 2018), (SEAL, 2018), (Ducas and Micciancio, 2015). Depending on the supported HE schemes, those libraries show noticeable differences in performance (e.g. computation time, memory consumption), supported operation types (e.g. addition, multiplication, negation, square, division), supported data types (e.g. floating point, integer), and chipset infrastructure (e.g. CPU, GPU).
In addition, and regardless of their level of maturity and performance, HE libraries can be configured through several encryption parameters, such as:
- Polynomial modulus degree: determines the available noise budget and strongly affects performance.
- Plaintext modulus: mostly associated with the size of the input data.
- Security parameter: sets the security level of the cryptosystem in bits (e.g. 128, 192, 256-bit security).
Fine-tuning these encryption parameters enables developers to optimize the performance of encryption and of encrypted operations. The selection of the right encryption parameters depends on the size of the plaintext data, the targeted accuracy loss, and the required level of security.
2.3.3 Linear Function Support Only
By construction, linear functions, composed of addition and multiplication operations, are seamlessly supported by FHE. Non-linear activation functions such as ReLU or Sigmoid, however, require approximation to be computed with FHE schemes.
The challenge lies in the transformation of activation functions into polynomial approximations supported by HE schemes. We elaborate on the approximation of activation functions in Section 3.2.
2.3.4 Supported Plaintext Type
The vast majority of HE schemes allow operations on integers (Halevi and Shoup, 2014; SEAL, 2018), while others use booleans (Chillotti et al., 2018) or floating-point numbers (Cheon et al., 2018; SEAL, 2018). In the case of integer-supporting HE schemes, rational numbers can be approximated using fixed-point arithmetic, by scaling with a scaling factor and rounding.
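A minimal sketch of such a fixed-point encoding (the scaling factor is illustrative):

```python
SCALE = 1000  # illustrative scaling factor

def encode(x: float) -> int:
    # Fixed-point encoding: scale and round to the nearest integer.
    return round(x * SCALE)

def decode(n: int, depth: int = 1) -> float:
    # After `depth` multiplications the scale compounds to SCALE**depth.
    return n / (SCALE ** depth)

w, a = 0.123, 4.567
prod = encode(w) * encode(a)   # integer arithmetic only, as in the HE scheme
print(decode(prod, depth=2))   # ~0.561741, vs. exact 0.123 * 4.567
```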
2.3.5 Performance
FHE schemes are computationally expensive and
memory consuming. In addition, ciphertexts are of-
ten significantly bigger than plaintexts and thus use
more memory and disk space.
Even though the performance of FHE made it impractical in past years, recent FHE schemes show promising throughput. New FHE libraries also take advantage of GPU acceleration.
In addition, modern implementations of HE
schemes such as HELib (Halevi and Shoup, 2014),
SEAL (SEAL, 2018), or PALISADE (PALISADE,
2018) benefit from Single Instruction Multiple Data
(SIMD), allowing multiple integers to be stored in
a single ciphertext and vectorizing operations, which
can accelerate certain applications significantly.
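Conceptually, SIMD batching behaves like element-wise vector arithmetic, modeled here in plaintext with NumPy; real HE batching packs the values into the slots of a single ciphertext:

```python
import numpy as np

# Several integers share one "ciphertext" (here simply a vector), and a
# single homomorphic operation acts on all slots at once.
slots_a = np.array([1, 2, 3, 4])   # four plaintexts packed into one ciphertext
slots_b = np.array([10, 20, 30, 40])
print(slots_a + slots_b)           # one op => four element-wise additions
print(slots_a * slots_b)           # likewise for multiplication
```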
3 APPROACH
As introduced in Section 1.2, the delivery of DNN-enriched insights comes at a cost. ISVs aim to guarantee data security, together with the IP protection of their DNN-based software assets, deployed on potentially unsecure edge hardware & platforms. In order to achieve those security objectives on DNNs, we utilize FHE schemes to operate on ciphertexts at runtime.
Consequently, secure training of DNNs is out of the scope of our approach, as we focus on runtime execution. We assume that DNN training already preserves data privacy and confidentiality, as well as the confidentiality of the resulting trained model. Once a model is trained, as discussed in Section 2.1.1, we obtain a set of parameters for each DNN layer, i.e. weights $W^{[i]}$ and biases $b^{[i]}$. Those parameters constitute the IP to be protected when deploying a DNN to distributed systems.
3.1 Linear Computation Layer
Protection
Our approach is agnostic to the type of layer. In (Gomez et al., 2018), we detail the encryption of layers such as the Convolutional Layer or Batch Normalization. For the sake of simplicity, we exemplify the encryption of DNN layer parameters on FC layers. Since FC layers simply apply a linear transformation to the previous layer's outputs, encryption is achieved straightforwardly as follows:

$$\left\langle z^{[i]} \right\rangle_{pub} = \left\langle W^{[i]} a^{[i-1]} + b^{[i]} \right\rangle_{pub} = \left\langle W^{[i]} \right\rangle_{pub} \left\langle a^{[i-1]} \right\rangle_{pub} + \left\langle b^{[i]} \right\rangle_{pub} \quad (5)$$
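Equation (5) requires ciphertext-ciphertext multiplication, which only fully homomorphic schemes provide. As an executable illustration, we sketch the weaker, plaintext-input variant evaluated in Section 4.4.3 (Table 2), reusing the toy additively homomorphic Paillier scheme from Section 2.2: weights and bias are encrypted while the previous layer's output remains in plaintext. Integer weights stand in for the fixed-point encoding of Section 2.3.4, and helper names such as enc_dot are ours, not SEAL's.

```python
# Encrypted dense layer with encrypted parameters and plaintext activations.
# Reuses enc, dec and n2 from the toy Paillier sketch; weights are small
# non-negative integers to keep the toy scheme's arithmetic simple.
def enc_dot(enc_w_row, a_plain, enc_bias):
    # <z> = <b> * prod_l <w_l>^(a_l) mod n^2  decrypts to  sum_l w_l*a_l + b
    c = enc_bias
    for cw, a in zip(enc_w_row, a_plain):
        c = (c * pow(cw, a, n2)) % n2
    return c

W = [[2, 3], [1, 4]]                      # toy weights: 2 neurons, 2 inputs
b = [5, 7]                                # toy biases
enc_W = [[enc(w) for w in row] for row in W]
enc_b = [enc(v) for v in b]

a_prev = [6, 9]                           # plaintext previous-layer output
enc_z = [enc_dot(row, a_prev, cb) for row, cb in zip(enc_W, enc_b)]
print([dec(c) for c in enc_z])            # [44, 49] == W @ a_prev + b
```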
3.2 Activation Function Protection
Due to their innate non-linearity, activation func-
tions need to be approximated with polynomials to be
encrypted with FHE. Several approaches have been
elaborated in the literature. In (Livni et al., 2014) and
(Gilad-Bachrach et al., 2016), the authors proposed to
use a square function as activation function. The last
layer, a sigmoid activation function, is only applied
during training. Chabanne et al. used Taylor polyno-
mials around x = 0, studying performance based on
the polynomial degree (Chabanne et al., 2017). In
(Hesamifard et al., 2017), Hesamifard et al. approx-
imate instead the derivative of the function and then
integrate to obtain their approximation.
Regardless of the approximation technique, we denote the approximation of an activation function $f_{act}()$ as

$$f_{act}() \approx f_{approx\_act}() \quad (6)$$

By construction, we have

$$\left\langle a^{[i]}_k \right\rangle_{pub} = \left\langle f_{act}\left( z^{[i]}_k \right) \right\rangle_{pub} \approx \left\langle f_{approx\_act}\left( z^{[i]}_k \right) \right\rangle_{pub} \quad (7)$$
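A common route to such approximations is a least-squares polynomial fit over the expected input range. A minimal sketch, assuming NumPy; the degree and interval are illustrative, and Section 4.3 describes the function actually used in our evaluation:

```python
import numpy as np

# Least-squares degree-2 polynomial approximation of ReLU on [-3, 3].
x = np.linspace(-3.0, 3.0, 1001)
relu = np.maximum(0.0, x)
coeffs = np.polyfit(x, relu, deg=2)   # highest-degree coefficient first
approx = np.polyval(coeffs, x)

print(np.round(coeffs, 3))            # quadratic, linear, constant terms
print(f"max |ReLU - approx| on [-3, 3]: {np.abs(relu - approx).max():.3f}")
```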
3.3 Architecture
In this section we outline the architecture of our IP
protection system, as depicted in Figure 2.
Figure 2: Overall Architecture.
3.3.1 Encryption of Trained DNN
In the backend, a DNN is trained within a DNN Training Agent (1). The outcome of the training (NN architecture and parameters) is pushed to the Trained DNN Protection Agent (2). Alternatively, an already trained DNN can be imported directly into the Protection Agent. The DNN Protection Agent obtains a Fully Homomorphic key pair from the Key Generator component (3). The DNN is then encrypted and stored together with its homomorphic key pair in the Trained and Protected DNN Database (4).
3.3.2 Deployment of Trained and Protected
DNN
At the deployment phase, the Trained DNN Deployment Agent deploys the DNN on distributed systems, together with its public key (5).
3.3.3 DNN Processing
On the distributed system, data is collected by a Data Stream Acquisition component (6) and forwarded to the DNN Processing Agent (7). The input layer does not involve any computation, and can therefore be seamlessly FHE-encrypted as follows:

$$X \xrightarrow{\;encryption\;} Enc_{pub}(X) = \langle X \rangle_{pub} \quad (8)$$
Encrypted inferences are sent to the Decryption Agent (8) for decryption using the private key associated with the DNN (9). FHE encryption propagates across the DNN layers, from the input to the output layer. By construction, the output layer is therefore encrypted homomorphically.
The decryption of the last layer's output $Y$ is done with the private key $priv$:

$$\left\langle A^{[L]} \right\rangle_{pub} \xrightarrow{\;decryption\;} Dec_{priv}\left( \left\langle A^{[L]} \right\rangle_{pub} \right) = Y \quad (9)$$
The Intellectual Property of the DNN, together
with the input & output results, is protected from any
disclosure on the distributed system throughout the
entire process.
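Putting the architecture steps together, a minimal end-to-end sketch under the same assumptions as before (toy Paillier scheme, plaintext activations, and the helper names enc, dec and enc_dot from the previous sketches):

```python
# End-to-end flow: key generation and model encryption (3)-(4) happen in the
# backend; encrypted processing (6)-(7) runs on the distributed system, which
# only ever sees ciphertexts; the backend decrypts the inference (8)-(9).

# Backend: encrypt the trained parameters (the IP to be protected).
W_trained, b_trained = [[3, 1], [2, 2]], [1, 0]
enc_W = [[enc(w) for w in row] for row in W_trained]
enc_b = [enc(v) for v in b_trained]

# Distributed system: produce an encrypted inference from the input data.
x = [4, 5]                                   # acquired input data
enc_out = [enc_dot(row, x, cb) for row, cb in zip(enc_W, enc_b)]

# Backend: decrypt the inference with the private key.
print([dec(c) for c in enc_out])             # [3*4+1*5+1, 2*4+2*5+0] == [18, 18]
```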
4 EVALUATION
As detailed in Section 2.3, FHE introduces additional computational costs at each step of the DNN life-cycle. In this section, we evaluate the performance overhead, from the computation time, memory load and disk usage perspectives, for DNN model encryption, encrypted processing, and output decryption.
4.1 Hardware Setup
As backend, we use an NVIDIA DGX-1 server (https://www.nvidia.com/en-us/data-center/dgx-1/), equipped with 8 Tesla V100 GPUs. This machine is, for practical purposes, not resource-constrained (computation & memory). We therefore reasonably neglect the performance overhead introduced by FHE for trained DNN model encryption and output decryption.
We deploy and execute our encrypted DNN on an NVIDIA Jetson TX2 (https://www.nvidia.com/en-us/autonomous-machines/embedded-systems-dev-kits-modules/). Powered by the NVIDIA Pascal architecture, this platform embeds 256 CUDA cores, a dual-core Denver 2 CPU (2 MB L2 cache) plus a quad-core ARM Cortex-A57 (2 MB L2 cache), and 8 GB of memory. This platform is closer to the hardware configuration of a Distributed Enterprise System.
4.2 Software Setup
DNN Model. As demonstrated in Section 3, our approach is fully agnostic to the NN topology or implementation. For the sake of our evaluation, which involves several modifications to the NN model, we choose a simple CNN classifier (https://github.com/keras-team/keras/blob/master/examples/cifar10_cnn.py), implemented with the Keras library (https://keras.io). Two datasets have been used in our experiment: CIFAR10 (https://www.cs.toronto.edu/~kriz/cifar.html) for image classification, and MNIST (http://yann.lecun.com/exdb/mnist/) for handwritten digit classification.
As depicted in Figure 3, we distinguish two main parts in this CNN: a feature extractor and a classifier. The feature extractor reduces the amount of information from the input image into a set of high-level, more manageable features. This step facilitates the subsequent classification of the input data.
Composed of four layers, [FC - ReLU - FC - Softmax], the classifier categorizes the input data according to the extracted features, and outputs a discrete probability distribution over 10 classes of objects.
As a reference point, we evaluate key performance figures at model training and processing time without encryption. Once trained, the size of the CNN plaintext model is 9.6 MB. On the Jetson TX2, a single unencrypted image classification is computed on average in 89.1 ms.
Figure 3: Keras Convolutional Neural Network.
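For reference, a minimal Keras sketch in the spirit of that example; the layer sizes here are abbreviated and illustrative, the exact topology being the one in the Keras repository:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation

model = Sequential()
# Feature extractor: convolutions + pooling reduce the image to features.
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Classifier: [FC - ReLU - FC - Softmax], the part encrypted in Section 4.4.
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])
```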
FHE Library. As introduced in Section 2.3, several libraries are available for FHE. We use the SEAL (SEAL, 2018) C++ library from Microsoft Research, running on CPU. This choice is motivated by the library's performance, support of multiple schemes such as BGV (Brakerski et al., 2011), stability, and documentation.
The use of SEAL, implemented in C++, with the Keras Python library requires some engineering effort. To enable both the fast performance of the native C++ library and rapid prototyping in Python, we use Cython (https://cython.org/).
We conduct our evaluation with the BGV scheme (Brakerski et al., 2011), utilizing the integer encoding with SIMD support. To handle the floating-point DNN parameters, we use fixed-point arithmetic with a fixed scaling factor, similarly to CryptoNets (Gilad-Bachrach et al., 2016). This has no noticeable impact on the classification accuracy, provided a suitable scaling factor is applied. The SIMD operations allow for optimized performance through vectorization.
4.3 Linearization
We tackle the linearization of the ReLU function with two approaches: we approximate it with a modified square function, and we skip the activation function altogether. The modified square function $x^2 + 2x$ (see Figure 5) is derived from the ReLU approximation proposed in (Chabanne et al., 2017); in order to optimize the computation of that function on ciphertexts, we used simpler coefficients.
In order to evaluate the impact of these approaches, we trained the CNN on the CIFAR10 and MNIST datasets, replacing the last ReLU activation. The resulting accuracy loss is reported in Figure 4. Both approximations have merely a minor impact on the output classification accuracy.
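In Keras, such a replacement only requires passing a custom callable as the activation; a minimal sketch with an illustrative input size:

```python
from keras.models import Sequential
from keras.layers import Dense, Activation

def modified_square(x):
    # FHE-friendly replacement for ReLU: only additions and multiplications.
    return x * x + 2.0 * x

classifier = Sequential()
classifier.add(Dense(512, input_shape=(2304,)))   # illustrative feature size
classifier.add(Activation(modified_square))       # instead of Activation('relu')
classifier.add(Dense(10))
classifier.add(Activation('softmax'))             # needed for training only
```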
Figure 4: Classification Accuracy with ReLU Approximation. (a) MNIST dataset; (b) CIFAR10 dataset. Each plot shows validation accuracy over 200 training epochs for the original model, the model with no activation function, and the model with the $x^2 + 2x$ activation.
Figure 5: ReLU Approximation as Square Function ($x^2 + 2x$ vs. ReLU).
Skipping the last activation function shows good results on this simple CNN, but we do not claim that this finding generalizes to other DNNs or datasets.
4.4 Experimentation Results
4.4.1 Model & Data Protection
Intellectual Property-wise, we consider the feature extractor of minor importance, as CNNs generally use state-of-the-art feature extractors. The IP of the model rather lies in the parameters, weights and biases, of the trained classifier. For that reason, we encrypt the classifier only, as a first step towards full model encryption, as depicted in Figure 3. To better understand the impact of computation depth, we also complete our evaluation with the encryption of the last FC layer only.
Confidentiality-wise, we evaluate the impact of encrypting the extracted features by comparing the processing performance of an encrypted model with plaintext and with encrypted feature-extractor outputs.
As depicted in Figure 3, we evaluate our approach on three modified versions of the model:
- Last FC Layer Encrypted
- Full Classifier Encrypted with no Activation Function
- Full Classifier Encrypted with our Modified Square Activation Function
In order to optimize our approach, we omit the Softmax layer within the classifier. This layer does not have any influence on the classification results, as the Softmax layer is mostly required at training phase, to normalize the network's output probability distribution for more consistent loss calculations.
The overall experiment as described in Section 3.3 has been applied 5 times to each model. We report average evaluation metrics for each step: model encryption, encrypted processing, and decryption.
4.4.2 DNN Model Encryption
Each trained CNN model is encrypted on the DGX-1's CPU. In Table 1, we report the average resource consumption for the following metrics:
Time to Compute: Time to encrypt the model.
Model Size: Size of resulting encrypted model.
Memory Load: Overall memory usage for model
encryption.
We target three security levels: 128, 192, and 256 bits. For each of those, we optimize the SEAL parameters as introduced in Section 2.3, maximizing performance and minimizing the leftover noise budget.
Table 1: Model Encryption.

                         Full Classifier - x^2 + 2x | Full Classifier - No Act. |     Last Layer
Security Level (bits)      128      192      256    |   128     192     256     |  128    192    256
Time to Compute (s)       256.7    191.5    212.6   |   86.9    78.0    96.0    |  3.4    3.4    7.3
Model Size                11 GB   6.4 GB    11 GB   |  2.5 GB  4.4 GB  2.5 GB   | 79 MB  79 MB  158 MB
Memory Load (MB)         2257.9   1112.4   2389.5   |  557.9   459.2   566.1    | 300.6  301.4  324.2
Table 2: Runtime Encryption with Plaintext Input.

                         Full Classifier - x^2 + 2x | Full Classifier - No Act. |     Last Layer
Security Level (bits)      128      192      256    |   128     192     256     |  128    192    256
Time to Compute (s)       287.2    174.6   1221.3   |   43.7    32.9    90.2    |  2.1    2.1    4.5
Memory Load (MB)         4683.1   2924.3   4899.2   | 1162.5   869.2  2342.3    | 53.9   54.0   117.7
Remaining Noise Budget    221.6     80.2     88.8   |   91.4    23.8    16.2    | 58.8   20.2    60.2
Table 3: Runtime Encryption with Encrypted Input.

                         Full Classifier - x^2 + 2x | Full Classifier - No Act. |     Last Layer
Security Level (bits)      128      192      256    |   128     192     256     |  128    192    256
Time to Compute (s)      2902.6   1048.6   5627.0   |  367.1   272.2   835.9    | 19.2   19.2   40.2
Memory Load (MB)         5047.8   5809.0   4906.0   | 2314.8  1733.7  4644.5    | 119.5  118.5  253.0
Remaining Noise Budget    206.0     65.0     76.0   |   79.0    11.0     3.0    | 45.0   10.0   52.0
Compared to the plaintext model size (9.6 MB), the encrypted model size increases by a factor of 8.22 in the best case.
4.4.3 DNN Processing Encryption
The three encrypted CNN models are deployed on the Jetson TX2 for CPU-based encrypted processing. At this stage, we evaluate the following metrics:
- Time to Compute: processing time for an encrypted classification.
- Memory Load: memory usage for an encrypted classification.
- Remaining Noise Budget: at the end of encrypted processing, we evaluate the remaining noise budget, which determines whether additional homomorphic operations could be performed on the output vector.
In Table 2 and Table 3, we depict the performance of encrypted processing with plaintext and with encrypted previous-layer outputs, studying the impact of preserving the confidentiality of the preceding layer's outputs. The SEAL library supports secure computation over a plaintext and a ciphertext, producing a ciphertext. As a consequence, the output of the last MaxPooling2D layer can be processed by the FHE-encrypted Fully Connected layer. Secure computation between a plaintext and a ciphertext has a lower impact on performance.
We observe a slight performance improvement in time to compute and memory between the 128 and 192-bit security levels. This is due to the FHE parameter optimization described in Section 4.4.2: the initial noise budget is oversized for the 128-bit security level, which directly impacts performance.
The experimental results show that, depending on the achieved security level and the targeted scenario, we can achieve encrypted classification in as little as 2.1 s (for the 128-bit security level with only one layer encrypted).
4.4.4 Decryption
Following our approach, encrypted outputs are decrypted by the backend, on the DGX-1. We therefore consider decryption as not computationally expensive, compared to encryption. Results are available in Table 4.

Table 4: Decryption Performance.

                         Full Classifier - x^2 + 2x | Full Classifier - No Act. |     Last Layer
Security Level (bits)      128      192      256    |   128     192     256     |  128    192    256
Time to Compute (s)         2.9      1.7      3.2   |    0.6     0.6     1.0    |  0.2    0.1    0.2
Memory Load (MB)          963.8    397.4   2062.5   |  123.4    73.4   267.1    | 17.8   17.8   38.7
5 CONCLUSION
In this paper, we discuss and evaluate a holistic approach for the protection of distributed Deep Neural Network (DNN) based/enhanced software assets, i.e. the confidentiality of their input & output data streams as well as the safeguarding of their Intellectual Property. For that matter, we take advantage of Fully Homomorphic Encryption (FHE). We evaluate the feasibility of this solution on a Convolutional Neural Network (CNN) for image classification.
Our evaluation on the NVIDIA DGX-1 and Jetson TX2 shows promising results on the CNN image classifier. Performance varies from 2.1 s for an encrypted classification with only 53.9 MB of consumed memory, up to 1 h 33 min with almost 5 GB of consumed memory.
REFERENCES
Boemer, F., Ratner, E., and Lendasse, A. (2018). Parameter-
free image segmentation with SLIC. Neurocomputing,
277:228–236.
Brakerski, Z., Gentry, C., and Vaikuntanathan, V. (2011).
Fully homomorphic encryption without bootstrap-
ping. Cryptology ePrint Archive, Report 2011/277.
Chabanne, H., de Wargny, A., Milgram, J., Morel, C., and
Prouff, E. (2017). Privacy-preserving classification
on deep neural network. IACR Cryptology ePrint
Archive, 2017:35.
Cheon, J. H., Han, K., Kim, A., Kim, M., and Song, Y.
(2018). Bootstrapping for approximate homomor-
phic encryption. IACR Cryptology ePrint Archive,
2018:153.
Chillotti, I., Gama, N., Georgieva, M., and Izabachène, M. (2018). TFHE: Fast fully homomorphic encryption over the torus. Cryptology ePrint Archive, Report 2018/421. https://eprint.iacr.org/2018/421.
Clevert, D.-A., Unterthiner, T., and Hochreiter, S.
(2015). Fast and accurate deep network learning
by exponential linear units (elus). arXiv preprint
arXiv:1511.07289.
Cramer, R., Damgård, I. B., et al. (2015). Secure multiparty computation. Cambridge University Press.
Ducas, L. and Micciancio, D. (2015). Fhew: bootstrapping
homomorphic encryption in less than a second. In An-
nual International Conference on the Theory and Ap-
plications of Cryptographic Techniques, pages 617–
640. Springer.
Fan, J. and Vercauteren, F. (2012). Somewhat practical fully
homomorphic encryption. Cryptology ePrint Archive,
Report 2012/144.
Gentry, C. (2009). A Fully Homomorphic Encryption
Scheme. PhD thesis, Stanford University, Stanford,
CA, USA. AAI3382729.
Gilad-Bachrach, R., Dowlin, N., Laine, K., Lauter, K.,
Naehrig, M., and Wernsing, J. (2016). Cryptonets:
Applying neural networks to encrypted data with high
throughput and accuracy. In International Conference
on Machine Learning, pages 201–210.
Gomez, L., Ibarrondo, A., Márquez, J., and Duverger, P. (2018). Intellectual property protection for distributed neural networks - towards confidentiality of data, model, and inference. In Samarati, P. and Obaidat, M. S., editors, Proceedings of the 15th International Joint Conference on e-Business and Telecommunications, ICETE 2018 - Volume 2: SECRYPT, Porto, Portugal, July 26-28, 2018, pages 313-320. SciTePress.
Goodfellow, I. (2018). Security and privacy of machine
learning. RSA Conference.
Graepel, T., Lauter, K., and Naehrig, M. (2012). Ml con-
fidential: Machine learning on encrypted data. In In-
ternational Conference on Information Security and
Cryptology, pages 1–21. Springer.
Halevi, S. and Shoup, V. (2014). Algorithms in helib. In
International cryptology conference, pages 554–571.
Springer.
Hesamifard, E., Takabi, H., and Ghasemi, M. (2017).
Cryptodl: Deep neural networks over encrypted data.
CoRR, abs/1711.05189.
Liu, J., Juuti, M., Lu, Y., and Asokan, N. (2017). Oblivi-
ous neural network predictions via minionn transfor-
mations. In Proceedings of the 2017 ACM SIGSAC
Conference on Computer and Communications Secu-
rity, pages 619–631. ACM.
Livni, R., Shalev-Shwartz, S., and Shamir, O. (2014). On
the computational efficiency of training neural net-
works. In Ghahramani, Z., Welling, M., Cortes, C.,
Lawrence, N. D., and Weinberger, K. Q., editors, Ad-
vances in Neural Information Processing Systems 27,
pages 855–863. Curran Associates, Inc.
Maas, A. L., Hannun, A. Y., and Ng, A. Y. (2013). Rec-
tifier nonlinearities improve neural network acoustic
models. In Proc. icml, volume 30, page 3.
Mohassel, P. and Zhang, Y. (2017). Secureml: A system
for scalable privacy-preserving machine learning. In
Security and Privacy (SP), 2017 IEEE Symposium on,
pages 19–38. IEEE.
PALISADE (2018). The palisade lattice cryptography li-
brary.
Parliament, E. and Council (2016). General data protection
regulation.
SEAL (2018). Simple Encrypted Arithmetic Library (re-
lease 3.1.0). https://github.com/Microsoft/SEAL. Mi-
crosoft Research, Redmond, WA.
Shokri, R. and Shmatikov, V. (2015). Privacy-preserving
deep learning. In Proceedings of the 22nd ACM
SIGSAC conference on computer and communications
security, pages 1310–1321. ACM.
Uchida, Y., Nagai, Y., Sakazawa, S., and Satoh, S. (2017).
Embedding watermarks into deep neural networks.
In Proceedings of the 2017 ACM on International
Conference on Multimedia Retrieval, pages 269–277.
ACM.