Intellectual Property Protection for Distributed Neural Networks
Towards Confidentiality of Data, Model, and Inference

Laurent Gomez¹, Alberto Ibarrondo¹, José Márquez¹ and Patrick Duverger²
¹SAP Security Research, 805 Avenue Dr. Maurice Donat, 06250 Sophia-Antipolis, France
²City of Antibes Juan-les-Pins, France
Keywords: Intellectual Property Protection, Fully Homomorphic Encryption, Neural Networks, Distributed Landscapes, Smart Cities.
Abstract: Capitalizing on recent advances in HPC, GPUs and GPGPUs, along with the rising amounts of publicly available labeled data, (Deep) Neural Networks (NN) are revolutionizing virtually every current application domain and enabling novel ones, such as recognition, autonomous, predictive, resilient, self-managed, adaptive, and evolving applications. Nevertheless, NN training is resource-intensive in data, time and energy, which turns the resulting trained models into valuable assets, an Intellectual Property (IP) imperatively worth protecting. Furthermore, in the wake of Edge computing, NNs are being progressively deployed across decentralized landscapes; as a consequence, IP owners take the protection of their NN-based software products very seriously. In this paper we propose to leverage Fully Homomorphic Encryption (FHE) to simultaneously protect the IP of trained NN-based software as well as the input data and inferences. Within the context of a smart city scenario, we outline our NN model-agnostic approach, approximating and decomposing the NN operations into linearized transformations while employing Single Instruction Multiple Data (SIMD) to vectorize operations.
NOMENCLATURE
$v$, $\mathbf{v}$, $\mathbf{V}$: Scalar, Vector, Matrix/Tensor
$\langle t \rangle_{pub}$: Tensor $t$ encrypted with key $pub$
1 INTRODUCTION
1.1 Motivation
Mimicking the human cortex, Neural Networks (NN) enable computers to learn through training. With the recent progress in GPU-based computing capabilities, NNs have received major improvements such as Convolutional Layers (Krizhevsky et al., 2012), Batch Normalization (Ioffe and Szegedy, 2015) or Residual Blocks (He et al., 2016). As part of the Deep Learning (DL) (Goodfellow et al., 2016) field, DNNs have revolutionized the creation of software-based applications for problems with a non-deterministic solution space (e.g. object detection, facial recognition, autonomous driving, video processing, among others).
But GPU hardware and labeled data sets come at a cost. In addition, NN training is data-, time- and energy-intensive. This makes the outcome of DL training very valuable: the topology, i.e. the number and type of hidden layers including design characteristics (defined before training); and especially the model, i.e. the values of all the parameters in the trained network.
Furthermore, with the rise of edge computing and the Internet of Things (IoT), NNs are meant to be deployed outside of corporate boundaries, closer to customer business and in potentially insecure environments. Industrial actors take the Intellectual Property (IP) protection of trained DNNs very seriously. This new paradigm calls for solutions to protect the IP of distributed DL inference processing systems, with DNN deployment and execution on decentralized systems.
The lack of solutions for IP protection exposes trained NN owners to reverse engineering of their DL models (Tramèr et al., 2016). As outlined in (Augasta and Kathirvalavakumar, 2012) and (Floares, 2008), attackers can steal trained NN models. In this new coding paradigm, where design patterns are enforced in known and legacy implementations, IP is at stake. The question is not so much how to protect the DNN architecture (since most architectures are grounded on well-known research), but rather how to protect the trained DNN model.
Figure 1: Diagram of IP protection solution.
1.2 State of the Art
Applying security to (Deep) Neural Networks is a current research topic pursued mainly with two different techniques: variants of Fully Homomorphic Encryption (FHE) (Gentry, 2009) and Secure Multi-party Computation (SMC) (Cramer et al., 2015). While FHE techniques allow encrypted addition and multiplication on a single machine, SMC employs gated circuits to perform arithmetic operations on data shared across several communicating machines. With these techniques at hand, NN protection is pursued for two main phases: training and classification/inference.
Secure NN training has been tackled using FHE (Graepel et al., 2012) and SMC (Shokri and Shmatikov, 2015), disregarding protection once the trained model is to be deployed and used. Other Machine Learning models such as linear and logistic regressions have also been trained in a secure way in (Mohassel and Zhang, 2017).
Regarding classification, SMC has led to cooperative solutions where several devices work together to obtain federated inferences (Liu et al., 2017), without supporting deployment of the trained NN to trusted decentralized systems. Inference on FHE-encrypted data was covered in CryptoNets (Gilad-Bachrach et al., 2016) and improved in (Chabanne et al., 2017) and (Hesamifard et al., 2017). While preventing disclosure of data at the inference phase, the security of the model itself is out of their scope.
So far, the only research addressing IP protection of NNs used watermarking (Uchida et al., 2017). Even though this technique can detect infringement, it cannot prevent it, thus failing to preserve the confidentiality of the input data, the inference, or the NN model.
To the best of our knowledge, no other publication
has tackled protection of both trained NN models and
data on decentralized and untrusted systems.
1.3 Proposed Solution
In this paper we propose a solution, leveraging FHE, to protect the IP of the trained NN together with the input data and the output inferences. Once trained, the parameters of the NN model are encrypted homomorphically. The resulting encrypted NN can be deployed on potentially insecure decentralized systems, preserving the trained NN model and mitigating the risk of reverse engineering. Inference can still be carried out over the homomorphically encrypted DNN, ingesting homomorphically encrypted data and producing homomorphically encrypted predictions. Confidentiality of the trained NN, the input data, and the inference results is therefore guaranteed.
This paper is organized as follows: Section 2 provides an overview of our solution and the use case. Section 3 details the fundamentals of our approach. In Section 4, we present the architecture and processes, concluding with future steps in Section 5.
2 NEURAL NETWORK
INTELLECTUAL PROPERTY
PROTECTION SYSTEM
2.1 Overview
Our system is structured in 4 blocks (Figure 1):
1. NN Training: during this phase, unencrypted
data is used to train the NN. Alternatively, we can
import an already trained NN.
2. Encryption of Trained NN: once trained, the NN is protected by encrypting all parameters comprised in the model.
3. Inference on Decentralized Systems: the encrypted NN can be deployed on decentralized systems for DL inference, protecting its IP.
4. Inference Decryption: an encrypted NN produces encrypted inferences, to be decrypted only by the owner of the trained NN.
2.2 Use Case
In this paper, we illustrate our approach with a video surveillance use case for risk prevention in public spaces. Nowadays, cities are equipped with video surveillance infrastructure, where the video stream is manually monitored and analyzed by police officers. This is time-consuming, costly and of questionable efficiency; thus cameras end up being used a posteriori to review incidents. Indeed, smart cities rely on video-protection infrastructure to improve early detection of incidents in public spaces (e.g., early detection of terrorist attacks or abnormal crowd movement). Empowered with deep learning capabilities on the edge, cameras evolve into multi-function sensors. Pushing the computation to where the data is being obtained substantially reduces communication overhead. This way, cameras can provide analytics and feedback, shifting towards a smart city cockpit.
With such an approach, video management shifts from sole protection to versatile monitoring. These cameras not only have a simple, but essential, security role; they can also measure in real time the pulse of the agglomeration through vehicle and people flows, helping to redefine mobility, reduce public lighting costs, smooth traffic flow, etc.
3 FUNDAMENTALS OF IP
PROTECTION
3.1 Homomorphic Encryption
While preserving data privacy, Homomorphic Encryption (HE) schemes allow certain computations on encrypted data, revealing neither the inputs nor the internal states. (Gentry, 2009) first proposed a Fully Homomorphic Encryption (FHE) scheme, which could theoretically compute any kind of function but was computationally intractable.
FHE evolved into more efficient techniques like Somewhat/Leveled Homomorphic Encryption (SHE/LHE), which preserve both addition and multiplication over encrypted data. As in asymmetric encryption, during KeyGen a public key (pub) is generated for encryption and a private key (priv) for decryption. Encrypted operations hold:
$$Enc_{pub}(ax + b) \rightarrow \langle ax + b \rangle_{pub} = \langle ax \rangle_{pub} + \langle b \rangle_{pub} = \langle a \rangle_{pub} \cdot \langle x \rangle_{pub} + \langle b \rangle_{pub} \quad (1)$$
Modern implementations such as HElib (Halevi and Shoup, 2014) or SEAL (Laine and Player, 2016) include Single Instruction Multiple Data (SIMD), allowing multiple data items to be stored in a single ciphertext and vectorizing operations. Hence, FHE protection implies vectorized additions and multiplications.
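As an illustration of Eq. (1) and of SIMD packing, the following minimal sketch evaluates $\langle a \rangle_{pub} \cdot \langle x \rangle_{pub} + \langle b \rangle_{pub}$ using the TenSEAL Python wrapper around SEAL. TenSEAL and all parameter values below are illustrative assumptions, not the configuration of this work.

import tenseal as ts

# CKKS context: one ciphertext packs a whole vector of reals (SIMD slots).
# Parameter values are illustrative assumptions.
context = ts.context(ts.SCHEME_TYPE.CKKS,
                     poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40

a, x, b = [2.0, 3.0], [5.0, 7.0], [1.0, 1.0]
enc_a = ts.ckks_vector(context, a)  # <a>_pub
enc_x = ts.ckks_vector(context, x)  # <x>_pub
enc_b = ts.ckks_vector(context, b)  # <b>_pub

# Element-wise and vectorized: <a>_pub * <x>_pub + <b>_pub = <a*x + b>_pub
enc_result = enc_a * enc_x + enc_b
print(enc_result.decrypt())  # approximately [11.0, 22.0]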
3.2 Data Encryption
The data encryption mechanism depends on the chosen scheme, the most efficient being BGV (Brakerski et al., 2011) and FV (Fan and Vercauteren, 2012). The encryption process is computationally slow; hence it can become a bottleneck for the whole system, with a negative impact on overall performance.
$$X \xrightarrow{encryption} ENC_{pub}(X) = \left\langle X \right\rangle_{pub} \quad (2)$$
3.3 Protecting Deep Neural Networks
Multiple architectures of deep neural networks have
been designed addressing various domains. Our ap-
proach for IP protection is agnostic about the archi-
tecture of the Deep Neural Network. Nonetheless,
in this paper we employ Deep Convolutional Neural
Networks (DCNN), appropriate for video processing.
A DNN with $L$ layers is composed of:
1. An input layer, the tensor of input data $X$;
2. $L - 1$ hidden layers, mathematical computations transforming $X$ somewhat sequentially;
3. An output layer, the tensor of output data $Y$.
We denote the output of layer $i$ as a tensor $A^{[i]}$, with $A^{[0]} = X$ and $A^{[L]} = Y$. Tensors can have different sizes and even different numbers of dimensions.
Layers inside a NN can be categorized as:
Linear: they only involve polynomial operations and can be seamlessly protected using FHE; e.g. the Fully Connected layer (FC), the Convolutional layer (Conv), residual blocks, and mean pooling.
Non-linear: they include other operations (max, exp, division) and must be converted into sums and multiplications; e.g. activation functions, Batch Normalization, max pooling.
Figure 2: Example of architecture in a Deep Convolutional Neural Network.
Selecting a DNN architecture involves choosing the number, types, order and size of the layers. An example of DCNN architecture is shown in Figure 2:
$$[\text{Conv} \rightarrow \text{Pool}]^{n} \rightarrow [\text{FC}]^{m}$$
Generally, DNNs are designed mimicking well-known architectures such as LeNet (LeCun et al., 2015), VGGNet (Simonyan and Zisserman, 2014) or ResNet (He et al., 2016), de-facto standards for object recognition and image classification.
In pursuance of full protection for any given DNN, each layer needs to protect its underlying operations.
3.3.1 Fully Connected Layer (FC)
Also known as Dense Layer, it is composed of $N$ parallel neurons, performing an $\mathbb{R}^1 \rightarrow \mathbb{R}^1$ transformation (Figure 3). We will define:
$a^{[i]} = \left[ a^{[i]}_0 \ldots a^{[i]}_k \ldots a^{[i]}_N \right]^T$ as the output of layer $i$;
$z^{[i]} = \left[ z^{[i]}_0 \ldots z^{[i]}_k \ldots z^{[i]}_N \right]^T$ as the linear output of layer $i$ ($z^{[i]} = a^{[i]}$ if there is no activation function);
$b^{[i]} = \left[ b^{[i]}_0 \ldots b^{[i]}_k \ldots b^{[i]}_N \right]^T$ as the bias of layer $i$;
$W^{[i]} = \left[ w^{[i]}_0 \ldots w^{[i]}_k \ldots w^{[i]}_N \right]^T$ as the weights of layer $i$.
Neuron $k$ performs a linear combination of the output of the previous layer $a^{[i-1]}$ multiplied by the weight vector $w^{[i]}_k$ and shifted with a bias scalar $b^{[i]}_k$, obtaining the linear output $z^{[i]}_k$:
$$z^{[i]}_k = \left( \sum_{l=0}^{M} w^{[i]}_k[l] \, a^{[i-1]}_l \right) + b^{[i]}_k = w^{[i]}_k \cdot a^{[i-1]} + b^{[i]}_k \quad (3)$$
Vectorizing the operations for all the neurons in layer $i$, we obtain the dense layer transformation:
$$z^{[i]} = W^{[i]} a^{[i-1]} + b^{[i]} \quad (4)$$
Figure 3: FC with activation for neuron k.
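For reference, Eq. (4) in the plaintext domain is a single matrix product plus a vector addition; the shapes below are illustrative assumptions.

import numpy as np

W = np.random.rand(4, 3)    # W^[i]: 4 neurons, 3 inputs each
a_prev = np.random.rand(3)  # a^[i-1], output of the previous layer
b = np.random.rand(4)       # b^[i]

z = W @ a_prev + b          # z^[i] = W^[i] a^[i-1] + b^[i]
print(z.shape)              # (4,): one linear output per neuron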
Protecting FC Layer. Since FC is a linear layer,
it can be directly computed in the encrypted domain
using additions and multiplications. Vectorization is
achieved straightforwardly:
$$\left\langle z^{[i]} \right\rangle_{pub} \leftarrow \left\langle W^{[i]} a^{[i-1]} + b^{[i]} \right\rangle_{pub} = \left\langle W^{[i]} \right\rangle_{pub} \cdot \left\langle a^{[i-1]} \right\rangle_{pub} + \left\langle b^{[i]} \right\rangle_{pub} \quad (5)$$
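A minimal sketch of Eq. (5), again with TenSEAL as an illustrative stand-in: both the weight rows (the model IP) and the input are encrypted, and each neuron reduces to an encrypted dot product plus an encrypted bias. Layer sizes and values are assumptions.

import tenseal as ts

context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()  # rotations used by encrypted dot products

W = [[0.5, -1.0, 2.0],          # one weight row per neuron (the model IP)
     [1.5,  0.0, 0.5]]
b = [0.1, -0.2]
a_prev = [1.0, 2.0, 3.0]        # a^[i-1]

enc_W = [ts.ckks_vector(context, row) for row in W]  # <W>_pub, row-wise
enc_b = [ts.ckks_vector(context, [bk]) for bk in b]  # <b>_pub, per neuron
enc_a = ts.ckks_vector(context, a_prev)              # <a^[i-1]>_pub

# <z_k>_pub = <w_k>_pub . <a>_pub + <b_k>_pub for every neuron k
enc_z = [enc_W[k].dot(enc_a) + enc_b[k] for k in range(len(W))]
print([zk.decrypt()[0] for zk in enc_z])  # approximately [4.6, 2.8]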
3.3.2 Activation Function
Activation functions are the major source of non-linearity in DNNs. They are performed element-wise ($\mathbb{R}^0 \rightarrow \mathbb{R}^0$, thus easily vectorized), and are generally located after linear transformations (FC, Conv). All activation functions are positive monotonic.
$$a^{[i]}_k = f_{act}\left( z^{[i]}_k \right) \quad (6)$$
Rectifier Linear Unit (ReLU) is currently considered the most efficient activation function for DL. Several variants have been proposed, such as Leaky ReLU (Maas et al., 2013), ELU (Clevert et al., 2015) or its differentiable version Softplus.
$$ReLU(z) = z^{+} = \max(0, z) \qquad Softplus(z) = \log(e^{z} + 1) \quad (7)$$
Figure 4: Conv layer with activation for map k.
Sigmoid ($\sigma$) is the classical activation function. Its efficiency has been debated in the DL community.
$$Sigmoid(z) = \sigma(z) = \frac{1}{1 + e^{-z}} \quad (8)$$
Hyperbolic Tangent (tanh) is currently being used in the industry because it is easier to train than ReLU: it avoids having any inactive neurons and it keeps the sign of the input.
$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \quad (9)$$
Protecting Activation Functions. Due to their innate non-linearity, activation functions need to be approximated with polynomials. (Gilad-Bachrach et al., 2016) proposed using only $\sigma(z)$, approximating it with a square function. (Chabanne et al., 2017) used Taylor polynomials around $x = 0$, studying performance based on the polynomial degree. (Hesamifard et al., 2017) instead approximate the derivative of the function and then integrate it to obtain their approximation. One alternative would be to use Chebyshev polynomials.
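As a minimal sketch of such an approximation, a least-squares polynomial fit of ReLU over a fixed interval reduces the activation to FHE-friendly sums and multiplications. The interval and degree below are assumptions; the cited works use square-function, Taylor or derivative-based approximations instead.

import numpy as np

z = np.linspace(-4, 4, 1000)         # assumed input range of the activation
relu = np.maximum(0.0, z)

coeffs = np.polyfit(z, relu, deg=2)  # degree-2 least-squares approximation
poly_relu = np.poly1d(coeffs)

# Encrypted evaluation only needs adds/mults: p(z) = c2*z*z + c1*z + c0,
# directly computable on <z>_pub.
print(coeffs)                                # [c2, c1, c0]
print(float(poly_relu(1.5)), max(0.0, 1.5))  # approximation vs. exact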
3.3.3 Convolutional Layer (Conv)
Conv layers constitute a key improvement for image recognition and classification using NNs. The $\mathbb{R}^{2|3} \rightarrow \mathbb{R}^{2|3}$ linear transformation involved is spatial convolution, where a 2D $s \times s$ filter (a.k.a. kernel) is multiplied with subsets (patches) of the 2D input image of size $s \times s$ at defined steps (strides), then added up and shifted by a bias (see Figure 4). For input data with several channels or maps (e.g. RGB counts as 3 channels), the filter is applied to the same patch of each map and then added up into a single value of the output image (cumulative sum across maps). A map in Conv layers is the equivalent of a neuron in FC layers. We define:
$A^{[i]}_k$ as map $k$ of layer $i$;
$Z^{[i]}_k$ as the linear output of map $k$ of layer $i$ ($Z^{[i]}_k = A^{[i]}_k$ in absence of activation function);
$b^{[i]}_k$ as the bias value for map $k$ in layer $i$;
$W^{[i]}_k$ as the $s \times s$ filter/kernel for map $k$.
This operation can be vectorized by smartly replicating data (Ren and Xu, 2015). The linear transformation can be expressed as:
$$Z^{[i]}_k = \left( \sum_{m=0}^{M_{maps}} A^{[i-1]}_m * W^{[i]}_k \right) + b^{[i]}_k \quad (10)$$
Protecting Convolutional Layers. The convolution operation can be decomposed into a series of vectorized sums and multiplications over patches of size $s \times s$:
$$\begin{aligned}
\left\langle Z^{[i]}_k \right\rangle_{pub} &= \left\langle \left( \sum_{m=0}^{M_{maps}} A^{[i-1]}_m * W^{[i]}_k \right) + b^{[i]}_k \right\rangle_{pub} \\
&= \sum_{m=0}^{M_{maps}} \left\langle A^{[i-1]}_m * W^{[i]}_k \right\rangle_{pub} + \left\langle b^{[i]}_k \right\rangle_{pub} \\
&= \left\{ \sum_{m=0}^{M_{maps}} \left\langle A^{[i-1]}_m[j] \right\rangle_{pub} \cdot \left\langle W^{[i]}_k \right\rangle_{pub} \right\}_{[s \times s]} + \left\langle b^{[i]}_k \right\rangle_{pub}
\end{aligned} \quad (11)$$
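The decomposition behind Eq. (10)/(11) can be sketched in plain numpy, which stands in here for the encrypted operations; only per-patch element-wise multiplications and sums appear. Sizes, stride and the absence of padding are assumptions.

import numpy as np

def conv_as_sums(maps, kernels, bias, s, stride=1):
    """maps: (M, H, W) input maps A^[i-1]; kernels: (M, s, s) filters W_k
    for output map k; bias: scalar b_k. Returns the linear output Z_k."""
    M, H, W = maps.shape
    out_h = (H - s) // stride + 1
    out_w = (W - s) // stride + 1
    Z = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = maps[:, i*stride:i*stride+s, j*stride:j*stride+s]
            # element-wise multiply, then cumulative sum across maps:
            Z[i, j] = np.sum(patch * kernels) + bias
    return Z

A = np.random.rand(3, 8, 8)   # 3 input maps (e.g. RGB)
Wk = np.random.rand(3, 3, 3)  # one 3x3 kernel per input map
print(conv_as_sums(A, Wk, bias=0.1, s=3).shape)  # (6, 6)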
3.3.4 Pooling Layer
This layer reduces the input size by using a packing function. The most commonly used functions are max and mean. Similarly to convolutional layers, pooling layers apply their packing function to patches (subsets) of the image of size $s \times s$ at strides (steps) of a defined number of pixels, as depicted in Figure 5.
Figure 5: Max and Mean packing for Pooling layers.
Protecting Pooling Layer. Max pooling can be approximated by the sum of all the values in each patch of size $s \times s$, which is equivalent to scaled mean pooling. Mean pooling can be scaled (sum of values) or standard (multiplying by $1/N$). By employing a flattened input, pooling becomes easily vectorized.
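A minimal numpy sketch of this approximation, with numpy standing in for the encrypted sums; the patch size and input values are assumptions.

import numpy as np

def scaled_mean_pool(A, s=2):
    """Sum of each s x s patch: an FHE-friendly stand-in for max pooling."""
    H, W = A.shape
    A = A[:H - H % s, :W - W % s]  # crop so the size is a multiple of s
    return A.reshape(H // s, s, W // s, s).swapaxes(1, 2).sum(axis=(2, 3))

A = np.arange(16.0).reshape(4, 4)
print(scaled_mean_pool(A))      # each entry is the sum of one 2x2 patch
print(scaled_mean_pool(A) / 4)  # standard mean pooling: multiply by 1/N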
3.3.5 Other Techniques
Batch Normalization (BN) reduces the range of input values by 'normalizing' across data batches: subtracting the mean and dividing by the standard deviation. BN also allows finer tuning using the trained parameters $\beta$ and $\gamma$ ($\varepsilon$ is a small constant used for numerical stability).
$$a^{[i+1]}_k = BN_{\gamma,\beta}\left( a^{[i]}_k \right) = \gamma \, \frac{a^{[i]}_k - E\left[ a^{[i]}_k \right]}{\sqrt{Var\left[ a^{[i]}_k \right] + \varepsilon}} + \beta \quad (12)$$
Protection of BN is achieved by treating division as the inverse of a multiplication (a plaintext sketch of this folding follows below):
$$\left\langle a^{[i+1]}_k \right\rangle_{pub} = \left\langle \gamma \right\rangle_{pub} \cdot \left( \left\langle a^{[i]}_k \right\rangle_{pub} - \left\langle E\left[ a^{[i]}_k \right] \right\rangle_{pub} \right) \cdot \left\langle \frac{1}{\sqrt{Var\left[ a^{[i]}_k \right] + \varepsilon}} \right\rangle_{pub} + \left\langle \beta \right\rangle_{pub} \quad (13)$$
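A plaintext sketch of this folding: once trained, $E[a]$, $Var[a]$, $\gamma$ and $\beta$ are fixed, so the division can be pre-computed and BN collapses into one multiplication and one addition whose coefficients are then encrypted. All values below are illustrative.

import numpy as np

gamma, beta, eps = 1.2, 0.4, 1e-5
mean, var = 0.8, 2.5               # statistics fixed at training time

# Division becomes a plaintext pre-computation; only mult/add remain.
scale = gamma / np.sqrt(var + eps)  # to be encrypted as <scale>_pub
shift = beta - scale * mean         # to be encrypted as <shift>_pub

a = 1.5                             # a^[i]_k
print(scale * a + shift)                               # folded form
print(gamma * (a - mean) / np.sqrt(var + eps) + beta)  # Eq. (12) reference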
Dropout and Data Augmentation only affect the training procedure. They don't require protection.
Residual Block is an aggregation of layers where the input is added unaltered at the end of the block, thus allowing the layers to learn incremental ('residual') modifications (Figure 6).
$$A^{[i]} = A^{[i-1]} + ResBlock\left( A^{[i-1]} \right) \quad (14)$$
Figure 6: Example of a possible Residual Block.
Protection of a ResBlock is achieved by protecting the sum and the layers inside the ResBlock:
$$\left\langle A^{[i]} \right\rangle_{pub} = \left\langle A^{[i-1]} \right\rangle_{pub} + \left\langle ResBlock\left( A^{[i-1]} \right) \right\rangle_{pub} \quad (15)$$
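A sketch of Eq. (14)/(15) with numpy standing in for the encrypted tensors: the residual sum is itself one encrypted addition, so the block is protected once its inner layers are. The two-layer block with a polynomial activation is an assumed example, not a prescribed design.

import numpy as np

def poly_act(z):
    # FHE-friendly polynomial activation (cf. Section 3.3.2)
    return 0.25 * z * z + 0.5 * z

def res_block(a, W1, W2):
    # inner layers: sums and multiplications only
    return poly_act(a @ W1) @ W2

a = np.random.rand(8)                                # A^[i-1]
W1, W2 = np.random.rand(8, 8), np.random.rand(8, 8)
out = a + res_block(a, W1, W2)                       # A^[i], Eq. (14)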
3.4 Model Training and Outcome
Training is data and computationally intensive, performed by means of a backpropagation algorithm that gradually optimizes the network loss function. It is also possible to reuse a previously trained model and apply fine tuning. The result is a trained model:
Weights $W$ and biases $b$ in FC and Conv layers;
$E[A]$, $\frac{1}{\sqrt{Var[A] + \varepsilon}}$, $\beta$ and $\gamma$ parameters in BN.
Those constitute the secrets to be kept when deploying a NN to decentralized systems. We focus solely on protecting the IP of the model, leaving protection of the architecture out of the scope of this paper.
3.5 Inference Decryption
The decryption of the last layer's output $Y$ is simply performed using the private key, as in standard asymmetric encryption schemes:
$$\left\langle A^{[L]} \right\rangle_{pub} \xrightarrow{decryption} DEC_{priv}\left( \left\langle A^{[L]} \right\rangle_{pub} \right) = Y \quad (16)$$
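A minimal sketch of the key separation behind Eq. (2) and Eq. (16), with TenSEAL as an illustrative stand-in: the decentralized system only ever holds a public context, while reading $\langle A^{[L]} \rangle_{pub}$ requires the owner's secret key. Serialization and transport are omitted assumptions.

import tenseal as ts

owner = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                   coeff_mod_bit_sizes=[60, 40, 40, 60])
owner.global_scale = 2 ** 40
secret_key = owner.secret_key()  # stays in the backend

edge = owner.copy()
edge.make_context_public()       # strip the secret key before deployment

enc_y = ts.ckks_vector(edge, [0.1, 0.7, 0.2])  # stand-in for <A^[L]>_pub
# The edge cannot decrypt: its context holds no secret key.
print(enc_y.decrypt(secret_key))               # backend decryption, Eq. (16)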
4 ARCHITECTURE
In this section we outline the architecture and information flows in our IP protection system, whose decomposition can be seen in Figure 7.
Figure 7: Activity Diagram in our solution.
Encryption of Trained NN (Steps 1-4)
In the backend, a NN is trained within a NN Training Agent. The outcome of the training (NN architecture and parameters) is pushed to the Trained NN Protection Agent. Alternatively, an already trained NN can be imported directly into the Protection Agent.
The NN Protection Agent obtains a Fully Homomorphic key pair from the Key Generator component. The DNN is then encrypted and stored together with its homomorphic key pair in the Trained and Protected NN Database.
Deployment of Trained and Protected NN (Step 5)
At the deployment phase, the Trained NN Deployment Agent deploys the NN on decentralized systems, together with its public key.
NN Inference Processing (Steps 6-9)
On the decentralized system, data is collected by a Data Stream Acquisition component and forwarded to the NN Inference Agent. Encrypted inferences are sent to the Inference Decryption Agent for decryption using the private key associated with the NN.
The IP of the NN, together with the computed inferences, is protected from any disclosure on the decentralized system throughout the entire process.
4.1 Sequential Processes
4.1.1 Encryption of Trained NN
Once a Neural Network is trained or imported, we encrypt all its parameters, using the Trained and Protected NN Database to store them and to handle the homomorphic keys (Figure 8).
Figure 8: Sequence diagram of Trained NN Encryption.
4.1.2 Deploy Trained and Protected NN
The newly trained and protected deep neural network
is deployed on the decentralized systems, including:
1. Network architecture;
2. Network model: Encrypted parameters;
3. Public encryption key.
4.1.3 Encrypted Inference
On the decentralized system, data is collected and injected into the deployed NN. We must encrypt $A^{[0]} = X$ with the public encryption key associated with the deployed NN (Figure 9).
Figure 9: Sequence diagram of inference processing.
4.1.4 Inference Decryption
Encrypted inferences are sent to the backend, together with an identifier of the NN used for the inference. The inference is decrypted using the matching private decryption key (Figure 10).
Figure 10: Sequence diagram of inference decryption.
5 CONCLUSION
This paper elaborates on a solution for the protection of the Intellectual Property of decentralized Deep Neural Networks. Leveraging Fully Homomorphic Encryption, we encrypt the trained DNN while preserving the confidentiality of the input data and the resulting inferences. This approach requires modifying the DNN to use linear approximations of activation functions, together with the decomposition of all operations into sums and multiplications, and the encryption of input data at the inference phase.
As future work, we will evaluate our approach on a real smart city use case. The overall performance of the system will be studied and compared with its unencrypted version. In that context, we consider the impact of all operations performed on the backend, including encryption of the DNN or decryption of inferences, as negligible. Nevertheless, considering the resource restrictions on decentralized systems, the encryption of input data as well as the encrypted computations are expected to have a major impact on the performance of the overall system.
REFERENCES
Augasta, M. G. and Kathirvalavakumar, T. (2012). Reverse engineering the neural networks for rule extraction in classification problems. Neural Processing Letters, 35(2):131–150.
Brakerski, Z., Gentry, C., and Vaikuntanathan, V. (2011). Fully homomorphic encryption without bootstrapping. Cryptology ePrint Archive, Report 2011/277.
Chabanne, H., de Wargny, A., Milgram, J., Morel, C., and Prouff, E. (2017). Privacy-preserving classification on deep neural network. IACR Cryptology ePrint Archive, 2017:35.
Clevert, D.-A., Unterthiner, T., and Hochreiter, S. (2015). Fast and accurate deep network learning by exponential linear units (ELUs). arXiv preprint arXiv:1511.07289.
Cramer, R., Damgård, I. B., et al. (2015). Secure Multiparty Computation. Cambridge University Press.
Fan, J. and Vercauteren, F. (2012). Somewhat practical fully homomorphic encryption. Cryptology ePrint Archive, Report 2012/144.
Floares, A. G. (2008). A reverse engineering algorithm for neural networks, applied to the subthalamopallidal network of basal ganglia. Neural Networks, 21(2-3):379–386.
Gentry, C. (2009). A Fully Homomorphic Encryption Scheme. Stanford University.
Gilad-Bachrach, R., Dowlin, N., Laine, K., Lauter, K., Naehrig, M., and Wernsing, J. (2016). CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy. In International Conference on Machine Learning, pages 201–210.
Goodfellow, I., Bengio, Y., Courville, A., and Bengio, Y. (2016). Deep Learning, volume 1. MIT Press, Cambridge.
Graepel, T., Lauter, K., and Naehrig, M. (2012). ML confidential: Machine learning on encrypted data. In International Conference on Information Security and Cryptology, pages 1–21. Springer.
Halevi, S. and Shoup, V. (2014). Algorithms in HElib. In International Cryptology Conference, pages 554–571. Springer.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778.
Hesamifard, E., Takabi, H., and Ghasemi, M. (2017). CryptoDL: Deep neural networks over encrypted data. CoRR, abs/1711.05189.
Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105.
Laine, K. and Player, R. (2016). Simple Encrypted Arithmetic Library - SEAL (v2.0). Technical report, September.
LeCun, Y. et al. (2015). LeNet-5, convolutional neural networks. URL: http://yann.lecun.com/exdb/lenet, page 20.
Liu, J., Juuti, M., Lu, Y., and Asokan, N. (2017). Oblivious neural network predictions via MiniONN transformations. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 619–631. ACM.
Maas, A. L., Hannun, A. Y., and Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30, page 3.
Mohassel, P. and Zhang, Y. (2017). SecureML: A system for scalable privacy-preserving machine learning. In Security and Privacy (SP), 2017 IEEE Symposium on, pages 19–38. IEEE.
Ren, J. S. and Xu, L. (2015). On vectorization of deep convolutional neural networks for vision tasks. In AAAI, pages 1840–1846.
Shokri, R. and Shmatikov, V. (2015). Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 1310–1321. ACM.
Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Tramèr, F., Zhang, F., Juels, A., Reiter, M. K., and Ristenpart, T. (2016). Stealing machine learning models via prediction APIs. In USENIX Security Symposium, pages 601–618.
Uchida, Y., Nagai, Y., Sakazawa, S., and Satoh, S. (2017). Embedding watermarks into deep neural networks. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pages 269–277. ACM.