Security for Distributed Machine Learning

Laurent Gomez

, Tianchi Yu

and Patrick Duverger

SAP Security Research, SAP Labs France, Mougins, France

City of Antibes, France

Keywords:

Machine Learning, Edge Computing, Intellectual Property, Data Privacy, Privacy Enhancing Technology,

Trusted Execution Environment.

Abstract:

With the adoption of IoT-like technologies, industrials aim to enhance the business value of their physical

assets and improve their operational efﬁciency. However, IoT devices alone tend to strain enterprise systems

with a sheer volume of unstructured and unﬁltered data. To overcome this challenge, endowing (smart) devices

with AI-based capabilities can signiﬁcantly enhance enterprise system capabilities. However, deploying AI-

based capabilities on potentially insecure edge hardware and platforms introduces new security risks, including

AI model theft, poisoning, and data leaks. This paradigm shift necessitates the protection of distributed AI

applications and data. In this paper, we propose a solution for safeguarding the Intellectual Property and data

privacy of ML-based software. We utilize hardware-assisted Privacy Enhancing Technologies, speciﬁcally

Trusted Execution Environments. We evaluate the effectiveness of our approach in the context of ML-based

motion detection in CCTV cameras. This work is part of a co-innovation project with the Smart City of

Antibes, France.

1 MOTIVATION

1.1 Context

The increasing interconnection of physical objects

has given rise to trends such as Industrial IoT, In-

dustry 4.0, and Edge Computing. These initiatives

aim to maximize business value and enhance opera-

tional efﬁciency by converging Information and Op-

erational Technology (IT/OT). However, IoT technol-

ogy alone lacks intelligence and ﬂoods centralized

Enterprise Systems with unstructured data. Combin-

ing IoT with AI-based capabilities enables distributed

decision-making, reducing latency and costs while

improving insights. This has numerous applications

in predictive maintenance, trafﬁc management, and

production optimization. As IoT technologies are

adopted by Enterprise Systems, new security chal-

lenges arise. Deploying AI on potentially un-trusted

edge hardware and platforms exposes ML models’

conﬁdentiality and data privacy to attacks.

1.2 Problem Statement

The Machine Learning Development Life-Cycle

(MLDLC) consists of three phases: Data Engineer-

ing, AI-based Software Engineering, and AI-based

Software Deployment and Execution. This paper fo-

cuses on security concerns at the Deployment and

Execution phase, speciﬁcally data privacy leaks and

model theft. Attacks such as membership inference,

property inference, or model update attacks put at risk

privacy of ML data, while model reconstruction and

model extraction attacks aim to steal AI models and

intellectual property. Edge computing further intensi-

ﬁes these security challenges due to resource limita-

tions, platform diversity, and increases attack surface

on Enterprise Systems.

1.3 Our Approach

This paper proposes a solution for secure deploy-

ment and execution of AI-based software on edge

devices. We target Intellectual Property (IP) protec-

tion and data privacy by leveraging Trusted Execu-

tion Environments (TEEs). TEEs provide isolated

and secure execution environments - enclaves -, safe-

guarding code and memory against unauthorized ac-

cess or modiﬁcation. Local and remote attestation

further validate the integrity of enclaves instantiated

on trusted platforms. The protocol involves deploy-

ing AI-based software in a secure enclave on an edge

838

Gomez, L., Yu, T. and Duverger, P.

Security for Distributed Machine Learning.

DOI: 10.5220/0012137700003555

In Proceedings of the 20th International Conference on Security and Cryptography (SECRYPT 2023), pages 838-843

ISBN: 978-989-758-666-8; ISSN: 2184-7711

 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)

device by the model owner (MO). The query data

owner (QDO) executes the model using her query in-

put data. The encryption material for query data is

protected within the secure enclave and shared with

the AI-based software enclave for model evaluation.

1.4 Paper Organization

The remainder of this paper is organized as follows.

Section 2 explores the state of the art on security

for machine learning at the deployment and execu-

tion phase. Section 3 elaborates on our approach,

outlining the use of Trusted Execution Environments

(TEE). Section 4 evaluates our approach’s perfor-

mance and feasibility through a smart city scenario

for risk prevention in public spaces. We conclude and

discuss future research directions in Section 5.

2 STATE OF THE ART

Numerous approaches exist for securing ML models

and associated data during deployment and execution.

These approaches can be categorized as crypto-based

PETs and hardware-assisted PETs.

2.1 Crypto-Based PETs

2.1.1 Fully Homomorphic Encryption (FHE)

FHE schemes (Brakerski et al., 2012; Fan and Ver-

cauteren, 2012) enable computations on encrypted

data without revealing the underlying data. Cryp-

toNets (Dowlin et al., 2016) was an early FHE-

based approach for Neural Network inference. How-

ever, FHE incurs high latency and resource overhead.

Recent advancements by Chabanne et al.(Chabanne

et al., 2017) and Juvekar et al.(Juvekar et al., 2018)

have improved performance and accuracy in cryp-

tographic schemes. While crypto-based PETs offer

strong security guarantees, they introduce resource

overhead and may require model modiﬁcation and re-

training. Moreover, their computational expressive-

ness is limited.

2.2 Hardware-Assisted PETs

Trusted Execution Environments (TEE) like Intel

SGX, ARM TrustZone, and AWS Nitro provide se-

cure enclaves for isolated code execution. TEE-

based approaches have been proposed, such as

SLALOM (Tram

er and Boneh, 2018) and CHE-

MIX (Gupta and Raskar, 2018). However, exist-

ing hardware-assisted PETs primarily focus on in-

put data privacy and do not address model protection.

Key management for encryption is also left open. In

summary, securing ML models at deployment and

execution involves a trade-off between security and

resource consumption. Crypto-based PETs provide

strong security but introduce major overhead, while

hardware-assisted PETs offer hardware-based secu-

rity with limitations in model protection.

3 SECURITY FOR DISTRIBUTED

MACHINE LEARNING

We present a solution to protect the Intellectual Prop-

erty (IP) of distributed AI-based software and en-

sure privacy of associated data. Our approach lever-

ages hardware-assisted Privacy Enhancing Technolo-

gies (PETs), speciﬁcally Trusted Execution Environ-

ments (TEEs). We extend the SLALOM approach in-

troduced in Section 2, enabling privacy of query input

data and secure inference within secure enclaves.

3.1 SLALOM Protocol Limitations

SLALOM uses a ”slicing technique” to divide the

neural network into smaller sub-networks for parallel

execution. Linear layers are delegated to a co-located

GPU, while non-linear layers are executed within the

TEE. However, SLALOM lacks means to safeguard

the IP of ML-based software, as model parameters are

transferred in plain text to untrusted co-located GPU

for acceleration.

3.2 Overall Protocol

Depicted in Figure 1, our protocol involves two ac-

tors: the ML model owner (MO) and the query data

owner (QDO). The AI-based software is sealed in a

secure enclave by the MO and securely shared with

the edge device TEE through remote attestation.

Prior to any AI-based model evaluation, QDO

self-generates encryption keys, embedded within a

key enclave, ans deployed along remote attestation on

the edge device. When the QDO queries the AI-based

software, their input data is encrypted using that self-

generated encryption key and sent to the evaluation

enclave. At model evaluation, the data is decrypted

within the secure evaluation enclave and processed

based on the MO-QDO agreement. The processing

result can be encrypted and sent to the MO, QDO, or

both, based on agreed access control policy.

Our protocol achieves the following objectives:

Security for Distributed Machine Learning

839

Figure 1: Overall Protocol.

• AI-Based Software IP Safeguarding: We de-

ploy the AI-based software on TEE to protect the

model’s IP and leverage co-located GPU acceler-

ation extending SLALOM protocol.

• Data Privacy: Input and output data of the AI-

based software are encrypted to ensure data pri-

vacy.

3.3 Distributed QDO Key Management

Our approach requires each QDO to encrypt their

data using their own cryptographic material. To en-

sure proper key management and control, we encap-

sulate the key material within dedicated secure en-

claves co-located with the AI-based software enclave.

The cryptographic material is securely deployed on

the edge device through local attestation between

the cryptographic material enclaves and the AI-based

software enclave.

3.4 Model and Query Data Protection

During the evaluation phase, we adopt the ”slicing

network” technique, as described in SLALOM, to

split the execution of the AI-based software model.

Non-linear layers are processed within the enclave

on the CPU, while linear layers are delegated to a

co-located GPU. To ensure data privacy, we encrypt

the input of the linear layers using a pre-computed

pseudo-random stream. In SLALOM, only the data

inputs delegated to the GPU are encrypted, while the

model parameters (e.g., weights, bias) are sent in clear

text.

In our proposed approach, we guarantee privacy

of model parameters by enhance SLALOM algorithm

as follows:

1: Linear Layer Evaluation:

2: u1 = r1 ∗ W (TEE; pre-computed)

3: u2 = r1 ∗ r2 (TEE; pre-computed)

4: u3 = r2 ∗ x (TEE)

5: ¯x = x + r1;

W = W + r2 (TEE)

6: ¯y = ¯x ∗

W (GPU)

7: y = ¯y − u1 − u2 − u3 (TEE)

8: assert Freivalds(y, x, W ) (TEE)

9: Non-Linear Layer Evaluation:

10: x = F

non−linear

(y)

In this enhanced algorithm, for linear layer eval-

uation, both the model parameters W and the input

data x are encrypted within the TEE (lines 1-5), with

r1 and r2 pseudo-random streams. The evaluation of

linear layers is then delegated to the GPU with the

encrypted inputs (line 6). The resulting output is de-

crypted within the TEE (line 7).

While this approach ensures model and data pri-

vacy during the delegation of linear layers, it intro-

duces additional overhead due to the decryption pro-

cess within the AI-based software enclave. The im-

pact of this overhead is evaluated in Section 4.

3.5 Query Output Access Control

The AI-based software evaluates the access control

policy on evaluation outcome, as agreed between the

QDO and MO. The overhead introduced by AES-256

encryption is negligible, as it is performed only once

per model evaluation.

SECRYPT 2023 - 20th International Conference on Security and Cryptography

840

4 EVALUATION

In this section, we discuss the evaluation of a demon-

stration implementing our approach to a smart city

use case for risk prevention in public spaces. We elab-

orate more on our technical architecture and perfor-

mance evaluation of this demonstrator.

4.1 Risk Prevention in Public Spaces

We assess the performance and feasibility of embed-

ding AI-based motion detection within a CCTV cam-

era in the city of Antibes for risk prevention in public

spaces. The encrypted video stream is processed lo-

cally on the camera. Motion detection classiﬁcation

(e.g., jumping, falling, cycling) triggers automated

alerts to public safety and security forces. Our aim

is to optimize the use of Public Safety & Security re-

sources and maximize the value of existing CCTV

systems. Following the previously introduced ter-

minology on Model Owner and Query Data Owner,

we consider the ML-based motion detection software

provider as the MO and the smart city as the QDO

(e.g video stream owner).

4.2 ST-GCN-Based Motion Detection

Motion detection is crucial for analyzing human ac-

tivities and identifying incidents in hazardous ar-

eas. In our demonstrator, we employ a Spatial-

Temporal Graph Convolutional Network (ST-GCN)

(Yan et al., 2018). ST-GCN combines Graph Con-

volutional Networks (GCNs) and Temporal Convolu-

tional Networks (TCNs) to enhance motion detection

precision. Notable ST-GCN models include ASGCN

(Shi et al., 2019), and ST-TR (Plizzari et al., 2021).

4.3 Overall Protocol

In Figure 3, motion detection process is broken down

into two main steps: (i) extraction of human skeletons

as graph of joints, (ii) ST-GCN-based human motion

detection (e.g., standing, sitting, walking).

4.3.1 Graph-Based Skeleton Extraction

As implemented by (Yan et al., 2018), we em-

ploy OpenPose (Cao et al., 2019) to extract graph-

based skeletons from video streams. OpenPose is

a real-time multi-person keypoint detection library

that combines pose estimation and image processing

techniques for estimating 2D poses of multiple indi-

viduals in an image. The keypoints represent joints

and body parts (e.g., head, elbows, hips, or knees),

and OpenPose generates a graph-based representa-

tion of these interconnected keypoints through post-

processing. We perform graph-based skeleton detec-

tion using OpenPose outside of the TEE due to the

complexity of implementation and dependencies on

third-party libraries, which deter code migration into

enclaves.

4.3.2 ST-GCN Implementation

Our evaluation is made on the deployment of the

ST-GCN model within an enclave on Jetson Xavier

AGX (NVIDIA, 2019). The ST-GCN model com-

prises Convolution Layers and an average pooling

layer. GCN performs Spatial Graph Convolution by

taking feature maps and a normalized weighted adja-

cency matrix as inputs. It multiplies the feature maps

with the adjacency matrix to process the data in each

frame. TCN operates like typical CNNs, extracting

features. The Residual Networks’ output is bitwise

added to the result of GCN+TCN, addressing gradient

issues, preserving inference accuracy, and preventing

network degradation. Figure 2 illustrates the ST-GCN

model architecture.

4.4 Hardware Setup

We implement our prototype on an NVIDIA Jetson

AGX Xavier platform (NVIDIA, 2019). This edge

device is equipped with an ARM-based TEE, based

on TrustZone Technology. TEE functionalities for

enclave generation & execution, local and remote

attestation are abstracted through the software ab-

straction layer Trusty (Android Open Source Project,

2019). Trusty is part of the Android Open Source

Project (AOSP). Hardware-wise, Jetson Xavier AGX

is equipped with 64 GPU tensor cores, 8 CPU tensor

cores. With 32 GB of RAM, this device can allocate

up to 128 MB of RAM to Trusty, for secure enclave

development. This limits the processing capabilities

within enclaves. Further technical speciﬁcations can

be found here (NVIDIA, 2019).

4.5 Technical Architecture

4.5.1 Distributed QDO Key Management

To ensure query data conﬁdentiality and privacy, each

Query Data Owner (QDO) encrypts their data sym-

metrically using self-generated AES-128 keys, em-

bedded in a key enclave. Remote attestation, with

RSA-2048 encryption, is used for key enclave deploy-

ment on the edge device (step 2 in Figure 1). AES-128

keys are hard-coded in the key enclaves for simplicity

Security for Distributed Machine Learning

841

Figure 2: Architecture.

in our experiment. Key enclaves guarantee key con-

ﬁdentiality and integrity. The evaluation enclave de-

crypts QDO inputs (step 3 in Figure 1) by extracting

cryptographic materials from key enclaves. Local at-

testation, based on RSA-2048, enables key exchange

between key and model evaluation enclaves.

Communication overhead from key exchange is

minimal, occurring once per ML model evaluation.

It is insigniﬁcant compared to encryption and com-

munication iterations between the evaluation enclave

and the GPU.

4.5.2 Protection of AI-Based Software

Intellectual Property

Figure 1 depicts the secure deployment and evaluation

of the ST-GCN model within an enclave for model

integrity and conﬁdentiality on the edge device. In

step 4, we use the ”slicing network” technique to ac-

celerate model evaluation with the co-located GPU.

Linear layers (convolutional layers) are executed on

the co-located GPU for hardware acceleration, while

non-linear layers (RELU, Batch Normalization, and

Average Pooling) are executed in the AI-based Soft-

ware Enclave.

To protect the privacy and conﬁdentiality of model

parameters and input data, we employ a pseudo-

random stream encryption technique. It encrypts

delegated data and parameters using random data

streams generated uniformly, with element-wise ad-

dition to the plaintext. Using a secure enclave and

co-located GPUs for data privacy and IP protection

increases communication overhead between the en-

clave and GPU, impacting performance due to addi-

tional encryption and decryption cycles and random

data generation.

4.5.3 Query Output Fairness Access

In our use case, the model evaluation outcome is lim-

ited to the city and intended for police forces and pub-

lic safety services. We encrypt the outcome using an

AES-128 encryption key owned by the city of An-

tibes, embedded within the evaluation enclave. The

impact on prototype performance is minimal, occur-

ring only once.

4.6 Results

We assess the performance of our demonstrator in

terms of processing time, communication cost, and

model accuracy to evaluate the feasibility of our ap-

proach in a real use case. We evaluate our pro-

totype dynamically, without pre-computed optimiza-

tions. All random number generation and matrix op-

erations are performed at run-time.

4.6.1 Processing Overhead

Excluding OpenPose execution, we measure the pro-

cessing overhead of graph-based skeleton evaluation

using the ST-GCN model. Speciﬁcally, we focus on

GPU-based linear layer execution, TEE-based non-

linear layer execution, enclave-GPU communication,

and encryption/decryption rounds. Our TEE-based

implementation adds 28.7 seconds to the evaluation

time, compared to the original ST-GCN model’s 0.8

to 1.2 seconds for a 10-second video stream. The

SECRYPT 2023 - 20th International Conference on Security and Cryptography

842

breakdown of processing time is as follows: 2% for

linear layer execution, 55% for non-linear layer ex-

ecution, 15% for communication, and 20% for data

encryption/decryption. The lack of GPU acceleration

for executing non-linear layers within the secure en-

clave contributes to over half of the processing over-

head.

4.6.2 Communication Cost

The ”slicing technique” necessitates frequent commu-

nication between the secure enclave and GPU. When

the model evaluation enclave delegates linear layer

processing to the GPU, the layer parameters and in-

put data must be transferred to the GPU. After pro-

cessing, the GPU-based application sends the output

back to the evaluation enclave. Communication be-

tween the enclaves and GPU is limited to a maximum

payload size of 4 pages or 4KB. Considering that the

average size of a linear layer is 3.375 MB, delegating

a single linear layer leads to around 863 communi-

cation rounds between the TEE and GPU. This high

number of communication rounds highlights the po-

tential overhead of the ”slicing technique”.

4.6.3 Accuracy Loss

We evaluate motion detection accuracy by comparing

the top-5 predicted labels and mean losses of our im-

plementation to the original model. Our implementa-

tion achieves a 98% match with the top-5 predictions,

demonstrating similar accuracy to the original model.

Additionally, we assess the mean loss of our imple-

mentation compared to the original model. We ob-

serve an average accuracy loss of 5% resulting from

approximating the use of 2-digit ﬂoating points to re-

duce communication rounds between the TEE and

GPU.

5 CONCLUSION

In this paper, we propose a solution to protect AI

based software’s Intellectual Property and preserve

data privacy. We evaluate our approach in a real-

world scenario of risk prevention in public spaces,

embedding ML-based motion detection within CCTV

cameras. Despite increased processing time, our ap-

proach demonstrates feasibility without signiﬁcant

accuracy loss. We also identify potential optimiza-

tion on communication rounds between the secure en-

clave and the GPU, such as pre-computation of ran-

dom stream generation. As future work, we plan to

extend our approach to cloud conﬁdential computing,

addressing security threats in edge device TEE and

improving processing time. Initial experiments in this

direction show promising results.

REFERENCES

Android Open Source Project (2019). Trusty.

”source.android.com/docs/security/features/trusty”.

Brakerski, Z., Gentry, C., and Vaikuntanathan, V. (2012).

Fully homomorphic encryption without bootstrap-

ping. In 3rd Innovations in Theoretical Computer Sci-

ence Conference.

Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., and

Sheikh, Y. A. (2019). Openpose: Realtime multi-

person 2d pose estimation using part afﬁnity ﬁelds.

IEEE Transactions on Pattern Analysis and Machine

Intelligence.

Chabanne, H., de Wargny, A., Milgram, J., Morel, C., and

Prouff, E. (2017). Privacy-preserving classiﬁcation

on deep neural network. IACR Cryptol. ePrint Arch.,

page 35.

Dowlin, N., Gilad-Ba, R., Laine, K., Lauter, K., Naehrig,

M., and Wernsing, J. (2016). Cryptonets: Applying

neural networks to encrypted data with high through-

put and accuracy. In ICML’16: Proceedings of the

33rd International Conference on International Con-

ference on Machine Learning.

Fan, J. and Vercauteren, F. (2012). Somewhat practical fully

homomorphic encryption. IACR Cryptology ePrint

Archive.

Gupta, O. and Raskar, R. (2018). Distributed learning of

deep neural network over multiple agents. Journal of

Network and Computer Applications, 116:1–8.

Juvekar, C., Vaikuntanathan, V., and Chandrakasan, A.

(2018). Gazelle: A low latency framework for secure

neural network inference. In 27th USENIX Security

Symposium.

NVIDIA (2019). Jetson XAVIER AGX.

”https://www.nvidia.com/fr-fr/autonomous-

machines/embedded-systems/jetson-agx-xavier/”.

Plizzari, C., Cannici, M., and Matteucci, M. (2021). Spatial

temporal transformer network for skeleton-based ac-

tion recognition. In Pattern Recognition. ICPR Inter-

national Workshops and Challenges, pages 694–701,

Cham. Springer International Publishing.

Shi, L., Zhang, Y., Cheng, J., and Lu, H. (2019). Skeleton-

based action recognition with directed graph neural

networks. In Proceedings of the IEEE/CVF Con-

ference on Computer Vision and Pattern Recognition

(CVPR).

Tram

er, F. and Boneh, D. (2018). Slalom: Fast, veriﬁable

and private execution of neural networks in trusted

hardware.

Yan, S., Xiong, Y., and Lin, D. (2018). Spatial temporal

graph convolutional networks for skeleton-based ac-

tion recognition. In Proceedings of the Thirty-Second

Annual Conference on Innovative Applications of

Artiﬁcial Intelligence, AAAI’18/IAAI’18/EAAI’18.

AAAI Press.

Security for Distributed Machine Learning

843