breakdown of processing time is as follows: 2% for linear layer execution, 55% for non-linear layer execution, 15% for communication, and 20% for data encryption/decryption. The lack of GPU acceleration for executing non-linear layers within the secure enclave contributes to over half of the processing overhead.
4.6.2 Communication Cost
The "slicing technique" requires frequent communication between the secure enclave and the GPU. When the model evaluation enclave delegates linear layer processing to the GPU, the layer parameters and input data must be transferred to the GPU. After processing, the GPU-based application sends the output back to the evaluation enclave. Communication between the enclaves and the GPU is limited to a maximum payload size of 4 pages, or 4 KB. Given that the average size of a linear layer is 3.375 MB, delegating a single linear layer takes around 863 communication rounds between the TEE and the GPU. This high number of rounds highlights the potential overhead of the "slicing technique".
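The round count above can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes binary units (1 MB = 1024 × 1024 bytes, 4 KB = 4096 bytes) and that each round carries at most one full payload. Under these assumptions the result is 864, in line with the figure of around 863 quoted above.

```python
import math

# Assumption: binary units; the payload cap is the 4 KB stated in the text.
PAYLOAD_LIMIT_BYTES = 4 * 1024
AVG_LINEAR_LAYER_BYTES = int(3.375 * 1024 * 1024)


def communication_rounds(total_bytes, payload_limit=PAYLOAD_LIMIT_BYTES):
    """Number of messages needed to move total_bytes under the payload cap."""
    return math.ceil(total_bytes / payload_limit)


def split_into_messages(data, payload_limit=PAYLOAD_LIMIT_BYTES):
    """Yield successive chunks of at most payload_limit bytes (hypothetical helper)."""
    for offset in range(0, len(data), payload_limit):
        yield data[offset:offset + payload_limit]


rounds = communication_rounds(AVG_LINEAR_LAYER_BYTES)
print(rounds)
```

The same helper shows how the count scales with the payload limit: doubling the cap roughly halves the number of rounds for a layer of the same size.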
4.6.3 Accuracy Loss
We evaluate motion detection accuracy by comparing the top-5 predicted labels and mean losses of our implementation with those of the original model. Our implementation achieves a 98% match with the top-5 predictions of the original model, demonstrating similar accuracy. We also compare the mean loss of our implementation with that of the original model and observe an average accuracy loss of 5%, which results from approximating values with 2-digit floating-point numbers to reduce communication rounds between the TEE and the GPU.
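One way to compute a top-5 match rate of this kind is sketched below in Python. This is an assumption about the metric, not the evaluation code used in the paper: here the match rate is taken to be the mean fraction of the original model's top-5 labels that also appear in the top-5 of the modified implementation. The `quantize` helper is likewise hypothetical and assumes that "2-digit" means two decimal places.

```python
def top_k(scores, k=5):
    """Indices of the k largest scores, highest first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def top_k_match_rate(reference_scores, candidate_scores, k=5):
    """Mean fraction of reference top-k labels also present in the candidate top-k."""
    total = 0.0
    for ref, cand in zip(reference_scores, candidate_scores):
        overlap = set(top_k(ref, k)) & set(top_k(cand, k))
        total += len(overlap) / k
    return total / len(reference_scores)


def quantize(values, digits=2):
    """Round each value to a fixed number of decimal places (assumed meaning of '2-digit')."""
    return [round(v, digits) for v in values]


# Toy scores for one sample with six classes: the two models disagree on one label.
original = [[5.0, 4.0, 3.0, 2.0, 1.0, 0.0]]
modified = [[5.0, 4.0, 3.0, 2.0, 0.0, 1.0]]
print(top_k_match_rate(original, modified))
```

In this toy case four of the five reference labels are recovered. Applying `quantize` to a model's output scores before calling `top_k_match_rate` gives a rough feel for how coarse rounding can reorder closely ranked labels.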
5 CONCLUSION
In this paper, we propose a solution to protect the intellectual property of AI-based software and preserve data privacy. We evaluate our approach in a real-world scenario of risk prevention in public spaces, embedding ML-based motion detection within CCTV cameras. Despite increased processing time, our approach demonstrates feasibility without significant accuracy loss. We also identify potential optimizations of the communication rounds between the secure enclave and the GPU, such as pre-computation of random stream generation. As future work, we plan to extend our approach to cloud confidential computing, addressing security threats in edge-device TEEs and improving processing time. Initial experiments in this direction show promising results.
Security for Distributed Machine Learning