mance. For the UCF-24 dataset, we can theoretically speed up the model by a factor of 6.6 while reducing the mAP by only about 4%.
This result is also reflected in the energy consumption, which is reduced by a factor of 3.2. This decrease is especially important because energy consumption is a crucial factor for edge deployment, enabling the model to run on battery-powered devices. Examining the memory usage and latency of the deployed models confirms that the compressed models are better suited for edge deployment.
For future work, other pruning criteria and scheduling methods can be investigated. In this work, we used sensitivity analysis to determine the pruning schedule; however, this is not necessarily the optimal way to determine it. The search could potentially be automated, as is done in the field of AutoML: for example, He et al. (2018) use a reinforcement learning approach. Furthermore, we used a static post-training quantization method; future work could investigate quantization-aware training. Finally, the quantization results can also be improved by better software support, as discussed in Section 3.3.
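As a concrete illustration of the sensitivity-analysis step, the sketch below sweeps per-layer sparsity levels, measures the accuracy drop of each trial, and keeps the largest sparsity per layer that stays within a tolerance. The toy model, random validation data, tolerance, and candidate sparsities are hypothetical stand-ins for illustration only, not the actual setup used in this work.

```python
"""Hedged sketch of sensitivity analysis for deriving a per-layer
pruning schedule.  All names and values below are illustrative
assumptions, not the paper's configuration."""
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Toy stand-in for the detection backbone (assumed architecture).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4),
)

# Fixed random "validation set" so the sweep is repeatable.
x_val = torch.randn(64, 3, 16, 16)
y_val = torch.randint(0, 4, (64,))

def accuracy(net):
    """Proxy metric; a real setup would evaluate mAP on held-out data."""
    with torch.no_grad():
        return (net(x_val).argmax(1) == y_val).float().mean().item()

baseline = accuracy(model)
tolerance = 0.05              # max allowed accuracy drop (assumed)
candidates = [0.25, 0.5, 0.75]

schedule = {}
for name, module in model.named_modules():
    if not isinstance(module, nn.Conv2d):
        continue
    chosen = 0.0
    for amount in candidates:
        # Prune a copy so each trial starts from the original weights.
        trial = copy.deepcopy(model)
        # L1-norm structured pruning over output filters (dim=0).
        prune.ln_structured(dict(trial.named_modules())[name],
                            "weight", amount=amount, n=1, dim=0)
        if baseline - accuracy(trial) <= tolerance:
            chosen = amount
    schedule[name] = chosen

print(schedule)  # per-layer sparsity targets for the pruning schedule
```

An AutoML approach such as the reinforcement-learning agent of He et al. (2018) would replace this grid sweep with a learned policy that proposes the per-layer sparsities.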
ACKNOWLEDGEMENT
This research is partly supported by the FWO SBO Fellowship 1SA8124N "Knowledge Based Neural Network Compression: Context-Aware Model Abstractions" and by the Flanders Innovation & Entrepreneurship (VLAIO) IMEC.ICON project no. HBC.2021.0658 BoB.
REFERENCES
Bartoldson, B., Morcos, A., Barbu, A., and Erlebacher, G.
(2020). The generalization-stability tradeoff in neu-
ral network pruning. In Larochelle, H., Ranzato,
M., Hadsell, R., Balcan, M. F., and Lin, H., editors,
Advances in Neural Information Processing Systems,
volume 33, pages 20852–20864. Curran Associates,
Inc.
Braun, A., Tuttas, S., Borrmann, A., and Stilla, U.
(2020). Improving progress monitoring by fusing
point clouds, semantic data and computer vision. Au-
tom. Constr., 116:103210.
Crowley, E. J., Turner, J., Storkey, A., and O’Boyle, M.
(2018). A closer look at structured pruning for neural
network compression. pages 1–12.
Feichtenhofer, C., Fan, H., Malik, J., and He, K. (2018).
SlowFast networks for video recognition.
Feichtenhofer, C., Pinz, A., and Zisserman, A. (2016). Con-
volutional Two-Stream network fusion for video ac-
tion recognition.
Frankle, J. and Carbin, M. (2018). The lottery ticket hypoth-
esis: Finding sparse, trainable neural networks. pages
1–42.
Georgiadis, G. (2018). Accelerating convolutional neural networks via activation map compression. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 7078–7088.
Han, S., Mao, H., and Dally, W. J. (2016). Deep compres-
sion: Compressing deep neural network with pruning,
trained quantization and huffman coding. In 4th In-
ternational Conference on Learning Representations,
ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016,
Conference Track Proceedings.
He, Y., Lin, J., Liu, Z., Wang, H., Li, L.-J., and Han, S.
(2018). AMC: AutoML for model compression and
acceleration on mobile devices.
He, Y., Zhang, X., and Sun, J. (2017). Channel pruning for accelerating very deep neural networks. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 1398–1406. IEEE.
IMEC (2023). imec.icon project - BoB. https://www.
imec-int.com/en/research-portfolio/bob. Accessed:
2023-11-13.
Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard,
A., Adam, H., and Kalenichenko, D. (2017). Quan-
tization and training of neural networks for efficient
Integer-Arithmetic-Only inference.
Jhuang, H., Gall, J., Zuffi, S., Schmid, C., and Black, M. J.
(2013). Towards understanding action recognition. In
2013 IEEE International Conference on Computer Vi-
sion, pages 3192–3199. IEEE.
Köpüklü, O., Wei, X., and Rigoll, G. (2019). You only watch once: A unified CNN architecture for Real-Time spatiotemporal action localization.
Krishnamoorthi, R. (2018). Quantizing deep convolutional
networks for efficient inference: A whitepaper.
Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., and Serre,
T. (2011). HMDB: A large video database for human
motion recognition. In 2011 International Conference
on Computer Vision, pages 2556–2563. IEEE.
Lee, N., Ajanthan, T., and Torr, P. H. S. (2018). SNIP:
single-shot network pruning based on connection sen-
sitivity. CoRR, abs/1810.02340.
Li, H., Kadav, A., Durdanovic, I., Samet, H., and Graf, H. P. (2016). Pruning filters for efficient ConvNets. In 5th International Conference on Learning Representations, ICLR 2017, Conference Track Proceedings.
Liu, C. and Wu, H. (2019). Channel pruning based on
mean gradient for accelerating Convolutional Neural
Networks. Signal Processing, 156:84–91.
Loshchilov, I. and Hutter, F. (2017). Decoupled weight de-
cay regularization.
ONNX (2023). Open neural network exchange. https://
onnx.ai/. Accessed: 2023-11-13.
Deep Learning Model Compression for Resource Efficient Activity Recognition on Edge Devices: A Case Study