Table 5: Efficiency comparison between VGG16, IFA_2.1, and IFA_5.1.

Model      Test Accuracy   Training Cost   BCR_tc^acc   Floating Point Operations   BCR_fpo^acc
VGG16      95.22%          2.26e+13        4.21e-12     2.52e+09                    3.79e-08
IFA_2.1    93.77%          7.62e+10        1.23e-09     9.95e+07                    9.42e-07
IFA_5.1    94.47%          4.04e+12        2.34e-11     2.30e+08                    4.10e-07
Improving accuracy by 1.45% with a network that is 300 times larger and consumes 14 times more energy is not such a good deal.
6 CONCLUSION
In this research, we apply the Improved Firefly Algorithm, a training-free neural architecture search technique, to automate model design and find the most suitable neural network for the chosen dataset. This method produced interesting results on the Online Action Detection (OAD) dataset in 48 minutes on a single CPU, improving on the baseline NAS (which uses training over 100 epochs) by a factor of 476 (GPU-based) and a factor of 4400 (CPU-based), and on the proposal of Wang et al. (2023) by a factor of 114. This method also consumes 170 times less power, and pollutes 170 times less, than training-based NAS on a GPU.
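For concreteness, these factors can be turned into the search times they imply for the baselines. The following back-of-the-envelope Python sketch uses only the 48-minute runtime and the speed-up factors reported above; the resulting hours are derived figures, not measurements from the paper:

```python
# Back-of-the-envelope check: search times implied by the reported
# speed-up factors, relative to the 48-minute IFA run on a single CPU.

IFA_MINUTES = 48

speedup_factors = {
    "training-based NAS (GPU)": 476,
    "training-based NAS (CPU)": 4400,
    "Wang et al. (2023)": 114,
}

for baseline, factor in speedup_factors.items():
    implied_hours = IFA_MINUTES * factor / 60
    print(f"{baseline}: ~{implied_hours:.0f} hours")
```

This yields roughly 381 hours for GPU-based NAS, 3520 hours for CPU-based NAS, and 91 hours for the approach of Wang et al. (2023).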
While achieving high performance matters, it is also important to take into account the efficiency of neural networks, whose computational cost is growing exponentially. With this in mind, we propose the benefit-cost ratio (BCR), a metric that evaluates the quality of a neural network in terms of its performance but also of its cost; in Table 5, it is the test accuracy divided by the training cost or by the number of floating point operations.
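As a minimal sketch of how the BCR is computed, the snippet below recomputes both ratios of Table 5 in Python, under the assumption (consistent with every entry in the table) that each BCR is the test accuracy in percent divided by the corresponding cost; the output matches the table up to rounding in the last digit:

```python
# Minimal sketch: recompute the BCR values of Table 5.
# Assumption (consistent with the table): each BCR is the
# test accuracy in percent divided by the corresponding cost.

models = {
    # name: (test accuracy %, training cost, floating point operations)
    "VGG16":   (95.22, 2.26e13, 2.52e9),
    "IFA_2.1": (93.77, 7.62e10, 9.95e7),
    "IFA_5.1": (94.47, 4.04e12, 2.30e8),
}

for name, (acc, training_cost, flops) in models.items():
    bcr_tc = acc / training_cost    # BCR w.r.t. training cost
    bcr_fpo = acc / flops           # BCR w.r.t. floating point operations
    print(f"{name}: BCR_tc = {bcr_tc:.2e}, BCR_fpo = {bcr_fpo:.2e}")
```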
Experimentation on the Online Action Detection dataset showed that the IFA yields a slightly lower-performing model (93.77% accuracy versus 95.22% for the state of the art) but reduces the computational cost in terms of time by a factor of 2.8, and in terms of energy consumption and pollution by a factor of 14, by producing a neural network that is 300 times smaller than the VGG16 model.
As future work, we consider using the BCR as the fitness function of the Improved Firefly Algorithm, to bring the aspect of efficiency into the architecture search.
ACKNOWLEDGEMENTS
This work has been carried out within the French-Canadian project DOMAID, which is funded by the National Agency for Research (ANR-20-CE26-0014-01) and the FRQSC.
REFERENCES
Carvalho, A., Ramos, F., and Chaves, A. (2010). Metaheuristics for the feedforward artificial neural network (ANN) architecture optimization problem. Neural Computing and Applications, 20.
Chen, S., Xu, K., Jiang, X., and Sun, T. (2022). Pyramid
spatial-temporal graph transformer for skeleton-based
action recognition. Applied Sciences, 12(18):9229.
Delamare, M., Laville, C., Cabani, A., and Chafouk, H.
(2021). Graph convolutional networks skeleton-based
action recognition for continuous data stream: A slid-
ing window approach. In Proceedings of the 16th
International Joint Conference on Computer Vision,
Imaging and Computer Graphics Theory and Applica-
tions - Volume 4: VISAPP, pages 427–435. INSTICC,
SciTePress.
Elsken, T., Metzen, J. H., and Hutter, F. (2019). Neural
architecture search: A survey. Journal of Machine
Learning Research, 20(55):1–21.
Lannelongue, L., Grealey, J., and Inouye, M. (2021). Green
algorithms: Quantifying the carbon footprint of com-
putation. Advanced Science, 8(12):2100707.
Laraba, S., Brahimi, M., Tilmanne, J., and Dutoit, T. (2017). 3D skeleton-based action recognition by representing motion capture sequences as 2D-RGB images. Computer Animation and Virtual Worlds, 28.
Li, C., Zhong, Q., Xie, D., and Pu, S. (2018). Co-occurrence
feature learning from skeleton data for action recog-
nition and detection with hierarchical aggregation. In
Proceedings of the Twenty-Seventh International Joint
Conference on Artificial Intelligence, IJCAI-18, pages
786–792. International Joint Conferences on Artificial
Intelligence Organization.
Li, Y., Lan, C., Xing, J., Zeng, W., Yuan, C., and Liu, J.
(2016). Online human action detection using joint
classification-regression recurrent neural networks. In
Leibe, B., Matas, J., Sebe, N., and Welling, M., edi-
tors, Computer Vision – ECCV 2016, pages 203–220,
Cham. Springer International Publishing.
Liu, J., Shahroudy, A., Wang, G., Duan, L.-Y., and Kot, A.
(2019). Skeleton-based online action prediction using
scale selection network. IEEE Transactions on Pattern
Analysis and Machine Intelligence, PP.
Liu, J., Shahroudy, A., Xu, D., Kot, A., and Wang,
G. (2018). Skeleton-based action recognition using
spatio-temporal LSTM network with trust gates. IEEE
Transactions on Pattern Analysis and Machine Intel-
ligence, 40:3007–3021.
Liu, J., Wang, G., Hu, P., Duan, L.-Y., and Kot, A. C.
(2017a). Global context-aware attention LSTM net-