generated only once. Of course, if the video dataset grows,
the graph must be generated again, but this limitation
also affects any other classification method from the
literature.
Thus, in general, the proposed approach proved to be
promising when compared with competing methods from the
literature. In addition, it opens up many possibilities
for future modifications, improvements, and analyses.
One relevant direction is to try methods that work in
batches together with models based on geometric deep
learning, which could provide even more flexibility.
ACKNOWLEDGEMENTS
This work was supported by Coordination for the
Improvement of Higher Education Personnel (CAPES),
National Council of Scientific and Technological
Development (CNPq), Fundação Araucária, SETI and
UTFPR.
Video Action Classification through Graph Convolutional Networks