Hommos, O., Pintea, S. L., Mettes, P. S., and van
Gemert, J. C. (2018). Using Phase Instead of Op-
tical Flow for Action Recognition. arXiv preprint
arXiv:1809.03258.
Ji, S., Xu, W., Yang, M., and Yu, K. (2013). 3D Convo-
lutional Neural Networks for Human Action Recog-
nition. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 35(1):221–231.
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Suk-
thankar, R., and Fei-Fei, L. (2014). Large-Scale Video
Classification with Convolutional Neural Networks.
In IEEE Conference on Computer Vision and Pattern
Recognition, pages 1725–1732.
Kuehne, H., Jhuang, H., Stiefelhagen, R., and Serre, T.
(2013). HMDB51: A Large Video Database for Hu-
man Motion Recognition. In High Performance Com-
puting in Science and Engineering, pages 571–582.
Springer.
Ma, C.-Y., Chen, M.-H., Kira, Z., and AlRegib, G. (2019).
TS-LSTM and Temporal-Inception: Exploiting Spa-
tiotemporal Dynamics for Activity Recognition. Sig-
nal Processing: Image Communication, 71:76–87.
Moreira, T., Menotti, D., and Pedrini, H. (2017). First-
Person Action Recognition Through Visual Rhythm
Texture Description. In IEEE International Confer-
ence on Acoustics, Speech and Signal Processing,
pages 2627–2631. IEEE.
Murofushi, T. and Sugeno, M. (1989). An Interpretation
of Fuzzy Measures and the Choquet Integral As an
Integral with Respect to a Fuzzy Measure. Fuzzy Sets
and Systems, 29(2):201–227.
Murofushi, T. and Sugeno, M. (2000). Fuzzy Measures and
Fuzzy Integrals. In Grabisch, M., Murofushi, T., and
Sugeno, M., editors, Fuzzy Measures and Integrals –
Theory and Applications, pages 3–41. Physica Verlag,
Heidelberg.
Ng, J. Y.-H., Hausknecht, M., Vijayanarasimhan, S.,
Vinyals, O., Monga, R., and Toderici, G. (2015). Be-
yond Short Snippets: Deep Networks for Video Clas-
sification. In IEEE Conference on Computer Vision
and Pattern Recognition, pages 4694–4702.
Rao, B. S. (2018). A Fuzzy Fusion Approach for Mod-
ified Contrast Enhancement Based Image Forensics
Against Attacks. Multimedia Tools and Applications,
77(5):5241–5261.
Ryoo, M. S. and Matthies, L. (2016). First-Person Ac-
tivity Recognition: Feature, Temporal Structure, and
Prediction. International Journal of Computer Vision,
119(3):307–328.
Santos, A., Paiva, J., Toledo, C., and Pedrini, H. (2016).
Improved Human Skin Segmentation Using Fuzzy Fu-
sion Based on Optimized Thresholds by Genetic Al-
gorithms. In Hybrid Soft Computing for Image Seg-
mentation, pages 185–207. Springer.
Santos, A. and Pedrini, H. (2019). Spatio-Temporal Video
Autoencoder for Human Action Recognition. In 14th
International Joint Conference on Computer Vision,
Imaging and Computer Graphics Theory and Appli-
cations, pages 114–123, Prague, Czech Republic.
Simonyan, K. and Zisserman, A. (2014). Two-Stream
Convolutional Networks for Action Recognition in
Videos. In Ghahramani, Z., Welling, M., Cortes, C.,
Lawrence, N., and Weinberger, K., editors, Advances
in Neural Information Processing Systems 27, pages
568–576. Curran Associates, Inc.
Soomro, K., Zamir, A. R., and Shah, M. (2012). UCF101:
A Dataset of 101 Human Actions Classes from Videos
in the Wild. arXiv preprint arXiv:1212.0402.
Soria-Frisch, A. (2004). Soft Data Fusion for Computer
Vision. Fraunhofer-IRB-Verlag.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna,
Z. (2016). Rethinking the Inception Architecture for
Computer Vision. In IEEE Conference on Computer
Vision and Pattern Recognition, pages 2818–2826.
Szeliski, R. (2010). Computer Vision: Algorithms and Ap-
plications. Springer Science & Business Media.
Tahani, H. and Keller, J. M. (1990). Information Fu-
sion in Computer Vision using the Fuzzy Integral.
IEEE Transactions on Systems, Man, and Cybernet-
ics, 20(3):733–741.
Wang, L., Ge, L., Li, R., and Fang, Y. (2017). Three-Stream
CNNs for Action Recognition. Pattern Recognition
Letters, 92:33–40.
Wang, L., Xiong, Y., Wang, Z., and Qiao, Y. (2015). To-
wards Good Practices for very Deep Two-Stream Con-
vnets. arXiv preprint arXiv:1507.02159.
Wang, L., Xiong, Y., Wang, Z., Qiao, Y., Lin, D., Tang,
X., and Van Gool, L. (2016). Temporal Segment
Networks: Towards Good Practices for Deep Action
Recognition. In European Conference on Computer
Vision, pages 20–36. Springer.
Zach, C., Pock, T., and Bischof, H. (2007). A Duality
Based Approach for Realtime TV-L1 Optical Flow.
In Joint Pattern Recognition Symposium, pages 214–
223. Springer.
Zhu, Y., Lan, Z., Newsam, S., and Hauptmann, A. G.
(2017). Hidden Two-Stream Convolutional Net-
works for Action Recognition. arXiv preprint
arXiv:1704.00389.
Fuzzy Fusion for Two-stream Action Recognition