Carlevaris-Bianco, N., Ushani, A. K., and Eustice, R. M.
(2015). University of Michigan North Campus long-
term vision and lidar dataset. International Journal of
Robotics Research, 35(9):1023–1035.
Cort
´
es, S., Solin, A., Rahtu, E., and Kannala, J. (2018). Ad-
vio: An authentic dataset for visual-inertial odometry.
In Proceedings of the European Conference on Com-
puter Vision (ECCV), pages 419–434.
Engel, J., Usenko, V., and Cremers, D. (2016). A photo-
metrically calibrated benchmark for monocular visual
odometry. arXiv preprint arXiv:1607.02555.
Escalera, S., Gonz
`
alez, J., Bar
´
o, X., Reyes, M., Guyon,
I., Athitsos, V., Escalante, H., Sigal, L., Argyros, A.,
Sminchisescu, C., et al. (2013). Chalearn multi-modal
gesture recognition 2013: grand challenge and work-
shop summary. In Proceedings of the 15th ACM on
International conference on multimodal interaction,
pages 365–368.
Geiger, A., Lenz, P., Stiller, C., and Urtasun, R. (2013).
Vision meets robotics: The kitti dataset. The Inter-
national Journal of Robotics Research, 32(11):1231–
1237.
Geiger, A., Lenz, P., and Urtasun, R. (2012). Are we ready
for autonomous driving? the kitti vision benchmark
suite. In 2012 IEEE conference on computer vision
and pattern recognition, pages 3354–3361. IEEE.
Handa, A., Whelan, T., McDonald, J., and Davison, A. J.
(2014). A benchmark for rgb-d visual odometry, 3d
reconstruction and slam. In 2014 IEEE international
conference on Robotics and automation (ICRA), pages
1524–1531. IEEE.
Kim, T.-K., Wong, S.-F., and Cipolla, R. (2007). Tensor
canonical correlation analysis for action classification.
In 2007 IEEE Conference on Computer Vision and
Pattern Recognition, pages 1–8. IEEE.
Latif, G., Mohammad, N., Alghazo, J., AlKhalaf, R., and
AlKhalaf, R. (2019). Arasl: Arabic alphabets sign
language dataset. Data in brief, 23:103777.
Liu, L. and Shao, L. (2013). Learning discriminative repre-
sentations from rgb-d video data. In Twenty-third in-
ternational joint conference on artificial intelligence.
Molchanov, P., Yang, X., Gupta, S., Kim, K., Tyree, S., and
Kautz, J. (2016). Online detection and classification
of dynamic hand gestures with recurrent 3d convolu-
tional neural network. In Proceedings of the IEEE
conference on computer vision and pattern recogni-
tion, pages 4207–4215.
Mur-Artal, R. and Tard
´
os, J. D. (2017). Orb-slam2:
An open-source slam system for monocular, stereo,
and rgb-d cameras. IEEE transactions on robotics,
33(5):1255–1262.
Nuzzi, C., Pasinetti, S., Pagani, R., Coffetti, G., and San-
soni, G. (2021). Hands: an rgb-d dataset of static
hand-gestures for human-robot interaction. Data in
Brief, 35:106791.
Pfrommer, B., Sanket, N., Daniilidis, K., and Cleve-
land, J. (2017). Penncosyvio: A challenging vi-
sual inertial odometry benchmark. In 2017 IEEE In-
ternational Conference on Robotics and Automation
(ICRA), pages 3847–3854.
Ramezani, M., Wang, Y., Camurri, M., Wisth, D., Mat-
tamala, M., and Fallon, M. (2020). The newer col-
lege dataset: Handheld lidar, inertial and vision with
ground truth. In 2020 IEEE/RSJ International Confer-
ence on Intelligent Robots and Systems (IROS), pages
4353–4360. IEEE.
Runz, M., Buffier, M., and Agapito, L. (2018). Maskfu-
sion: Real-time recognition, tracking and reconstruc-
tion of multiple moving objects. In 2018 IEEE Inter-
national Symposium on Mixed and Augmented Reality
(ISMAR), pages 10–20. IEEE.
Schubert, D., Goll, T., Demmel, N., Usenko, V., St
¨
uckler,
J., and Cremers, D. (2018). The tum vi bench-
mark for evaluating visual-inertial odometry. In
2018 IEEE/RSJ International Conference on Intelli-
gent Robots and Systems (IROS), pages 1680–1687.
Smith, M., Baldwin, I., Churchill, W., Paul, R., and New-
man, P. (2009). The new college vision and laser data
set. The International Journal of Robotics Research,
28(5):595–599.
Starner, T., Weaver, J., and Pentland, A. (1998). Real-
time american sign language recognition using desk
and wearable computer based video. IEEE Trans-
actions on pattern analysis and machine intelligence,
20(12):1371–1375.
Sturm, J., Engelhard, N., Endres, F., Burgard, W., and Cre-
mers, D. (2012). A benchmark for the evaluation of
rgb-d slam systems. In 2012 IEEE/RSJ international
conference on intelligent robots and systems, pages
573–580. IEEE.
Wang, W., Zhu, D., Wang, X., Hu, Y., Qiu, Y., Wang,
C., Hu, Y., Kapoor, A., and Scherer, S. (2020). Tar-
tanair: A dataset to push the limits of visual slam. In
2020 IEEE/RSJ International Conference on Intelli-
gent Robots and Systems (IROS), pages 4909–4916.
IEEE.
Whelan, T., Leutenegger, S., Salas-Moreno, R., Glocker,
B., and Davison, A. (2015). Elasticfusion: Dense slam
without a pose graph. Robotics: Science and Systems.
Yan, H., Shan, Q., and Furukawa, Y. (2018). Ridi: Robust
imu double integration. In Proceedings of the Euro-
pean Conference on Computer Vision (ECCV), pages
621–636.
Zhang, L., Camurri, M., and Fallon, M. (2021). Multi-
camera lidar inertial extension to the newer college
dataset. arXiv preprint arXiv:2112.08854.
Zhang, Y., Cao, C., Cheng, J., and Lu, H. (2018). Egoges-
ture: A new dataset and benchmark for egocentric
hand gesture recognition. IEEE Transactions on Mul-
timedia, 20(5):1038–1050.
ICAART 2024 - 16th International Conference on Agents and Artificial Intelligence
1344