
Engel, J., Stückler, J., and Cremers, D. (2015). Large-scale direct SLAM with stereo cameras. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1935–1942. IEEE.
Fujimoto, S. and Matsunaga, N. (2023). Deep feature-based RGB-D odometry using SuperPoint and SuperGlue. Procedia Computer Science, 227:1127–1134.
Han, X., Tao, Y., Li, Z., Cen, R., and Xue, F. (2020). SuperPointVO: A lightweight visual odometry based on CNN feature extraction. In 2020 5th International Conference on Automation, Control and Robotics Engineering (CACRE), pages 685–691. IEEE.
Handa, A., Whelan, T., McDonald, J., and Davison, A. J. (2014). A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pages 1524–1531. IEEE.
Hou, X., Zhao, H., Wang, C., and Liu, H. (2022). Knowledge driven indoor object-goal navigation aid for visually impaired people. Cognitive Computation and Systems, 4(4):329–339.
Jin, L., Zhang, H., and Ye, C. (2021). A wearable robotic device for assistive navigation and object manipulation. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 765–770. IEEE.
Jin, S., Ahmed, M. U., Kim, J. W., Kim, Y. H., and Rhee, P. K. (2020). Combining obstacle avoidance and visual simultaneous localization and mapping for indoor navigation. Symmetry, 12(1):119.
Li, Y., Yunus, R., Brasch, N., Navab, N., and Tombari, F. (2021). RGB-D SLAM with structural regularities. In 2021 IEEE International Conference on Robotics and Automation (ICRA), pages 11581–11587. IEEE.
Lindenberger, P., Sarlin, P.-E., and Pollefeys, M. (2023). LightGlue: Local feature matching at light speed. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 17627–17638.
Ming, Y., Ye, W., and Calway, A. (2022). iDF-SLAM: End-to-end RGB-D SLAM with neural implicit mapping and deep feature tracking. arXiv preprint arXiv:2209.07919.
Mo, J., Islam, M. J., and Sattar, J. (2021). Fast direct stereo visual SLAM. IEEE Robotics and Automation Letters, 7(2):778–785.
Mollica, G., Legittimo, M., Dionigi, A., Costante, G., and Valigi, P. (2023). Integrating sparse learning-based feature detectors into simultaneous localization and mapping—a benchmark study. Sensors, 23(4):2286.
Mur-Artal, R., Montiel, J. M. M., and Tardos, J. D. (2015). ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 31(5):1147–1163.
Plikynas, D., Indriulionis, A., Laukaitis, A., and Sakalauskas, L. (2022). Indoor-guided navigation for people who are blind: Crowdsourcing for route mapping and assistance. Applied Sciences, 12(1):523.
Qin, T., Li, P., and Shen, S. (2018). VINS-Mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 34(4):1004–1020.
Son, H. and Weiland, J. (2022). Wearable system to guide crosswalk navigation for people with visual impairment. Frontiers in Electronics, 2:790081.
Sturm, J., Engelhard, N., Endres, F., Burgard, W., and Cremers, D. (2012). A benchmark for the evaluation of RGB-D SLAM systems. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 573–580. IEEE.
Sumikura, S., Shibuya, M., and Sakurada, K. (2019). OpenVSLAM: A versatile visual SLAM framework. In Proceedings of the 27th ACM International Conference on Multimedia, pages 2292–2295.
Teed, Z. and Deng, J. (2021). DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras. Advances in Neural Information Processing Systems, 34:16558–16569.
Teed, Z., Lipson, L., and Deng, J. (2024). Deep patch visual odometry. Advances in Neural Information Processing Systems, 36.
Wang, S., Clark, R., Wen, H., and Trigoni, N. (2017). DeepVO: Towards end-to-end visual odometry with deep recurrent convolutional neural networks. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 2043–2050. IEEE.
Wang, W., Hu, Y., and Scherer, S. (2021). TartanVO: A generalizable learning-based VO. In Conference on Robot Learning, pages 1761–1772. PMLR.
Wang, W., Zhu, D., Wang, X., Hu, Y., Qiu, Y., Wang, C., Hu, Y., Kapoor, A., and Scherer, S. (2020). TartanAir: A dataset to push the limits of visual SLAM. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4909–4916. IEEE.
Whelan, T., Kaess, M., Johannsson, H., Fallon, M., Leonard, J. J., and McDonald, J. (2015). Real-time large-scale dense RGB-D SLAM with volumetric fusion. The International Journal of Robotics Research, 34(4-5):598–626.
Zhang, H. and Ye, C. (2017). An indoor wayfinding system based on geometric features aided graph SLAM for the visually impaired. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25(9):1592–1604.
Zhang, X., Yao, X., Zhu, Y., and Hu, F. (2019). An ARCore based user centric assistive navigation system for visually impaired people. Applied Sciences, 9(5):989.
Zhao, W., Liu, S., Shu, Y., and Liu, Y.-J. (2020). Towards better generalization: Joint depth-pose learning without PoseNet. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9151–9161.
Zhu, B., Yu, A., Hou, B., Li, G., and Zhang, Y. (2023). A novel visual SLAM based on multiple deep neural networks. Applied Sciences, 13(17):9630.