Point Cloud based Hierarchical Deep Odometry Estimation