Authors:
Kyle J. Cantrell, Craig D. Miller and Carlos W. Morato
Affiliation:
Department of Robotics Engineering, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA, U.S.A.
Keyword(s):
Autonomous Vehicles, Depth Estimation, Ensemble Neural Networks, Intelligent Transport Systems, Semantic Segmentation, U-Net, Vehicle Perception, VSLAM.
Abstract:
Knowledge of environmental depth is required for successful autonomous vehicle navigation and VSLAM. Current autonomous vehicles utilize range-finding solutions such as LIDAR, RADAR, and SONAR, which suffer drawbacks in both cost and accuracy. Vision-based systems offer the promise of cost-effective, accurate, and passive depth estimation to compete with existing sensor technologies. Existing research has shown that it is possible to estimate depth from 2D monocular vision cameras using convolutional neural networks. Recent advances suggest that depth estimation accuracy can be improved when networks used for supplementary tasks such as semantic segmentation are incorporated into the network architecture. A novel Serial U-Net (NU-Net) architecture is introduced as a modular, ensembling technique for combining the learned features from N-many U-Nets into a single pixel-by-pixel output. Serial U-Nets are proposed to combine the benefits of semantic segmentation and transfer learning for improved depth estimation accuracy. The performance of Serial U-Net architectures is characterized by evaluation on the NYU Depth V2 benchmark dataset and by measuring depth inference times. Autonomous vehicle navigation can substantially benefit from leveraging the latest advances in depth estimation and deep learning.
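To make the serial chaining idea concrete, below is a minimal, hypothetical PyTorch sketch of N U-Net-style networks connected in series, where each stage receives the input image concatenated with the previous stage's per-pixel output. The layer sizes, skip structure, and channel counts are illustrative assumptions only and are not the exact architecture described in the paper.

```python
# Illustrative sketch only: serial chaining of small U-Net-style blocks.
# All layer dimensions and the TinyUNet stand-in are assumptions for illustration.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Toy encoder-decoder with one skip connection (stand-in for a full U-Net)."""
    def __init__(self, in_ch, out_ch, base=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec = nn.Conv2d(base * 2, out_ch, 3, padding=1)  # skip + upsampled features

    def forward(self, x):
        e = self.enc(x)                     # full-resolution features
        d = self.up(self.down(e))           # downsample then upsample back
        return self.dec(torch.cat([e, d], dim=1))

class SerialUNet(nn.Module):
    """Chains N U-Nets: each stage sees the RGB input plus the previous stage's output."""
    def __init__(self, n_stages=2, intermediate_ch=1):
        super().__init__()
        stages, in_ch = [], 3
        for _ in range(n_stages):
            stages.append(TinyUNet(in_ch, intermediate_ch))
            in_ch = 3 + intermediate_ch    # next stage consumes image + previous output
        self.stages = nn.ModuleList(stages)

    def forward(self, image):
        out = self.stages[0](image)
        for stage in self.stages[1:]:
            out = stage(torch.cat([image, out], dim=1))
        return out                          # final per-pixel (e.g., depth) map

if __name__ == "__main__":
    model = SerialUNet(n_stages=2)
    depth = model(torch.randn(1, 3, 64, 64))
    print(depth.shape)  # torch.Size([1, 1, 64, 64])
```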