Authors:
Yuki Saito (1); Hideo Saito (1) and Vincent Frémont (2)
Affiliations:
(1) Faculty of Science and Technology, Keio University, Yokohama, Kanagawa, Japan; (2) CNRS, LS2N, Nantes Université, École Centrale de Nantes, UMR 6004, F-44000 Nantes, France
Keyword(s):
Monocular Depth Estimation, Tilted Images, Gravity Prediction, Convolutional Neural Network.
Abstract:
Monocular depth estimation is a challenging task in computer vision. Although many approaches using convolutional neural networks (CNNs) have been proposed, most of them are trained on large-scale datasets composed mainly of gravity-aligned images. As a result, conventional approaches fail to predict reliable depth for tilted images taken under large pitch and roll camera rotations. To tackle this problem, we propose a novel refining method based on the distribution of gravity directions in the training sets. We designed a gravity rectifier that is trained to transform the gravity direction of a tilted image into a rectified one that matches the gravity-aligned training data distribution. For the evaluation, we employed public datasets and also created our own dataset containing large pitch and roll camera movements. Our experiments showed that our approach successfully rectified the camera rotation and outperformed our baselines, achieving a 29% improvement in absolute relative error (abs rel) over the vanilla model. Additionally, our method achieved accuracy competitive with state-of-the-art monocular depth prediction approaches that account for camera rotation.
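For readers unfamiliar with the metric, the following is a minimal sketch of how abs rel and a relative improvement figure are commonly computed. It is not the authors' evaluation code: the function names `abs_rel` and `relative_improvement`, the convention that a non-positive ground-truth value marks a missing depth, and the toy depth values are all assumptions made for illustration.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error: mean(|pred - gt| / gt).

    Assumption: pixels whose ground-truth depth is not positive are
    treated as missing and skipped.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def relative_improvement(baseline_error, refined_error):
    """Fractional reduction of an error metric (lower error is better)."""
    return (baseline_error - refined_error) / baseline_error


# Toy depths in metres (hypothetical values, not results from the paper).
gt = [1.0, 2.0, 4.0, 0.0]          # 0.0 marks a missing measurement
vanilla = [1.5, 2.5, 5.0, 3.0]     # stand-in for a baseline prediction
rectified = [1.2, 2.2, 4.4, 3.0]   # stand-in for a refined prediction

baseline_error = abs_rel(vanilla, gt)
refined_error = abs_rel(rectified, gt)
improvement = relative_improvement(baseline_error, refined_error)
print(f"abs rel: {baseline_error:.3f} -> {refined_error:.3f} "
      f"({improvement:.0%} improvement)")
```

Under this reading, a 29% improvement means the refined model's abs rel is 29% lower than the vanilla model's, not that it is lower by 0.29 in absolute terms.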