Processing Systems, pages 2366–2374.
Garg, R., BG, V. K., Carneiro, G., and Reid, I. (2016). Un-
supervised CNN for Single View Depth Estimation:
Geometry to the Rescue. In European Conference on
Computer Vision, pages 740–756. Springer.
Godard, C., Mac Aodha, O., and Brostow, G. J. (2017).
Unsupervised Monocular Depth Estimation with Left-
Right Consistency. In IEEE Conference on Computer
Vision and Pattern Recognition, pages 270–279.
Gordon, A., Li, H., Jonschkowski, R., and Angelova, A.
(2019). Depth from Videos in the Wild: Unsupervised
Monocular Depth Learning from Unknown Cameras.
arXiv preprint arXiv:1904.04998.
Lee, M. and Fowlkes, C. C. (2019). CeMNet: Self-Supervised
Learning for Accurate Continuous Ego-motion
Estimation. In IEEE Conference on Computer
Vision and Pattern Recognition Workshops.
Li, R., Wang, S., Long, Z., and Gu, D. (2018). UnDeepVO:
Monocular Visual Odometry through Unsupervised
Deep Learning. In IEEE International Conference on
Robotics and Automation, pages 7286–7291. IEEE.
Liu, S., Johns, E., and Davison, A. J. (2019). End-to-End
Multi-Task Learning with Attention. In IEEE Con-
ference on Computer Vision and Pattern Recognition,
pages 1871–1880.
Luo, C., Yang, Z., Wang, P., Wang, Y., Xu, W., Nevatia, R.,
and Yuille, A. (2018). Every Pixel Counts++: Joint
Learning of Geometry and Motion with 3D Holistic
Understanding. arXiv preprint arXiv:1810.06125.
Mahjourian, R., Wicke, M., and Angelova, A. (2018). Un-
supervised Learning of Depth and Ego-motion from
Monocular Video using 3D Geometric Constraints.
In IEEE Conference on Computer Vision and Pattern
Recognition, pages 5667–5675.
Misra, I., Shrivastava, A., Gupta, A., and Hebert, M. (2016).
Cross-Stitch Networks for Multi-Task Learning. In
IEEE Conference on Computer Vision and Pattern
Recognition, pages 3994–4003.
Pillai, S., Ambruş, R., and Gaidon, A. (2019). SuperDepth:
Self-Supervised, Super-Resolved Monocular Depth
Estimation. In International Conference on Robotics
and Automation, pages 9250–9256. IEEE.
Ranjan, A., Jampani, V., Balles, L., Kim, K., Sun, D., Wulff,
J., and Black, M. J. (2019). Competitive Collabo-
ration: Joint Unsupervised Learning of Depth, Cam-
era Motion, Optical Flow and Motion Segmentation.
In IEEE Conference on Computer Vision and Pattern
Recognition, pages 12240–12249.
Ruder, S., Bingel, J., Augenstein, I., and Søgaard, A.
(2017). Latent Multi-Task Architecture Learning.
arXiv preprint arXiv:1705.08142.
Santos, A. and Pedrini, H. (2019). Spatio-Temporal Video
Autoencoder for Human Action Recognition. In 14th
International Joint Conference on Computer Vision,
Imaging and Computer Graphics Theory and Appli-
cations, pages 114–123, Prague, Czech Republic.
Souza, M., Fonseca, L., and Pedrini, H. (2018). Im-
provement of Global Motion Estimation in Two-
Dimensional Digital Video Stabilization Methods.
IET Image Processing, 12(12):2204–2211.
Tacon, H., Brito, A., Chaves, H., Vieira, M., Villela, S.,
Maia, H., Concha, D., and Pedrini, H. (2019). Hu-
man Action Recognition Using Convolutional Neural
Networks with Symmetric Time Extension of Visual
Rhythms. In 19th International Conference on Com-
putational Science and its Applications, pages 351–
366, Saint Petersburg, Russia.
Triggs, B., McLauchlan, P. F., Hartley, R. I., and Fitzgibbon,
A. W. (1999). Bundle Adjustment: A Modern Synthe-
sis. In International Workshop on Vision Algorithms,
pages 298–372. Springer.
Wang, C., Buenaposada, J. M., Zhu, R., and Lucey, S.
(2018). Learning Depth from Monocular Videos using
Direct Methods. In IEEE Conference on Computer
Vision and Pattern Recognition, pages 2022–2030.
Xu, H., Zheng, J., Cai, J., and Zhang, J. (2019). Region
Deformer Networks for Unsupervised Depth Estima-
tion from Unconstrained Monocular Videos. arXiv
preprint arXiv:1902.09907.
Yin, Z. and Shi, J. (2018). GeoNet: Unsupervised Learning
of Dense Depth, Optical Flow and Camera Pose.
In IEEE Conference on Computer Vision and Pattern
Recognition, pages 1983–1992.
Zhou, L., Ye, J., Abello, M., Wang, S., and Kaess,
M. (2018). Unsupervised Learning of Monoc-
ular Depth Estimation with Bundle Adjustment,
Super-Resolution and Clip Loss. arXiv preprint
arXiv:1812.03368.
Zhou, T., Brown, M., Snavely, N., and Lowe, D. G.
(2017). Unsupervised Learning of Depth and Ego-
Motion from Video. In IEEE Conference on Computer
Vision and Pattern Recognition, pages 1851–1858.
Zou, Y., Luo, Z., and Huang, J.-B. (2018). DF-Net: Unsupervised
Joint Learning of Depth and Flow using
Cross-Task Consistency. In European Conference on
Computer Vision, pages 36–53.
Self-supervised Depth Estimation based on Feature Sharing and Consistency Constraints