Monocular Depth Ordering using Perceptual Occlusion Cues
Babak Rezaeirowshan, Coloma Ballester, Gloria Haro
2016
Abstract
In this paper we propose a method to estimate a global depth order between the objects of a scene using information from a single image taken with an uncalibrated camera. The method stems from early vision cues such as occlusion and convexity and uses them to infer both a local and a global depth order. Monocular occlusion cues, namely T-junctions and convexities, contain information suggesting a local depth order between neighbouring objects. Combining these cues is preferable to using either one alone because, while the information conveyed by T-junctions is perceptually stronger, T-junctions are less prevalent than convexity cues in natural images. We propose a novel convexity detector that also establishes a local depth order. At T-junctions, the partial order is extracted using a curvature-based multi-scale feature. Finally, a global depth order is computed, i.e., a full order of all shapes that is as consistent as possible with the computed partial orders and that tolerates conflicts among them. An integration scheme based on a Markov chain approximation of the rank aggregation problem is used for this purpose. The experiments conducted show that the proposed method compares favorably with the state of the art.
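The integration step can be illustrated with a minimal sketch of Markov-chain rank aggregation over pairwise occlusion cues. The abstract does not specify the chain, so everything concrete here is an assumption made for illustration: the function name `global_depth_order`, the `(front, back, weight)` cue format, the transition rule (move from an object to objects believed to be in front of it, in proportion to the cue weight), the self-loop for objects with nothing in front of them, and the damping factor. It should not be read as the authors' implementation.

```python
def global_depth_order(n, cues, damping=0.85, iters=200):
    """Order n objects from nearest to farthest.

    cues: iterable of (front, back, weight), each a local cue saying that
    object `front` occludes object `back` with confidence `weight`.
    All names and the transition rule are illustrative assumptions.
    """
    # w[i][j]: accumulated evidence that object j lies in front of object i.
    w = [[0.0] * n for _ in range(n)]
    for front, back, weight in cues:
        w[back][front] += weight

    # Row-stochastic transition matrix: from object i, move to an object
    # believed to be in front of it; stay put if nothing is in front.
    P = []
    for i in range(n):
        total = sum(w[i])
        if total == 0.0:
            row = [1.0 if j == i else 0.0 for j in range(n)]
        else:
            row = [w[i][j] / total for j in range(n)]
        P.append(row)

    # Power iteration with uniform teleportation (damping) so that the
    # chain is ergodic and conflicting cues are tolerated.
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [
            (1.0 - damping) / n
            + damping * sum(pi[i] * P[i][j] for i in range(n))
            for j in range(n)
        ]

    # More stationary mass means more evidence of being in front.
    return sorted(range(n), key=lambda j: (-pi[j], j))


if __name__ == "__main__":
    # Three shapes; the last cue conflicts with the second one.
    cues = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 0.5), (2, 1, 0.2)]
    print(global_depth_order(3, cues))
```

In this sketch a weak conflicting cue only shifts a little probability mass, so the stronger, mutually consistent cues still determine the final order. Cue weights could, for example, be set higher for T-junctions than for convexities to reflect their stronger perceptual evidence, although the weighting used in the paper is not stated in the abstract.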
Paper Citation
in Harvard Style
Rezaeirowshan B., Ballester C. and Haro G. (2016). Monocular Depth Ordering using Perceptual Occlusion Cues. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016), ISBN 978-989-758-175-5, pages 431-441. DOI: 10.5220/0005726404310441
in Bibtex Style
@conference{visapp16,
author={Babak Rezaeirowshan and Coloma Ballester and Gloria Haro},
title={Monocular Depth Ordering using Perceptual Occlusion Cues},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={431-441},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005726404310441},
isbn={978-989-758-175-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)
TI - Monocular Depth Ordering using Perceptual Occlusion Cues
SN - 978-989-758-175-5
AU - Rezaeirowshan B.
AU - Ballester C.
AU - Haro G.
PY - 2016
SP - 431
EP - 441
DO - 10.5220/0005726404310441