A Generic Probabilistic Graphical Model for Region-based Scene Interpretation

Michael Ying Yang

Abstract

The task of semantic scene interpretation is to label the regions of an image and their relations with meaningful classes. This task is a key ingredient of many computer vision applications, including object recognition, 3D reconstruction and robotic perception. Images of man-made scenes exhibit strong contextual dependencies in the form of spatial and hierarchical structures. Modeling these structures is central to the interpretation task. Graphical models provide a consistent framework for statistical modeling. Bayesian networks and random fields are two popular types of graphical models that are frequently used to capture such contextual information. Our key contribution is a generic statistical graphical model for scene interpretation that seamlessly integrates different types of image features with the spatial and hierarchical structural information defined over a multi-scale image segmentation. It unifies the ideas of existing approaches, e.g. conditional random fields and Bayesian networks, and has a clear statistical interpretation as the MAP estimate of a multi-class labeling problem. We demonstrate experimentally the application of the proposed graphical model to the task of multi-class classification of building facade image regions.
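
To make the MAP reading of a multi-class labeling problem concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes a toy region graph with per-region unary costs, a Potts penalty on spatial neighbour edges, and the same Potts penalty on parent-child edges between segmentation scales (a simplification: the hierarchical part is treated here as an undirected term). The function names, weights and cost values are all hypothetical, and the exhaustive search is only feasible for a handful of regions.

```python
from itertools import product


def energy(labels, unary, spatial_edges, hier_edges, w_spatial=1.0, w_hier=1.0):
    """Energy of one labeling; lower energy means higher posterior probability."""
    # Unary term: cost of assigning each region its label (assumed given).
    e = sum(unary[i][labels[i]] for i in range(len(labels)))
    # Potts penalty: a fixed cost whenever two linked regions disagree.
    e += w_spatial * sum(labels[i] != labels[j] for i, j in spatial_edges)
    e += w_hier * sum(labels[i] != labels[j] for i, j in hier_edges)
    return e


def map_labeling(unary, spatial_edges, hier_edges, n_classes, **weights):
    """Exhaustive search for the minimum-energy (MAP) labeling of a tiny graph."""
    n_regions = len(unary)
    return min(
        product(range(n_classes), repeat=n_regions),
        key=lambda labels: energy(labels, unary, spatial_edges, hier_edges, **weights),
    )


# Hypothetical toy data: regions 0 and 1 are neighbours at the fine scale,
# region 2 is their parent at a coarser scale. Two classes: 0 and 1.
unary = [
    [0.2, 1.5],  # region 0 clearly prefers class 0
    [0.9, 0.8],  # region 1 is ambiguous, slightly prefers class 1
    [0.3, 1.2],  # region 2 prefers class 0
]
spatial_edges = [(0, 1)]
hier_edges = [(0, 2), (1, 2)]

with_context = map_labeling(unary, spatial_edges, hier_edges, n_classes=2)
without_context = map_labeling(
    unary, spatial_edges, hier_edges, n_classes=2, w_spatial=0.0, w_hier=0.0
)
print(with_context, without_context)
```

In this toy case the ambiguous region 1 takes class 1 when every region is labeled independently, and switches to class 0 once the spatial and hierarchical terms are active, which is the kind of contextual effect the abstract describes.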



Paper Citation


in Harvard Style

Yang M. (2015). A Generic Probabilistic Graphical Model for Region-based Scene Interpretation. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015) ISBN 978-989-758-090-1, pages 486-491. DOI: 10.5220/0005341004860491


in Bibtex Style

@conference{visapp15,
author={Michael Ying Yang},
title={A Generic Probabilistic Graphical Model for Region-based Scene Interpretation},
booktitle={Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)},
year={2015},
pages={486-491},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005341004860491},
isbn={978-989-758-090-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)
TI - A Generic Probabilistic Graphical Model for Region-based Scene Interpretation
SN - 978-989-758-090-1
AU - Yang M.
PY - 2015
SP - 486
EP - 491
DO - 10.5220/0005341004860491