Enhanced Hierarchical Conditional Random Field Model for Semantic Image Segmentation
Li-Li Wang, Shan-Shan Zhu, N. H. C. Yung
2014
Abstract
Pairwise and higher order potentials in the Hierarchical Conditional Random Field (HCRF) model play a vital role in smoothing region boundaries and extracting actual object contours in the labeling space. However, a pairwise potential evaluated from color information tends to over-smooth small regions that are similar to their neighbors in the color space, and a higher order potential associated with multiple segments is prone to give incorrect guidance to inference, especially for objects whose features resemble the background. To overcome these problems, this paper proposes two enhanced potentials for the HCRF model: one abates over-smoothing by propagating the labeling believed to be correct from the unary potential, and the other performs coherent inference by ensuring reliable segment consistency. Experimental results on the MSRC-21 data set show that the enhanced HCRF model produces visually pleasing results and achieves a global accuracy of 87.52% and an average accuracy of 80.18%, a significant improvement that outperforms other algorithms reported in the literature to date.
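The abstract reports two figures, global accuracy and average accuracy, without defining them. The sketch below assumes the convention commonly used for MSRC-21: global accuracy is the fraction of all labeled pixels classified correctly, and average accuracy is the mean of the per-class accuracies (per-class recall). The function name and the toy confusion matrix are hypothetical and are not taken from the paper.

```python
def segmentation_accuracies(confusion):
    """Return (global_accuracy, average_accuracy) from a confusion matrix.

    confusion[i][j] counts pixels whose ground-truth class is i and whose
    predicted class is j. Assumed definitions (not stated in the abstract):
      global  = correctly labeled pixels / all labeled pixels
      average = mean over classes of (correct pixels / pixels of that class)
    """
    correct = [confusion[k][k] for k in range(len(confusion))]
    row_totals = [sum(row) for row in confusion]
    total = sum(row_totals)

    global_acc = sum(correct) / total

    # Skip classes that never occur in the ground truth to avoid dividing by zero.
    recalls = [c / t for c, t in zip(correct, row_totals) if t > 0]
    average_acc = sum(recalls) / len(recalls)
    return global_acc, average_acc


# Toy 3-class example with made-up counts (not results from the paper).
toy = [
    [90, 5, 5],    # large class, mostly correct
    [10, 30, 10],  # medium class
    [0, 5, 5],     # small class, half correct
]
global_acc, average_acc = segmentation_accuracies(toy)
print(f"global = {global_acc:.4f}, average = {average_acc:.4f}")
```

Under these definitions, a small, poorly labeled class barely moves the global figure but pulls the average down noticeably. Over-smoothing of small regions therefore tends to hurt average accuracy more than global accuracy.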
Paper Citation
in Harvard Style
Wang L., Zhu S. and Yung N. (2014). Enhanced Hierarchical Conditional Random Field Model for Semantic Image Segmentation. In Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014) ISBN 978-989-758-004-8, pages 215-222. DOI: 10.5220/0004649202150222
in Bibtex Style
@conference{visapp14,
author={Li-Li Wang and Shan-Shan Zhu and N. H. C. Yung},
title={Enhanced Hierarchical Conditional Random Field Model for Semantic Image Segmentation},
booktitle={Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)},
year={2014},
pages={215-222},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004649202150222},
isbn={978-989-758-004-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)
TI - Enhanced Hierarchical Conditional Random Field Model for Semantic Image Segmentation
SN - 978-989-758-004-8
AU - Wang L.
AU - Zhu S.
AU - Yung N.
PY - 2014
SP - 215
EP - 222
DO - 10.5220/0004649202150222