Christoph Göring, Björn Fröhlich, Joachim Denzler


This work analyzes how to use the popular GrabCut algorithm for pixel-wise labeling of images, a task also known as semantic segmentation and an important step toward scene understanding in various application domains. Unlike the original, interactive GrabCut, the presented methods aim to segment objects in images fully automatically and to label each of them as one of the previously learned object categories. We introduce and analyze two approaches that extend GrabCut to make use of training images. C-GrabCut generates multiple class-specific segmentations and classifies them using shape and color information. L-GrabCut first runs an object localization algorithm, which returns a classified bounding box as a hypothesis of an object in the image; this hypothesis then initializes the GrabCut algorithm. In our experiments, both methods lead to similar results, and we demonstrate their benefits compared to semantic segmentation methods based only on local features.
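The L-GrabCut idea of initializing GrabCut from a classified bounding box can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `(x, y, w, h)` box layout, and the example detection are assumptions made for illustration, and the GrabCut refinement itself is omitted. The four mask values follow the convention of OpenCV's `cv2.grabCut`, where pixels outside the box are fixed as background and pixels inside are only probably foreground.

```python
# Trimap labels, following the convention used by OpenCV's cv2.grabCut.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3


def init_mask_from_box(height, width, box):
    """Build a GrabCut initialization mask from a bounding box hypothesis.

    box is (x, y, w, h) in pixels (assumed layout). Pixels outside the box
    are fixed as background; pixels inside are marked probable foreground,
    so a GrabCut run may still relabel them.
    """
    x, y, w, h = box
    mask = [[GC_BGD] * width for _ in range(height)]
    for r in range(max(0, y), min(height, y + h)):
        for c in range(max(0, x), min(width, x + w)):
            mask[r][c] = GC_PR_FGD
    return mask


def to_label_map(mask, class_id, background_id=0):
    """Turn a GrabCut mask into a pixel-wise class label map."""
    foreground = (GC_FGD, GC_PR_FGD)
    return [[class_id if v in foreground else background_id for v in row]
            for row in mask]


# Hypothetical detection: class 7 with a 3x2 box at (x=1, y=1) in a 4x5 image.
mask = init_mask_from_box(4, 5, (1, 1, 3, 2))
labels = to_label_map(mask, class_id=7)
```

In a full pipeline, the mask would be refined by GrabCut before `to_label_map` is applied, so that the label of the detected class is assigned only to pixels that the segmentation keeps as foreground.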



Paper Citation

in Harvard Style

Göring C., Fröhlich B. and Denzler J. (2012). SEMANTIC SEGMENTATION USING GRABCUT. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2012) ISBN 978-989-8565-03-7, pages 597-602. DOI: 10.5220/0003829905970602
