BRINGING ORDER IN THE BAG OF WORDS

Shihong Zhang, Rahat Khan, Damien Muselet, Alain Trémeau

Abstract

This paper presents a method to infuse spatial information into the bag of words (BOW) framework for object categorization. The main idea is to account for the local spatial distribution of the visual words. Rather than finding rigid local patterns, we treat the visual words in close spatial proximity as a pouch of words, and we represent the image as a bag of word-pouches. To this end, sub-windows are extracted from the images and each is characterized by a local bag of words. A clustering step is then applied in the local bag-of-words space to construct the word-pouches. We show that this representation is complementary to the classical BOW, so the concatenation of the two representations is used as the final descriptor. Experiments are conducted on two well-known image datasets.
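
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says "a clustering step", so the plain k-means used here is an assumption, as are the window size, stride, L1 normalisation, and every function name (`local_bows`, `kmeans`, `describe`). The sketch also assumes each image has already been reduced to a list of `(x, y, word_id)` tuples by a standard visual-vocabulary step.

```python
import random

def histogram(ids, size):
    """L1-normalised histogram of integer ids in range(size)."""
    h = [0.0] * size
    for i in ids:
        h[i] += 1.0
    total = sum(h)
    return [v / total for v in h] if total else h

def local_bows(points, vocab_size, width, height, win=32, stride=16):
    """One local BOW per non-empty sub-window; points are (x, y, word_id).
    Window size and stride are illustrative values, not taken from the paper."""
    out = []
    for y0 in range(0, max(height - win, 0) + 1, stride):
        for x0 in range(0, max(width - win, 0) + 1, stride):
            ids = [w for (x, y, w) in points
                   if x0 <= x < x0 + win and y0 <= y < y0 + win]
            if ids:
                out.append(histogram(ids, vocab_size))
    return out

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(vec, centers):
    """Index of the center closest to vec."""
    return min(range(len(centers)), key=lambda c: sq_dist(vec, centers[c]))

def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering).
    Returns k centers, which play the role of the word-pouch vocabulary."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centers)].append(v)
        for c, members in enumerate(groups):
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers

def describe(points, vocab_size, centers, width, height, win=32, stride=16):
    """Concatenate the classical (global) BOW with the bag of word-pouches."""
    global_bow = histogram([w for (_, _, w) in points], vocab_size)
    windows = local_bows(points, vocab_size, width, height, win, stride)
    pouch_bow = histogram([nearest(v, centers) for v in windows], len(centers))
    return global_bow + pouch_bow

# Toy usage on synthetic data: 3 images of 64x64 pixels, 5 visual words.
rng = random.Random(1)
images = [[(rng.randrange(64), rng.randrange(64), rng.randrange(5))
           for _ in range(40)] for _ in range(3)]
training = [v for img in images for v in local_bows(img, 5, 64, 64)]
centers = kmeans(training, k=3)
desc = describe(images[0], 5, centers, 64, 64)  # length 5 + 3
```

The final descriptor has one block per representation (vocabulary size plus number of word-pouches), which mirrors the concatenation the abstract describes; in practice the two blocks might be weighted before classification, which this sketch does not attempt.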



Paper Citation


in Harvard Style

Zhang S., Khan R., Muselet D. and Trémeau A. (2012). BRINGING ORDER IN THE BAG OF WORDS. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2012), ISBN 978-989-8565-03-7, pages 723-726. DOI: 10.5220/0003859307230726


in Bibtex Style

@conference{visapp12,
author={Shihong Zhang and Rahat Khan and Damien Muselet and Alain Trémeau},
title={BRINGING ORDER IN THE BAG OF WORDS},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2012)},
year={2012},
pages={723-726},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003859307230726},
isbn={978-989-8565-03-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2012)
TI - BRINGING ORDER IN THE BAG OF WORDS
SN - 978-989-8565-03-7
AU - Zhang S.
AU - Khan R.
AU - Muselet D.
AU - Trémeau A.
PY - 2012
SP - 723
EP - 726
DO - 10.5220/0003859307230726