Combining Holistic Descriptors for Scene Classification

Kelly Assis de Souza Gazolli, Evandro Ottoni Teatini Salles

Abstract

Scene classification is an important problem in computer vision. To address it, this paper explores a combination of holistic descriptors for the scene categorization task. We first describe the Contextual Mean Census Transform (CMCT), an image descriptor that combines the distribution of local structures with contextual information. CMCT is a holistic descriptor based on CENTRIST and, like CENTRIST, encodes the structural properties within an image while suppressing detailed textural information. Second, we present GistCMCT, a combination of the CMCT descriptor with Gist that yields a new holistic descriptor intended to represent scenes more accurately. Experimental results on four datasets show that the proposed methods achieve performance competitive with previous methods.
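
The abstract does not spell out how the descriptors are computed, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes a census-style transform in which each of the 8 neighbours in a 3x3 window is compared with the window mean, a histogram of the resulting 8-bit codes as the "local structure" part, and, purely as an assumption, a second pass of the same transform over the code image as the "contextual" part. The function names (`mean_census_transform`, `cmct_like_descriptor`, `combine`) are hypothetical, and the Gist vector is treated as an externally supplied list of numbers.

```python
from typing import List

Image = List[List[int]]


def mean_census_transform(img: Image) -> Image:
    """Census-style codes: compare each 3x3 neighbour with the window mean.

    Assumptions (not taken from the paper): a bit is 1 when the neighbour
    is >= the window mean, bits are packed row by row, and border pixels
    are skipped, so the output is 2 rows and 2 columns smaller.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    if h < 3 or w < 3:
        return []
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            total = sum(window)
            code = 0
            for i, v in enumerate(window):
                if i == 4:  # skip the centre pixel, leaving 8 bits
                    continue
                # v >= total / 9, written without division
                code = (code << 1) | (1 if v * 9 >= total else 0)
            row.append(code)
        out.append(row)
    return out


def histogram(codes: Image, bins: int = 256) -> List[float]:
    """Normalised histogram of the 8-bit codes (all zeros if no codes)."""
    counts = [0] * bins
    n = 0
    for row in codes:
        for c in row:
            counts[c] += 1
            n += 1
    return [c / n for c in counts] if n else [0.0] * bins


def cmct_like_descriptor(img: Image) -> List[float]:
    """Local histogram followed by a 'contextual' histogram.

    Assumption: context is approximated by transforming the code image
    a second time; the paper may define it differently.
    """
    local_codes = mean_census_transform(img)
    context_codes = mean_census_transform(local_codes)
    return histogram(local_codes) + histogram(context_codes)


def combine(*descriptors: List[float]) -> List[float]:
    """Concatenate descriptors, e.g. a CMCT-like vector and a Gist vector."""
    out: List[float] = []
    for d in descriptors:
        out.extend(d)
    return out
```

Under these assumptions, `combine(cmct_like_descriptor(img), gist_vector)` would give a single feature vector that can be passed to a classifier such as an SVM; any weighting or normalisation between the two parts is left out of this sketch.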


Paper Citation


in Harvard Style

Gazolli K. and Salles E. (2013). Combining Holistic Descriptors for Scene Classification. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013) ISBN 978-989-8565-47-1, pages 315-320. DOI: 10.5220/0004286103150320


in Bibtex Style

@conference{visapp13,
author={Kelly Assis de Souza Gazolli and Evandro Ottoni Teatini Salles},
title={Combining Holistic Descriptors for Scene Classification},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)},
year={2013},
pages={315-320},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004286103150320},
isbn={978-989-8565-47-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)
TI - Combining Holistic Descriptors for Scene Classification
SN - 978-989-8565-47-1
AU - Gazolli K.
AU - Salles E.
PY - 2013
SP - 315
EP - 320
DO - 10.5220/0004286103150320