Dynamic Scene Recognition based on Improved Visual Vocabulary Model

Lin Yan-Hao; Lu-Fang GAO

doi:10.5220/0004736105570565

2014

Abstract

In this paper, we present a scene recognition framework that processes images and recognizes the scenes they depict. We demonstrate and evaluate the performance of our system on a dataset of typical Oxford landmarks. We put forward a novel local k-meriod method for building a vocabulary and introduce a novel soft-assignment quantization method based on the Gaussian mixture model. We also introduce a Gaussian model that classifies images into scenes by calculating the probability that an image belongs to each scene, and we further improve the model by extracting the consistent features and filtering out the noisy features. Our experiments show that these methods improve classification performance.
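
As a rough illustration of the soft-assignment idea mentioned in the abstract, the sketch below weights a local descriptor over several visual words instead of assigning it to a single nearest word. The abstract does not give the paper's formulation, so everything here is an assumption: each visual word is treated as an isotropic Gaussian component with a shared standard deviation `sigma`, and the names `soft_assign` and `soft_histogram` are hypothetical.

```python
import math


def soft_assign(descriptor, words, sigma=1.0, priors=None):
    """Return posterior weights of one descriptor over all visual words.

    Assumption for this sketch: every visual word is an isotropic Gaussian
    component with the same standard deviation `sigma`. `priors` are the
    mixture weights and must be strictly positive; they default to uniform.
    """
    if priors is None:
        priors = [1.0 / len(words)] * len(words)
    # Squared Euclidean distance from the descriptor to each word centre.
    sq_dists = [sum((x - c) ** 2 for x, c in zip(descriptor, w)) for w in words]
    # Log of prior times Gaussian likelihood, dropping the constant that is
    # identical for all components because they share `sigma`.
    log_scores = [math.log(p) - d / (2.0 * sigma ** 2)
                  for p, d in zip(priors, sq_dists)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    scores = [math.exp(s - top) for s in log_scores]
    total = sum(scores)
    return [s / total for s in scores]


def soft_histogram(descriptors, words, sigma=1.0):
    """Accumulate soft-assignment weights into one histogram per image."""
    hist = [0.0] * len(words)
    for d in descriptors:
        for k, weight in enumerate(soft_assign(d, words, sigma)):
            hist[k] += weight
    return hist
```

With two words at `[0.0, 0.0]` and `[2.0, 0.0]`, a descriptor at `[1.0, 0.0]` is split evenly between them, whereas hard assignment would have to pick one arbitrarily. A smaller `sigma` moves the weights toward hard assignment.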

Paper Citation

in Harvard Style

Yan-Hao L. and GAO L. (2014). Dynamic Scene Recognition based on Improved Visual Vocabulary Model. In Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014) ISBN 978-989-758-004-8, pages 557-565. DOI: 10.5220/0004736105570565

in Bibtex Style

@conference{visapp14,
author={Lin Yan-Hao and Lu-Fang GAO},
title={Dynamic Scene Recognition based on Improved Visual Vocabulary Model},
booktitle={Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)},
year={2014},
pages={557-565},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004736105570565},
isbn={978-989-758-004-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)
TI - Dynamic Scene Recognition based on Improved Visual Vocabulary Model
SN - 978-989-758-004-8
AU - Yan-Hao L.
AU - GAO L.
PY - 2014
SP - 557
EP - 565
DO - 10.5220/0004736105570565