MOTION DESCRIPTORS FOR SEMANTIC VIDEO INDEXING

Markos Zampoglou; Theophilos Papadimitriou; Konstantinos I. Diamantaras

doi:10.5220/0002942601780184

2010

Abstract

Content-based video indexing is a field of rising interest that has achieved significant progress in recent years. In retrospect, however, while many powerful spatial descriptors have been developed, the potential of motion information for extracting semantic meaning has been left largely untapped. As part of our effort to automatically classify the archives of a local TV station, we developed a number of motion descriptors aimed at providing meaningful distinctions between different semantic classes. In this paper, we present two descriptors we have used in our past work, combined with a novel motion descriptor inspired by the highly successful Bag-of-Features methods. We demonstrate the ability of such descriptors to boost classifier performance compared to the exclusive use of spatial features, and discuss the potential for forming even more efficient descriptors for video motion patterns.

Paper Citation

in Harvard Style

Zampoglou M., Papadimitriou T. and Diamantaras K. I. (2010). MOTION DESCRIPTORS FOR SEMANTIC VIDEO INDEXING. In Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010) ISBN 978-989-8425-19-5, pages 178-184. DOI: 10.5220/0002942601780184

in Bibtex Style

@conference{sigmap10,
author={Markos Zampoglou and Theophilos Papadimitriou and Konstantinos I. Diamantaras},
title={MOTION DESCRIPTORS FOR SEMANTIC VIDEO INDEXING},
booktitle={Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010)},
year={2010},
pages={178-184},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002942601780184},
isbn={978-989-8425-19-5},
}

in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010)
TI - MOTION DESCRIPTORS FOR SEMANTIC VIDEO INDEXING
SN - 978-989-8425-19-5
AU - Zampoglou M.
AU - Papadimitriou T.
AU - Diamantaras K. I.
PY - 2010
SP - 178
EP - 184
DO - 10.5220/0002942601780184