Spatio-temporal Video Retrieval by Animated Sketching

Steven Verstockt, Olivier Janssens, Sofie Van Hoecke, Rik Van de Walle


To improve content-based search in digital video, this paper proposes a novel, intuitive querying method based on animated sketching. By sketching two or more frames of the desired scene, users can find the video sequences they are looking for. To find the best match for the user input, the proposed algorithm generates edge histogram descriptors for both the sketch's static background and its moving foreground objects. Based on these spatial descriptors, the set of videos is first queried to find sequences in which similar background and foreground objects appear. This spatial filtering already yields sequences whose scene characteristics resemble those of the sketch. However, further temporal analysis is needed to find the sequences in which the specific action, i.e. the sketched animation, occurs. This is done by matching the motion descriptors of the motion history images of the sketch against those of the video sequences. The best-matching sequences are returned to the user. Experiments on a heterogeneous set of videos demonstrate that the system allows more intuitive video retrieval and yields query results that match the sketches.
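
The temporal step relies on motion history images, in which each pixel records how recently motion occurred at that location. The abstract does not say how the motion masks or the motion descriptors are computed, so the following is only a minimal sketch of the standard temporal-template update rule (moving pixels are set to a duration `tau`, all other pixels decay by one). The frame-differencing mask, the `threshold` value, and all function names are assumptions made for illustration, not the authors' implementation.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def motion_mask(prev: Frame, curr: Frame, threshold: int = 30) -> Frame:
    """Binary mask: 1 where the absolute frame difference exceeds threshold.

    Plain frame differencing is an assumption; any motion segmentation
    that yields a binary mask could be substituted here.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def update_mhi(mhi: Frame, mask: Frame, tau: int) -> Frame:
    """Set moving pixels to tau; decay all other pixels by 1, floored at 0."""
    return [
        [tau if m else max(0, h - 1) for h, m in zip(mhi_row, mask_row)]
        for mhi_row, mask_row in zip(mhi, mask)
    ]


def motion_history_image(frames: List[Frame], tau: int, threshold: int = 30) -> Frame:
    """Accumulate a motion history image over consecutive frame pairs."""
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, motion_mask(prev, curr, threshold), tau)
    return mhi


# A bright pixel moving left to right across a single-row image.
frames = [
    [[255, 0, 0, 0]],
    [[0, 255, 0, 0]],
    [[0, 0, 255, 0]],
    [[0, 0, 0, 255]],
]
print(motion_history_image(frames, tau=3))  # [[1, 2, 3, 3]]
```

In the result, values grow in the direction of motion, so a single image encodes both where and in which order movement happened. The same accumulation could be applied to the sketched frames and to a video segment, after which the two images would be compared through whatever motion descriptor is chosen.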



Paper Citation

in Harvard Style

Verstockt S., Janssens O., Hoecke S. and Walle R. (2013). Spatio-temporal Video Retrieval by Animated Sketching. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013) ISBN 978-989-8565-47-1, pages 723-728. DOI: 10.5220/0004341607230728

in Bibtex Style

author={Steven Verstockt and Olivier Janssens and Sofie Van Hoecke and Rik Van de Walle},
title={Spatio-temporal Video Retrieval by Animated Sketching},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)},

in EndNote Style

JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)
TI - Spatio-temporal Video Retrieval by Animated Sketching
SN - 978-989-8565-47-1
AU - Verstockt S.
AU - Janssens O.
AU - Hoecke S.
AU - Walle R.
PY - 2013
SP - 723
EP - 728
DO - 10.5220/0004341607230728