LEARNING A VISUAL ATTENTION MODEL FOR ADAPTIVE FAST-FORWARD IN VIDEO SURVEILLANCE

Benjamin Höferlin, Hermann Pflüger, Markus Höferlin, Gunther Heidemann, Daniel Weiskopf

2012

Abstract

The focus of visual attention is guided by salient signals in the peripheral field of view (bottom-up) as well as by the relevance feedback of a semantic model (top-down). As a result, humans are able to evaluate new situations very quickly, with only a few fixations. In this paper, we present a learned model for the fast prediction of visual attention in video. We consider bottom-up and memory-less top-down mechanisms of visual attention guidance, and apply the model to video playback-speed adaptation. The presented visual attention model is based on rectangle features that are fast to compute and capable of describing the known mechanisms of bottom-up processing, such as motion, contrast, color, symmetry, and others, as well as top-down cues, such as face and person detectors. We show that the visual attention model outperforms other recent methods in the adaptation of video playback speed.
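
To make the mechanism concrete, the following is a minimal sketch in Python/NumPy, not the authors' implementation: it evaluates a single two-rectangle feature on a temporal-difference (motion) map via an integral image, which is what makes rectangle features cheap to compute, and maps the resulting attention score to a playback speed. The function names, the feature position, and the exponential score-to-speed mapping are illustrative assumptions.

```python
import numpy as np

def integral_image(img):
    """Summed-area table: any rectangle sum costs O(1) afterwards."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum of pixel values inside the rectangle [x, x+w) x [y, y+h)."""
    total = ii[y + h - 1, x + w - 1]
    if x > 0:
        total -= ii[y + h - 1, x - 1]
    if y > 0:
        total -= ii[y - 1, x + w - 1]
    if x > 0 and y > 0:
        total += ii[y - 1, x - 1]
    return total

def two_rect_feature(ii, x, y, w, h):
    """Difference of two adjacent rectangles (a contrast-style cue)."""
    left = rect_sum(ii, x, y, w, h)
    right = rect_sum(ii, x + w, y, w, h)
    return left - right

def playback_speed(score, v_min=1.0, v_max=16.0, k=5.0):
    """Illustrative mapping (assumption): high attention -> slow playback."""
    return v_min + (v_max - v_min) * np.exp(-k * score)

# Motion cue from two consecutive grayscale frames (hypothetical data).
prev = np.random.rand(120, 160)
curr = np.random.rand(120, 160)
motion = np.abs(curr - prev)

ii = integral_image(motion)
score = two_rect_feature(ii, 40, 30, 16, 32) / (16 * 32)  # area-normalized response
print(playback_speed(max(score, 0.0)))
```

In the paper, a learned model presumably combines many such rectangle features across several cue channels (motion, contrast, color, symmetry, face/person detections); the mapping above merely illustrates the idea that low predicted attention permits faster playback.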



Paper Citation


in Harvard Style

Höferlin B., Pflüger H., Höferlin M., Heidemann G. and Weiskopf D. (2012). LEARNING A VISUAL ATTENTION MODEL FOR ADAPTIVE FAST-FORWARD IN VIDEO SURVEILLANCE. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM, ISBN 978-989-8425-99-7, pages 25-32. DOI: 10.5220/0003720000250032


in BibTeX Style

@conference{icpram12,
author={Benjamin Höferlin and Hermann Pflüger and Markus Höferlin and Gunther Heidemann and Daniel Weiskopf},
title={LEARNING A VISUAL ATTENTION MODEL FOR ADAPTIVE FAST-FORWARD IN VIDEO SURVEILLANCE},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},
year={2012},
pages={25-32},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003720000250032},
isbn={978-989-8425-99-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM
TI - LEARNING A VISUAL ATTENTION MODEL FOR ADAPTIVE FAST-FORWARD IN VIDEO SURVEILLANCE
SN - 978-989-8425-99-7
AU - Höferlin B.
AU - Pflüger H.
AU - Höferlin M.
AU - Heidemann G.
AU - Weiskopf D.
PY - 2012
SP - 25
EP - 32
DO - 10.5220/0003720000250032
ER -