IMPROVING PERSON DETECTION IN VIDEOS BY AUTOMATIC SCENE ADAPTATION

Roland Mörzinger, Marcus Thaler

Abstract

The task of object detection in videos can be improved by taking advantage of the continuity in the data stream, e.g. by object tracking. If tracking is not possible due to missing motion features, low frame rate, severe occlusions, or rapid appearance changes, then a detector is typically applied to each frame of the video separately. In this case, run-time performance suffers because each frame is exhaustively searched at numerous locations and multiple scales. However, it is still possible to significantly improve the detector's performance if a static camera and a single planar ground plane can be assumed, as is the case in many surveillance scenarios. Our work addresses this issue by automatically adapting a detector to the specific yet unknown planar scene. In particular, during the adaptation phase, robust statistics over a few detections are used to estimate the appropriate scales of the detection windows at each location. Experiments with an existing person detector based on histograms of oriented gradients show that the scene adaptation improves both computational performance and detection accuracy. For scene-specific person detection, changes were made to the implementation of the existing detector; the code is available for download. Results on benchmark datasets (9 videos from i-LIDS and PETS) demonstrate the applicability of our approach.
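Under a static camera and a single planar ground plane, a person's apparent height is approximately a linear function of the image row on which their feet stand, so a few confident detections suffice to predict the right window scale everywhere. The sketch below illustrates that idea with a simple RANSAC-style robust fit and a per-row scale band; it is a minimal illustration under these assumptions, not the authors' exact estimator, and the function names (`fit_scale_model`, `candidate_scales`) are hypothetical.

```python
import random

def fit_scale_model(detections, iters=200, tol=5.0, seed=0):
    """Robustly fit person height as a linear function of the foot
    y-coordinate: height ~= a * y_foot + b.

    `detections` is a list of (y_foot, height) pairs from an initial
    adaptation phase. A RANSAC-style loop (illustrative robust fit,
    not necessarily the paper's statistic) tolerates false positives.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(iters):
        (y1, h1), (y2, h2) = rng.sample(detections, 2)
        if y1 == y2:
            continue  # degenerate pair, no slope defined
        a = (h2 - h1) / (y2 - y1)
        b = h1 - a * y1
        # Count detections consistent with this candidate line.
        inliers = sum(1 for y, h in detections if abs(a * y + b - h) < tol)
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model

def candidate_scales(model, y_foot, spread=0.1):
    """Return a narrow (min, max) band of window heights to scan at a
    given row, replacing the exhaustive multi-scale search."""
    a, b = model
    h = a * y_foot + b
    return (h * (1 - spread), h * (1 + spread))
```

With such a model, the sliding-window detector only evaluates scales inside the predicted band at each row, which is the source of both the speed-up and the accuracy gain (implausible scales are never scored).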

Paper Citation


in Harvard Style

Mörzinger R. and Thaler M. (2010). IMPROVING PERSON DETECTION IN VIDEOS BY AUTOMATIC SCENE ADAPTATION. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2010) ISBN 978-989-674-029-0, pages 333-338. DOI: 10.5220/0002820203330338


in Bibtex Style

@conference{visapp10,
author={Roland Mörzinger and Marcus Thaler},
title={IMPROVING PERSON DETECTION IN VIDEOS BY AUTOMATIC SCENE ADAPTATION},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2010)},
year={2010},
pages={333-338},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002820203330338},
isbn={978-989-674-029-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2010)
TI - IMPROVING PERSON DETECTION IN VIDEOS BY AUTOMATIC SCENE ADAPTATION
SN - 978-989-674-029-0
AU - Mörzinger R.
AU - Thaler M.
PY - 2010
SP - 333
EP - 338
DO - 10.5220/0002820203330338