Towards Reliable Real-time Person Detection

Silviu-Tudor Serban, Srinidhi Mukanahallipatna Simha, Vasanth Bathrinarayanan, Etienne Corvee, Francois Bremond


We propose a robust real-time person detection system that aims to serve as a solid foundation for building highly reliable solutions. We believe that careful handling of input data, combined with effective training algorithms, is key to obtaining top performance. We introduce a comprehensive training method based on random sampling that compiles optimal classifiers with minimal bias and overfitting. Building upon recent advances in multi-scale feature computation, our approach attains state-of-the-art accuracy while running at a high frame rate.



Paper Citation

in Harvard Style

Serban S., Mukanahallipatna Simha S., Bathrinarayanan V., Corvee E. and Bremond F. (2014). Towards Reliable Real-time Person Detection. In Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014) ISBN 978-989-758-004-8, pages 232-239. DOI: 10.5220/0004651302320239

in Bibtex Style

author={Silviu-Tudor Serban and Srinidhi Mukanahallipatna Simha and Vasanth Bathrinarayanan and Etienne Corvee and Francois Bremond},
title={Towards Reliable Real-time Person Detection},
booktitle={Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)},

in EndNote Style

JO - Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)
TI - Towards Reliable Real-time Person Detection
SN - 978-989-758-004-8
AU - Serban S.
AU - Mukanahallipatna Simha S.
AU - Bathrinarayanan V.
AU - Corvee E.
AU - Bremond F.
PY - 2014
SP - 232
EP - 239
DO - 10.5220/0004651302320239