People Detection in Fish-eye Top-views
Meltem Demirkus, Ling Wang, Michael Eschey, Herbert Kaestle, Fabio Galasso
2017
Abstract
Is the detection of people in top views any easier than from the much-researched canonical fronto-parallel views (e.g. the Caltech and INRIA pedestrian datasets)? We show that in both cases variability in people's appearance and false positives in the background limit performance. Additionally, we demonstrate that fish-eye lenses further complicate top-view people detection, since the person viewpoint ranges from nearly frontal at the image periphery to a perfect top view at the image center, where only the top profiles of the head and shoulders are visible. We contribute a new top-view fish-eye benchmark, experiment with a state-of-the-art person detector (ACF), and evaluate approaches that trade reduced appearance variability (a grid of classifiers) against the amount of data available for training. Our results indicate the importance of data abundance over model complexity and stress the importance of an exact geometric understanding of the problem, which we also contribute here.
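The dependence of viewpoint on image position can be illustrated with a minimal sketch. It assumes an equidistant fish-eye projection (r = f * theta) and a uniform rectangular grid of classifiers; neither assumption is stated in the abstract, and the function names and parameters (`viewing_angle_deg`, `grid_cell`, `focal_px`) are hypothetical, chosen only for illustration.

```python
import math


def viewing_angle_deg(u, v, cx, cy, focal_px):
    """Angle (degrees) between the optical axis and the ray through pixel (u, v).

    Assumes the equidistant fish-eye model r = f * theta, where r is the
    pixel distance from the principal point (cx, cy). This model is an
    assumption for illustration, not the paper's calibrated model.
    """
    r = math.hypot(u - cx, v - cy)
    return math.degrees(r / focal_px)


def grid_cell(u, v, width, height, rows, cols):
    """Index (row, col) of the grid cell containing pixel (u, v).

    In a grid-of-classifiers setup, each cell would own a separate
    classifier trained only on samples that fall inside it.
    """
    row = min(int(v * rows / height), rows - 1)
    col = min(int(u * cols / width), cols - 1)
    return row, col


# Hypothetical 1000x1000 image whose border at radius 500 px
# corresponds to rays 90 degrees off the optical axis.
WIDTH = HEIGHT = 1000
CX = CY = 500.0
FOCAL_PX = 500.0 / (math.pi / 2)

center_angle = viewing_angle_deg(500, 500, CX, CY, FOCAL_PX)
edge_angle = viewing_angle_deg(1000, 500, CX, CY, FOCAL_PX)
center_cell = grid_cell(500, 500, WIDTH, HEIGHT, rows=3, cols=3)
corner_cell = grid_cell(999, 0, WIDTH, HEIGHT, rows=3, cols=3)
```

Under these assumptions, a person directly below a ceiling-mounted camera is seen at an angle near zero (a pure top view), while a person near the image border is seen at a large angle (a nearly frontal view). A finer grid reduces the appearance variability each classifier must handle but leaves fewer training samples per cell, which is the trade-off the abstract describes.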
Paper Citation
in Harvard Style
Demirkus M., Wang L., Eschey M., Kaestle H. and Galasso F. (2017). People Detection in Fish-eye Top-views. In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, (VISIGRAPP 2017) ISBN 978-989-758-226-4, pages 141-148. DOI: 10.5220/0006094701410148
in Bibtex Style
@conference{visapp17,
author={Meltem Demirkus and Ling Wang and Michael Eschey and Herbert Kaestle and Fabio Galasso},
title={People Detection in Fish-eye Top-views},
booktitle={Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, (VISIGRAPP 2017)},
year={2017},
pages={141-148},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006094701410148},
isbn={978-989-758-226-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, (VISIGRAPP 2017)
TI - People Detection in Fish-eye Top-views
SN - 978-989-758-226-4
AU - Demirkus M.
AU - Wang L.
AU - Eschey M.
AU - Kaestle H.
AU - Galasso F.
PY - 2017
SP - 141
EP - 148
DO - 10.5220/0006094701410148