Semi-automatic Hand Detection - A Case Study on Real Life Mobile Eye-tracker Data
Stijn De Beugher, Geert Brône, Toon Goedemé
2015
Abstract
In this paper we present a highly accurate algorithm for the detection of human hands in real-life 2D image sequences. Current state-of-the-art algorithms show relatively poor detection accuracy on unconstrained, challenging images. To overcome this, we introduce a detection scheme that combines several well-known detection techniques with an advanced elimination mechanism to reduce false detections. Furthermore, we present a novel (semi-)automatic framework that achieves detection rates of up to 100% with only minimal manual input. This is a useful tool in supervised applications where an error-free detection result is required at the cost of a limited amount of manual effort. As an application, this paper focuses on the analysis of video data of human-human interaction, collected with the scene camera of mobile eye-tracking glasses. This type of data is typically annotated manually for relevant features (e.g. visual fixations on gestures), which is a time-consuming, tedious and error-prone task. Our semi-automatic approach dramatically reduces the amount of manual analysis. We also present a new, fully annotated benchmark dataset for this application, which we have made publicly available.
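The abstract does not spell out the detection scheme, so the following is only an illustrative sketch of the general idea: pool candidate hand boxes from several detectors, suppress overlapping candidates, and flag frames whose result is not trusted so that a human can annotate them. Every name here (`Detection`, `merge_candidates`, `frames_needing_review`) and every threshold is a hypothetical choice for this example, and greedy non-maximum suppression is used as a simple stand-in for the paper's elimination mechanism, not as a description of it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """Axis-aligned candidate box with a confidence score (hypothetical type)."""
    x: float
    y: float
    w: float
    h: float
    score: float  # assumed to lie in [0, 1]


def iou(a: Detection, b: Detection) -> float:
    """Intersection-over-union of two boxes."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def merge_candidates(candidates: List[Detection],
                     iou_threshold: float = 0.5) -> List[Detection]:
    """Greedy non-maximum suppression over candidates pooled from all detectors."""
    kept: List[Detection] = []
    for det in sorted(candidates, key=lambda d: d.score, reverse=True):
        if all(iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return kept


def frames_needing_review(per_frame: List[List[Detection]],
                          min_score: float = 0.6,
                          max_hands: int = 2) -> List[int]:
    """Return indices of frames to hand over for manual annotation.

    A frame is flagged when no candidate survives or when any of the
    top-scoring surviving candidates falls below `min_score` (assumed rule).
    """
    flagged: List[int] = []
    for index, candidates in enumerate(per_frame):
        hands = merge_candidates(candidates)[:max_hands]
        if not hands or any(h.score < min_score for h in hands):
            flagged.append(index)
    return flagged


frames = [
    # Two detectors agree on roughly the same box: merged, high confidence.
    [Detection(10, 10, 20, 20, 0.9), Detection(12, 11, 20, 20, 0.7)],
    # Only a weak candidate: flagged for manual input.
    [Detection(0, 0, 10, 10, 0.3)],
    # No candidate at all: flagged for manual input.
    [],
]
review = frames_needing_review(frames)
```

In this toy setup, only the flagged frames would require manual effort, which mirrors the trade-off the abstract describes between an error-free result and a limited amount of manual input.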
Paper Citation
in Harvard Style
De Beugher S., Brône G. and Goedemé T. (2015). Semi-automatic Hand Detection - A Case Study on Real Life Mobile Eye-tracker Data. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015) ISBN 978-989-758-090-1, pages 121-129. DOI: 10.5220/0005306601210129
in Bibtex Style
@conference{visapp15,
author={Stijn De Beugher and Geert Brône and Toon Goedemé},
title={Semi-automatic Hand Detection - A Case Study on Real Life Mobile Eye-tracker Data},
booktitle={Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)},
year={2015},
pages={121-129},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005306601210129},
isbn={978-989-758-090-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)
TI - Semi-automatic Hand Detection - A Case Study on Real Life Mobile Eye-tracker Data
SN - 978-989-758-090-1
AU - De Beugher S.
AU - Brône G.
AU - Goedemé T.
PY - 2015
SP - 121
EP - 129
DO - 10.5220/0005306601210129