Semi-automatic Hand Detection - A Case Study on Real Life Mobile Eye-tracker Data

Stijn De Beugher, Geert Brône, Toon Goedemé

2015

Abstract

In this paper we present a highly accurate algorithm for the detection of human hands in real-life 2D image sequences. Current state-of-the-art algorithms show relatively poor detection accuracy on unconstrained, challenging images. To overcome this, we introduce a detection scheme that combines several well-known detection techniques with an advanced elimination mechanism to reduce false detections. Furthermore, we present a novel (semi-)automatic framework achieving detection rates up to 100% with only minimal manual input. This is a useful tool in supervised applications where an error-free detection result is required at the cost of a limited amount of manual effort. As an application, this paper focuses on the analysis of video data of human-human interaction, collected with the scene camera of mobile eye-tracking glasses. This type of data is typically annotated manually for relevant features (e.g. visual fixations on gestures), which is a time-consuming, tedious and error-prone task. Our semi-automatic approach dramatically reduces the amount of manual analysis. We also present a new, fully annotated benchmark dataset for this application, which we have made publicly available.



Paper Citation


in Harvard Style

De Beugher S., Brône G. and Goedemé T. (2015). Semi-automatic Hand Detection - A Case Study on Real Life Mobile Eye-tracker Data. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015) ISBN 978-989-758-090-1, pages 121-129. DOI: 10.5220/0005306601210129


in Bibtex Style

@conference{visapp15,
author={Stijn De Beugher and Geert Brône and Toon Goedemé},
title={Semi-automatic Hand Detection - A Case Study on Real Life Mobile Eye-tracker Data},
booktitle={Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)},
year={2015},
pages={121-129},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005306601210129},
isbn={978-989-758-090-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)
TI - Semi-automatic Hand Detection - A Case Study on Real Life Mobile Eye-tracker Data
SN - 978-989-758-090-1
AU - De Beugher S.
AU - Brône G.
AU - Goedemé T.
PY - 2015
SP - 121
EP - 129
DO - 10.5220/0005306601210129