3D Human Poses Estimation from a Single 2D Silhouette

Fabrice Dieudonné Atrevi; Damien Vivet; Florent Duculty; Bruno Emile

doi:10.5220/0005711503610369

2016

Abstract

This work addresses the problem of automatically extracting 3D human poses from a single 2D image. By pose we mean the configuration of the human bones, from which a 3D skeleton representing the posture of the detected person can be reconstructed. This problem is highly non-linear in nature and confounds standard regression techniques. Our approach uses correspondences, learned beforehand, between silhouettes and skeletons extracted from 3D human models. To match detected silhouettes with simulated silhouettes, we use Krawtchouk geometric moments as a shape descriptor. We provide quantitative results for image retrieval across different actions and subjects, captured from differing viewpoints. We show that our approach gives promising results for 3D pose extraction from a single silhouette.
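
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it computes plain weighted Krawtchouk moments of a binary silhouette from the standard hypergeometric definition and retrieves the closest simulated silhouette by Euclidean distance. The function names, the moment order, the parameter p = 0.5, the distance measure, and the layout of the silhouette/skeleton database are all assumptions made for illustration. The sketch also assumes every silhouette has already been cropped and resized to a common grid, since these raw moments are not invariant to translation, scale, or rotation.

```python
import math
import numpy as np

def weighted_krawtchouk(max_order, size, p=0.5):
    """Rows n = 0..max_order of weighted Krawtchouk polynomials on x = 0..size-1.

    Requires max_order <= size - 1. Hypothetical helper, for illustration only.
    """
    N = size - 1
    x = np.arange(size, dtype=float)
    # Binomial weight w(x; p, N).
    w = np.array([math.comb(N, k) * p**k * (1 - p)**(N - k) for k in range(size)])
    basis = np.empty((max_order + 1, size))
    for n in range(max_order + 1):
        # K_n(x) = 2F1(-n, -x; -N; 1/p), summed term by term.
        term = np.ones(size)
        poly = np.ones(size)
        for k in range(n):
            term = term * (k - n) * (k - x) / ((k - N) * (k + 1) * p)
            poly += term
        # Squared norm rho(n; p, N) of K_n under the weight w.
        rho = ((1 - p) / p) ** n / math.comb(N, n)
        basis[n] = poly * np.sqrt(w / rho)
    return basis

def krawtchouk_descriptor(silhouette, max_order=4, p=0.5):
    """Flattened (max_order + 1)^2 vector of Krawtchouk moments of a 2D mask."""
    img = np.asarray(silhouette, dtype=float)
    k_rows = weighted_krawtchouk(max_order, img.shape[0], p)
    k_cols = weighted_krawtchouk(max_order, img.shape[1], p)
    return (k_rows @ img @ k_cols.T).ravel()

def retrieve_pose(query, db_descriptors, db_skeletons, max_order=4):
    """Return the skeleton paired with the nearest database descriptor.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    q = krawtchouk_descriptor(query, max_order)
    dists = np.linalg.norm(db_descriptors - q, axis=1)
    best = int(np.argmin(dists))
    return db_skeletons[best], float(dists[best])
```

In use, `db_descriptors` would be built once by stacking `krawtchouk_descriptor` over the simulated silhouettes rendered from the 3D models, with `db_skeletons` holding the skeleton associated with each one; a detected silhouette is then passed to `retrieve_pose` as the query.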

Paper Citation

in Harvard Style

Atrevi F., Vivet D., Duculty F. and Emile B. (2016). 3D Human Poses Estimation from a Single 2D Silhouette. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016) ISBN 978-989-758-175-5, pages 361-369. DOI: 10.5220/0005711503610369

in Bibtex Style

@conference{visapp16,
author={Fabrice Dieudonné Atrevi and Damien Vivet and Florent Duculty and Bruno Emile},
title={3D Human Poses Estimation from a Single 2D Silhouette},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={361-369},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005711503610369},
isbn={978-989-758-175-5},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)
TI - 3D Human Poses Estimation from a Single 2D Silhouette
SN - 978-989-758-175-5
AU - Atrevi F.
AU - Vivet D.
AU - Duculty F.
AU - Emile B.
PY - 2016
SP - 361
EP - 369
DO - 10.5220/0005711503610369