M5AIE - A Method for Body Part Detection and Tracking using RGB-D Images

Andre Brandao, Leandro A. F. Fernandes, Esteban Clua

2014

Abstract

The automatic detection and tracking of human body parts in color images is highly sensitive to appearance features such as illumination, skin color, and clothing. As a result, depth images have been shown to be an attractive alternative to color images because of their invariance to lighting conditions. However, body part detection and tracking remains a challenging problem, mainly because the shape and apparent depth of the imaged body change with the viewing perspective. We present a hybrid approach, called M5AIE, that uses both color and depth information to perform body part detection, tracking, and pose classification. We have developed a modified Accumulative Geodesic Extrema (AGEX) approach for detecting body part candidates. We have also used the Affine-SIFT (ASIFT) algorithm for feature extraction, and we have adapted the conventional matching method to perform tracking and labeling of body parts across a sequence of images containing color and depth information. The results produced by our tracking system were used with the C4.5 Gain Ratio decision tree, naïve Bayes, and KNN classification algorithms to identify the user's pose.
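
To illustrate the detection stage, the sketch below shows the core idea behind accumulative geodesic extrema on a single, background-subtracted depth frame: the body pixels are connected into a graph, Dijkstra's algorithm computes geodesic distances from the body centroid, and each successive extremum (typically head, hands, and feet) is the point farthest from all sources found so far, after which it is added as a new zero-cost source. This is a minimal, hypothetical sketch of the AGEX idea that M5AIE builds on, not the authors' implementation; the 4-connectivity, the max_jump depth-continuity threshold, and the assumed ~3 mm lateral pixel step are illustrative assumptions.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra


def geodesic_extrema(depth, num_extrema=5, max_jump=0.05):
    """Return pixel coordinates of accumulated geodesic extrema on a depth frame.

    depth: 2D float array in meters; 0 marks background/invalid pixels.
    max_jump: neighbouring pixels whose depths differ by more than this (meters)
              are not connected, so paths follow the body surface instead of
              jumping across depth discontinuities.
    """
    h, w = depth.shape
    valid = depth > 0
    idx = -np.ones((h, w), dtype=int)            # pixel -> graph node index
    idx[valid] = np.arange(valid.sum())
    n = int(valid.sum())

    # Build a sparse, 4-connected graph weighted by an approximate 3D step length.
    rows_e, cols_e, wgts = [], [], []
    for dr, dc in ((0, 1), (1, 0)):              # right and down neighbours
        both = valid[:h - dr, :w - dc] & valid[dr:, dc:]
        r, c = np.nonzero(both)
        zdiff = np.abs(depth[r, c] - depth[r + dr, c + dc])
        keep = zdiff < max_jump
        r, c, zdiff = r[keep], c[keep], zdiff[keep]
        wgt = np.sqrt(zdiff ** 2 + 0.003 ** 2)   # assumed ~3 mm lateral pixel step
        rows_e += [idx[r, c], idx[r + dr, c + dc]]
        cols_e += [idx[r + dr, c + dc], idx[r, c]]
        wgts += [wgt, wgt]
    graph = csr_matrix((np.concatenate(wgts),
                        (np.concatenate(rows_e), np.concatenate(cols_e))),
                       shape=(n, n))

    # Start from the pixel closest to the body centroid; every extremum found is
    # added as an extra zero-cost source, which makes the extrema "accumulative".
    pr, pc = np.nonzero(valid)                   # node -> pixel lookup (row-major)
    sources = [int(np.argmin((pr - pr.mean()) ** 2 + (pc - pc.mean()) ** 2))]
    extrema = []
    for _ in range(num_extrema):
        dist = dijkstra(graph, directed=False, indices=sources).min(axis=0)
        dist[~np.isfinite(dist)] = -1.0          # ignore disconnected components
        far = int(np.argmax(dist))
        extrema.append((int(pr[far]), int(pc[far])))
        sources.append(far)
    return extrema

In M5AIE the candidates produced this way are subsequently described with ASIFT features, matched across frames for tracking and labeling, and the resulting body-part configurations are fed to the classifiers listed in the abstract.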

References

  1. Baak, A., Müller, M., Bharaj, G., Seidel, H.-P., and Theobalt, C. (2011). A data-driven approach for real-time full body pose reconstruction from a depth camera. In IEEE 13th International Conference on Computer Vision, pages 1092-1099, Barcelona, Spain.
  2. Blum, H. (1967). A transformation for extracting new descriptors of shape. Models for the perception of speech and visual form, 19(5):362-380.
  3. Brandão, A., Brandão, L., Nascimento, G., Moreira, B., Vasconcelos, C. N., and Clua, E. (2010). Jecripe: stimulating cognitive abilities of children with down syndrome in pre-scholar age using a game approach. In Proceedings of the 7th International Conference on Advances in Computer Entertainment Technology, ACE '10, pages 15-18, New York, NY, USA. ACM.
  4. Brandão, A., Fernandes, L. A. F., and Clua, E. (2013). A comparative analysis of classification algorithms applied to M5AIE-extracted human poses. In Proceedings of the XII Brazilian Symposium on Games and Digital Entertainment, SBGAMES '13.
  5. Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification. Information Theory, IEEE Transactions on, 13(1):21-27.
  6. Dijkstra, E. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1(1):269-271.
  7. Domingos, P. and Pazzani, M. (1997). On the optimality of the simple bayesian classifier under zero-one loss. Machine learning, 29(2):103-130.
  8. Endres, F., Hess, J., Engelhard, N., Sturm, J., Cremers, D., and Burgard, W. (2012). An evaluation of the RGB-D SLAM system. In Robotics and Automation (ICRA), 2012 IEEE International Conference on, pages 1691-1696.
  9. Ganapathi, V., Plagemann, C., Thrun, S., and Koller, D. (2010). Real time motion capture using a single time-of-flight camera. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 755-762, San Francisco, CA, USA.
  10. Gonzalez, R. C. and Woods, R. E. (2008). Digital Image Processing. Prentice Hall, 3rd edition.
  11. Greff, K., Brandão, A., Krauß, S., Stricker, D., and Clua, E. (2012). A comparison between background subtraction algorithms using a consumer depth camera. In Proceedings of International Conference on Computer Vision Theory and Applications - VISAPP, volume 1, pages 431-436, Rome, Italy. SciTePress.
  12. Henry, P., Krainin, M., Herbst, E., Ren, X., and Fox, D. (2012). RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments. The International Journal of Robotics Research, 31(5):647-663.
  13. Lai, K., Bo, L., Ren, X., and Fox, D. (2011). Sparse distance learning for object recognition combining RGB and depth information. In IEEE International Conference on Robotics and Automation.
  14. May, S., Droeschel, D., Holz, D., Wiesen, C., and Fuchs, S. (2008). 3d pose estimation and mapping with time-of-flight cameras. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Workshop on 3D Mapping, pages 1-6, Nice, France.
  15. Morel, J.-M. and Yu, G. (2009). ASIFT: A new framework for fully affine invariant image comparison. SIAM J. Img. Sci., 2(2):438-469.
  16. Mota, V., Perez, E., Vieira, M., Maciel, L., Precioso, F., and Gosselin, P. (2012). A tensor based on optical flow for global description of motion in videos. In Graphics, Patterns and Images (SIBGRAPI), 2012 25th SIBGRAPI Conference on, pages 298-301.
  17. Plagemann, C., Ganapathi, V., Koller, D., and Thrun, S. (2010). Real-time identification and localization of body parts from depth images. In Proceedings of the IEEE International Conference on Robotics & Automation (ICRA), pages 3108-3113, Anchorage, Alaska, USA.
  18. Quinlan, J. R. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
  19. Schwarz, L. A., Mkhitaryan, A., Mateus, D., and Navab, N. (2012). Human skeleton tracking from depth data using geodesic distances and optical flow. Image Vision Comput., 30(3):217-226.
  20. Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., and Blake, A. (2011). Real-time human pose recognition in parts from single depth images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1297-1304, Colorado Springs, CO, USA.
  21. Silberman, N. and Fergus, R. (2011). Indoor scene segmentation using a structured light sensor. In Proceedings of the International Conference on Computer Vision - Workshop on 3D Representation and Recognition, pages 601-608.
  22. Stone, E. and Skubic, M. (2011). Evaluation of an inexpensive depth camera for passive in-home fall risk assessment. In 5th International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth), 2011, pages 71-77.
  23. Ye, M., Wang, X., Yang, R., Ren, L., and Pollefeys, M. (2011). Accurate 3d pose estimation from a single depth image. In Proceedings of International Conference on Computer Vision, pages 731-738. IEEE.


Paper Citation


in Harvard Style

Brandao A., Fernandes L. and Clua E. (2014). M5AIE - A Method for Body Part Detection and Tracking using RGB-D Images. In Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP (VISIGRAPP 2014), ISBN 978-989-758-003-1, pages 367-377. DOI: 10.5220/0004738003670377


in Bibtex Style

@conference{visapp14,
author={Andre Brandao and Leandro A. F. Fernandes and Esteban Clua},
title={M5AIE - A Method for Body Part Detection and Tracking using RGB-D Images},
booktitle={Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2014)},
year={2014},
pages={367-377},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004738003670377},
isbn={978-989-758-003-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2014)
TI - M5AIE - A Method for Body Part Detection and Tracking using RGB-D Images
SN - 978-989-758-003-1
AU - Brandao A.
AU - Fernandes L.
AU - Clua E.
PY - 2014
SP - 367
EP - 377
DO - 10.5220/0004738003670377