Bag-of-Features based Activity Classification using Body-joints Data

Parul Shukla, K.K. Biswas, Prem K. Kalra

Abstract

In this paper, we propose a Bag-of-Joint-Features model for classifying human actions from body-joint data acquired with depth sensors such as the Microsoft Kinect. Our method uses novel scale- and translation-invariant features in a spherical coordinate system, extracted from the joints. These features also capture subtle movements of joints relative to the depth axis. The proposed Bag-of-Joint-Features model applies the well-known bag-of-words model to joint features for the representation of an action sample. We also propose augmenting the Bag-of-Joint-Features model with a hierarchical temporal histogram model to take into account the temporal information of the body-joint sequence. Experimental study shows that this augmentation improves the classification accuracy. We evaluate our approach on the MSR-Action3D and Cornell activity datasets using a support vector machine.
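The pipeline the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the choice of root joint, the mean-distance normalization used for scale invariance, and the nearest-centroid quantization against a precomputed codebook (e.g. from k-means over training features) are all assumptions made here for concreteness.

```python
import numpy as np

def spherical_joint_features(joints, root_idx=0):
    """Convert one frame of 3D joint positions to translation- and
    scale-invariant spherical coordinates relative to a root joint.

    joints: (J, 3) array of (x, y, z) positions.
    Returns an (J-1, 3) array of (r, theta, phi) per non-root joint.
    """
    # Subtracting the root joint gives translation invariance.
    rel = np.delete(joints - joints[root_idx], root_idx, axis=0)
    r = np.linalg.norm(rel, axis=1)
    # Normalizing radii by their mean gives scale invariance (an assumption).
    r_norm = r / (r.mean() + 1e-8)
    # The polar angle theta is measured from the depth (z) axis, so it is
    # sensitive to motion toward or away from the sensor.
    theta = np.arccos(np.clip(rel[:, 2] / (r + 1e-8), -1.0, 1.0))
    phi = np.arctan2(rel[:, 1], rel[:, 0])  # azimuth in the x-y plane
    return np.stack([r_norm, theta, phi], axis=1)

def bag_of_joint_features(frames, codebook):
    """Quantize per-frame joint features against a codebook of feature
    centroids and accumulate a normalized histogram (the bag-of-words step)."""
    hist = np.zeros(len(codebook))
    for frame in frames:
        for f in spherical_joint_features(frame):
            hist[np.argmin(np.linalg.norm(codebook - f, axis=1))] += 1
    return hist / max(hist.sum(), 1)
```

The resulting fixed-length histogram (one per action sample, optionally concatenated with temporal-pyramid histograms over sub-sequences) is what a classifier such as an SVM would consume.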

References

  1. Bobick, A. F. and Davis, J. W. (2001). The recognition of human movement using temporal templates. IEEE Trans. Pattern Anal. Mach. Intell., 23(3):257-267.
  2. Chang, C. C. and Lin, C. J. (2011). LIBSVM: A Library for Support Vector Machines. ACM Trans. Intell. Syst. Technol., 2(3).
  3. Jin, S. Y. and Choi, H. J. Essential body-joint and atomic action detection for human activity recognition using longest common subsequence algorithm. In Computer Vision - ACCV 2012 Workshops, volume 7729 of Lecture Notes in Computer Science, pages 148-159.
  4. Koppula, H., Gupta, R., and Saxena, A. (2013). Learning human activities and object affordances from RGB-D videos. IJRR, 32(8):951-970.
  5. Laptev, I. (2005). On space-time interest points. Int. J. Comput. Vision, 64(2-3):107-123.
  6. Laptev, I., Marszalek, M., Schmid, C., and Rozenfeld, B. (2008). Learning realistic human actions from movies. In Proceedings of the 2008 Conference on Computer Vision and Pattern Recognition, Los Alamitos, CA, USA.
  7. Lazebnik, S., Schmid, C., and Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, pages 2169-2178.
  8. Li, W., Zhang, Z., and Liu, Z. (2010). Action recognition based on a bag of 3D points. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
  9. Lv, F. and Nevatia, R. (2006). Recognition and segmentation of 3-d human action using HMM and Multi-class Adaboost. In Proceedings of the 9th European Conference on Computer Vision, pages 359-372.
  10. Ni, B., Wang, G., and Moulin, P. (2011). RGBD-HuDaAct: A color-depth video database for human daily activity recognition. In ICCV Workshops, pages 1147-1153. IEEE.
  11. Niebles, J. C., Wang, H., and Fei-Fei, L. (2008). Unsupervised learning of human action categories using spatial-temporal words. Int. J. Comput. Vision, 79(3):299-318.
  12. Poppe, R. (2010). A survey on vision-based human action recognition. Image Vision Comput., 28(6):976-990.
  13. Schuldt, C., Laptev, I., and Caputo, B. (2004). Recognizing human actions: A local SVM approach. In Proceedings of the 17th International Conference on Pattern Recognition, (ICPR'04), volume 3, pages 32-36.
  14. Sung, J., Ponce, C., Selman, B., and Saxena, A. (2011). Human activity detection from RGBD images. In Association for the Advancement of Artificial Intelligence (AAAI) workshop on Pattern, Activity and Intent Recognition.
  15. Sung, J., Ponce, C., Selman, B., and Saxena, A. (2012). Unstructured human activity detection from RGBD images. In International Conference on Robotics and Automation (ICRA).
  16. Swain, M. and Ballard, D. (1991). Color indexing. Int. J. Comput. Vision, 7(1):11-32.
  17. Turaga, P. K., Chellappa, R., Subrahmanian, V. S., and Udrea, O. (2008). Machine recognition of human activities: A survey. IEEE Trans. Circuits Syst. Video Techn., 18(11):1473-1488.
  18. Wang, J., Liu, Z., Wu, Y., and Yuan, J. (2012). Mining actionlet ensemble for action recognition with depth cameras. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR '12, pages 1290-1297.
  19. Yang, X. and Tian, Y. (2012). Eigenjoints-based action recognition using naïve-bayes-nearest-neighbor. In 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI, USA, June 16-21, 2012, pages 14-19.
  20. Yao, A., Gall, J., Fanelli, G., and Van Gool, L. (2011). Does human action recognition benefit from pose estimation? In BMVC, pages 67.1-67.11.


Paper Citation


in Harvard Style

Shukla P., Biswas K. and Kalra P. (2015). Bag-of-Features based Activity Classification using Body-joints Data. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2015) ISBN 978-989-758-089-5, pages 314-322. DOI: 10.5220/0005303103140322


in Bibtex Style

@conference{visapp15,
author={Parul Shukla and K.K. Biswas and Prem K. Kalra},
title={Bag-of-Features based Activity Classification using Body-joints Data},
booktitle={Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2015)},
year={2015},
pages={314-322},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005303103140322},
isbn={978-989-758-089-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2015)
TI - Bag-of-Features based Activity Classification using Body-joints Data
SN - 978-989-758-089-5
AU - Shukla P.
AU - Biswas K.
AU - Kalra P.
PY - 2015
SP - 314
EP - 322
DO - 10.5220/0005303103140322