SCALE-INDEPENDENT SPATIO-TEMPORAL STATISTICAL SHAPE REPRESENTATIONS FOR 3D HUMAN ACTION RECOGNITION

Marco Körner, Daniel Haase, Joachim Denzler

2012

Abstract

With depth-measuring devices for real-world scenarios having recently become available, 3D data is increasingly the focus of human action recognition. We propose a scheme for representing human actions in 3D that is designed to be invariant with respect to the actor's scale, rotation, and translation. Our approach employs Principal Component Analysis (PCA) as an exemplary technique from the domain of manifold learning. To distinguish actions by their execution speed, we include temporal information in our modeling scheme. Experiments performed on the CMU Motion Capture dataset show promising recognition rates as well as robustness with respect to noise and incorrect detection of landmarks.
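
The abstract does not spell out the processing steps, so the following is only a minimal sketch of one common way to obtain a shape representation that is invariant to translation, scale, and rotation before applying PCA. It assumes each frame is an array of 3D landmarks, removes translation and scale by centering and norm scaling, and removes rotation with an orthogonal Procrustes (Kabsch) fit to a reference pose. The function names, the synthetic data, and the choice of Procrustes alignment are illustrative assumptions and not necessarily the authors' method; the temporal part of the model is left out because the abstract does not describe it.

```python
import numpy as np


def normalize_shape(shape):
    """Remove translation and scale from one (n_landmarks, 3) landmark set."""
    centered = shape - shape.mean(axis=0)   # move the centroid to the origin
    norm = np.linalg.norm(centered)         # Frobenius norm of the centered shape
    if norm == 0.0:
        raise ValueError("degenerate shape: all landmarks coincide")
    return centered / norm


def align_rotation(shape, reference):
    """Rotate a normalized shape onto a reference (Kabsch algorithm)."""
    u, _, vt = np.linalg.svd(shape.T @ reference)
    # Force a proper rotation (determinant +1), excluding reflections.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    return shape @ rotation


def fit_pca(samples, n_components):
    """PCA via SVD on row-wise samples; returns mean, components, variances."""
    mean = samples.mean(axis=0)
    _, s, vt = np.linalg.svd(samples - mean, full_matrices=False)
    variances = s ** 2 / (len(samples) - 1)
    return mean, vt[:n_components], variances[:n_components]


# Synthetic stand-in for motion capture frames (assumption: 15 landmarks).
rng = np.random.default_rng(0)
reference = normalize_shape(rng.normal(size=(15, 3)))

frames = []
for _ in range(40):
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1                             # turn it into a rotation
    pose = reference + 0.05 * rng.normal(size=reference.shape)
    # Apply an arbitrary scale, rotation, and translation to the pose.
    frames.append(rng.uniform(0.5, 2.0) * pose @ q + rng.normal(size=3))

aligned = np.stack([align_rotation(normalize_shape(f), reference) for f in frames])
mean, components, variances = fit_pca(aligned.reshape(len(aligned), -1), n_components=5)
```

Each row of `components` is one mode of shape variation in the flattened landmark space; projecting an aligned frame onto these rows yields a low-dimensional descriptor that a classifier could consume.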


Paper Citation


in Harvard Style

Körner M., Haase D. and Denzler J. (2012). SCALE-INDEPENDENT SPATIO-TEMPORAL STATISTICAL SHAPE REPRESENTATIONS FOR 3D HUMAN ACTION RECOGNITION. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8425-98-0, pages 288-294. DOI: 10.5220/0003766202880294


in Bibtex Style

@conference{icpram12,
author={Marco Körner and Daniel Haase and Joachim Denzler},
title={SCALE-INDEPENDENT SPATIO-TEMPORAL STATISTICAL SHAPE REPRESENTATIONS FOR 3D HUMAN ACTION RECOGNITION},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2012},
pages={288-294},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003766202880294},
isbn={978-989-8425-98-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - SCALE-INDEPENDENT SPATIO-TEMPORAL STATISTICAL SHAPE REPRESENTATIONS FOR 3D HUMAN ACTION RECOGNITION
SN - 978-989-8425-98-0
AU - Körner M.
AU - Haase D.
AU - Denzler J.
PY - 2012
SP - 288
EP - 294
DO - 10.5220/0003766202880294