Laptev, I. (2005). On space-time interest points.
International Journal of Computer Vision, 64(2-3):107–123.
Laptev, I., Marszałek, M., Schmid, C., and Rozenfeld,
B. (2008). Learning realistic human actions from
movies. In Computer Vision and Pattern Recognition,
2008. CVPR 2008. IEEE Conference on, pages 1–8.
IEEE.
Leibe, B., Leonardis, A., and Schiele, B. (2004). Combined
object categorization and segmentation with an implicit
shape model. In Workshop on Statistical Learning in
Computer Vision.
Lin, Y.-Y., Hua, J.-H., Tang, N. C., Chen, M.-H., and Liao,
H.-Y. M. (2014). Depth and skeleton associated action
recognition without online accessible RGB-D cameras.
In Computer Vision and Pattern Recognition (CVPR),
2014 IEEE Conference on, pages 2617–2624. IEEE.
Lowe, D. G. (2004). Distinctive image features from
scale-invariant keypoints. International Journal of
Computer Vision, 60(2):91–110.
Maji, S. and Malik, J. (2009). Object detection using a
max-margin Hough transform. In International Conference
on Computer Vision and Pattern Recognition.
Peng, X., Wang, L., Wang, X., and Qiao, Y. (2014). Bag of
visual words and fusion methods for action recognition:
Comprehensive study and good practice. arXiv preprint
arXiv:1405.4506.
Song, Y., Liu, S., and Tang, J. (2015). Describing trajectory
of surface patch for human action recognition on RGB
and depth videos. Signal Processing Letters, IEEE,
22(4):426–429.
Sun, J., Wu, X., Yan, S., Cheong, L., Chua, T., and Li, J.
(2009). Hierarchical spatio-temporal context modeling
for action recognition. In International Conference on
Computer Vision and Pattern Recognition.
Tenorth, M., Bandouch, J., and Beetz, M. (2009). The TUM
kitchen data set of everyday manipulation activities
for motion tracking and action recognition. In
International Conference on Computer Vision Workshops.
Tian, Y., Sukthankar, R., and Shah, M. (2013).
Spatiotemporal deformable part models for action
detection. In Computer Vision and Pattern Recognition
(CVPR), 2013 IEEE Conference on. IEEE.
Ullah, M. M., Parizi, S. N., and Laptev, I. (2010).
Improving bag-of-features action recognition with
non-local cues. In BMVC, volume 10, pages 95–1. Citeseer.
Wang, H., Kläser, A., Schmid, C., and Liu, C.-L. (2011).
Action recognition by dense trajectories. In IEEE
Conference on Computer Vision & Pattern Recognition,
pages 3169–3176, Colorado Springs, United States.
Wang, H., Kläser, A., Schmid, C., and Liu, C.-L. (2013).
Dense trajectories and motion boundary descriptors for
action recognition. International Journal of Computer
Vision, 103(1):60–79.
Wang, H. and Schmid, C. (2013). Action recognition with
improved trajectories. In Computer Vision (ICCV),
2013 IEEE International Conference on, pages 3551–
3558. IEEE.
Wang, J., Liu, Z., Wu, Y., and Yuan, J. (2012). Mining
actionlet ensemble for action recognition with depth
cameras. In Computer Vision and Pattern Recognition
(CVPR), 2012 IEEE Conference on, pages 1290–1297. IEEE.
Wang, J., Nie, X., Xia, Y., Wu, Y., and Zhu, S.-C. (2014).
Cross-view action modeling, learning, and recognition.
In Computer Vision and Pattern Recognition (CVPR),
2014 IEEE Conference on, pages 2649–2656. IEEE.
Wohlhart, P., Schulter, S., Köstinger, M., Roth, P., and
Bischof, H. (2012). Discriminative Hough forests for
object detection. In British Machine Vision Conference.
Xia, L. and Aggarwal, J. (2013). Spatio-temporal depth
cuboid similarity feature for activity recognition using
depth camera. In Computer Vision and Pattern Recognition
(CVPR), 2013 IEEE Conference on, pages 2834–2841. IEEE.
Xiaohan Nie, B., Xiong, C., and Zhu, S.-C. (2015). Joint
action recognition and pose estimation from video. In
Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, pages 1293–1301.
Yao, A., Gall, J., Fanelli, G., and Van Gool, L. (2011). Does
human action recognition benefit from pose estimation?
In British Machine Vision Conference.
Yao, A., Gall, J., and Van Gool, L. (2010). A Hough
transform-based voting framework for action recognition.
In International Conference on Computer Vision and
Pattern Recognition.
Zhang, Y. and Chen, T. (2010). Implicit shape kernel for
discriminative learning of the Hough transform detector.
In British Machine Vision Conference.