A Group Contextual Model for Activity Recognition in Crowded Scenes

Khai N. Tran; Xu Yan; Ioannis A. Kakadiaris; Shishir K. Shah

doi:10.5220/0005258600050012

2015

Abstract

This paper presents an efficient framework for activity recognition based on analyzing group context in crowded scenes. We use a graph-based clustering algorithm to discover interacting groups using a top-down mechanism. Using the discovered interacting groups, we propose a new group context activity descriptor that captures not only the focal person’s activity but also the behaviors of its neighbors. For a high-level understanding of human activities, we propose a random field model that encodes activity relationships between people in the scene. We evaluate our approach on two public benchmark datasets. The results of both steps show that our method achieves recognition rates comparable to state-of-the-art methods for activity recognition in crowded scenes.
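
The abstract does not say which clustering algorithm is used or how the descriptor is laid out, so the following is only a minimal sketch of the first two steps under stated assumptions. It swaps in a simple stand-in for group discovery (connected components of a distance-threshold graph) and builds a hypothetical descriptor that concatenates the focal person's one-hot activity with a normalized histogram of the activities of the other group members. The names `discover_groups`, `group_context_descriptor`, and `radius` are illustrative and do not come from the paper; the random field step is not covered.

```python
from collections import Counter
from math import hypot


def discover_groups(positions, radius):
    """Group people by connected components of a proximity graph.

    Assumption: two people interact if they stand within `radius` of each
    other. This is a stand-in, not the clustering used in the paper.
    """
    n = len(positions)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = positions[i], positions[j]
            if hypot(xi - xj, yi - yj) <= radius:
                parent[find(i)] = find(j)  # merge the two components

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def group_context_descriptor(focal, group, labels, classes):
    """Hypothetical descriptor: focal one-hot + histogram of neighbors' labels."""
    own = [1.0 if labels[focal] == c else 0.0 for c in classes]
    neighbors = [p for p in group if p != focal]
    counts = Counter(labels[p] for p in neighbors)
    total = max(len(neighbors), 1)  # avoid division by zero for singletons
    context = [counts[c] / total for c in classes]
    return own + context


if __name__ == "__main__":
    positions = [(0, 0), (1, 0), (10, 10), (11, 10), (30, 30)]
    labels = ["talk", "talk", "walk", "walk", "wait"]
    classes = ["talk", "walk", "wait"]
    for group in discover_groups(positions, radius=2.0):
        for person in group:
            print(person, group_context_descriptor(person, group, labels, classes))
```

In this toy layout the descriptor of a person standing alone has an all-zero context half, which is one simple way to tell isolated people from group members.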

Paper Citation

in Harvard Style

Tran K., Yan X., Kakadiaris I. and Shah S. (2015). A Group Contextual Model for Activity Recognition in Crowded Scenes. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015) ISBN 978-989-758-090-1, pages 5-12. DOI: 10.5220/0005258600050012

in Bibtex Style

@conference{visapp15,
author={Khai N. Tran and Xu Yan and Ioannis A. Kakadiaris and Shishir K. Shah},
title={A Group Contextual Model for Activity Recognition in Crowded Scenes},
booktitle={Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)},
year={2015},
pages={5-12},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005258600050012},
isbn={978-989-758-090-1},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)
TI - A Group Contextual Model for Activity Recognition in Crowded Scenes
SN - 978-989-758-090-1
AU - Tran K.
AU - Yan X.
AU - Kakadiaris I.
AU - Shah S.
PY - 2015
SP - 5
EP - 12
DO - 10.5220/0005258600050012