Social Cues in Group Formation and Local Interactions for Collective Activity Analysis

Khai N. Tran, Apurva Bedagkar-Gala, Ioannis A. Kakadiaris, Shishir K. Shah

2013

Abstract

This paper presents a novel and efficient framework for group activity analysis. People in a scene can be intuitively represented by an undirected graph in which vertices are people and the edge between two people is weighted by how strongly they interact. Social signaling cues are used to describe the degree of interaction between people. We propose a graph-based clustering algorithm to discover interacting groups in crowded scenes. Grouping the people in the scene isolates the groups engaged in the dominant activity, effectively eliminating dataset contamination. For each discovered interacting group, we create a descriptor that captures the motion and interaction of the people within it. A bag-of-words approach is used to represent group activity, and an SVM classifier is used for activity recognition. The proposed framework is evaluated on its ability to discover interacting groups and to recognize group activities using two public datasets. The results of both steps show that our method outperforms state-of-the-art methods for group discovery and achieves recognition rates comparable to state-of-the-art methods for group activity recognition.
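
To make the graph representation in the abstract concrete, the following is a minimal Python sketch, not the authors' method. It assumes a hypothetical interaction cue (a Gaussian falloff with ground-plane distance) in place of the paper's social signaling cues, and it swaps in simple thresholded connected components for the paper's graph-based clustering algorithm. The names `interaction_weight`, `build_graph`, `discover_groups`, and the parameters `sigma` and `threshold` are illustrative only.

```python
import math
from itertools import combinations


def interaction_weight(p, q, sigma=1.5):
    """Hypothetical cue: Gaussian falloff with distance (an assumption, not the paper's cue)."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def build_graph(positions, sigma=1.5):
    """Undirected weighted graph as {(i, j): weight} for every pair of people with i < j."""
    return {
        (i, j): interaction_weight(positions[i], positions[j], sigma)
        for i, j in combinations(range(len(positions)), 2)
    }


def discover_groups(positions, threshold=0.5, sigma=1.5):
    """Stand-in clustering: connected components over edges with weight >= threshold."""
    parent = list(range(len(positions)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), w in build_graph(positions, sigma).items():
        if w >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(positions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Five people: three standing close together, two far away from them.
positions = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0), (8.0, 8.0), (9.0, 8.0)]
groups = discover_groups(positions)
print(groups)  # [[0, 1, 2], [3, 4]]
```

In a fuller pipeline each discovered group would then be described by a motion and interaction descriptor, quantized into a bag-of-words histogram, and passed to an SVM classifier; those stages are omitted here because the abstract does not specify them.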



Paper Citation


in Harvard Style

Tran K. N., Bedagkar-Gala A., Kakadiaris I. A. and Shah S. K. (2013). Social Cues in Group Formation and Local Interactions for Collective Activity Analysis. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013) ISBN 978-989-8565-47-1, pages 539-548. DOI: 10.5220/0004256505390548


in Bibtex Style

@conference{visapp13,
author={Khai N. Tran and Apurva Bedagkar-Gala and Ioannis A. Kakadiaris and Shishir K. Shah},
title={Social Cues in Group Formation and Local Interactions for Collective Activity Analysis},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)},
year={2013},
pages={539-548},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004256505390548},
isbn={978-989-8565-47-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)
TI - Social Cues in Group Formation and Local Interactions for Collective Activity Analysis
SN - 978-989-8565-47-1
AU - Tran, K. N.
AU - Bedagkar-Gala, A.
AU - Kakadiaris, I. A.
AU - Shah, S. K.
PY - 2013
SP - 539
EP - 548
DO - 10.5220/0004256505390548