Authors: Luca Ciampi 1; Carlos Santiago 2; Joao Costeira 2; Fabrizio Falchi 1; Claudio Gennaro 1 and Giuseppe Amato 1
Affiliations: 1 Institute of Information Science and Technologies, National Research Council, Pisa, Italy; 2 Instituto Superior Técnico (LARSyS/IST), Lisbon, Portugal
Keyword(s):
Video Violence Detection, Video Violence Classification, Action Recognition, Unsupervised Domain Adaptation, Deep Learning, Deep Learning for Visual Understanding, Video Surveillance.
Abstract:
Video violence detection is a subset of human action recognition aiming to detect violent behaviors in trimmed video clips. Current Computer Vision solutions based on Deep Learning approaches provide astonishing results. However, their success relies on large collections of labeled datasets for supervised learning to guarantee that they generalize well to diverse testing scenarios. Although plentiful annotated data may be available for some pre-specified domains, manual annotation is unfeasible for every ad-hoc target domain or task. As a result, in many real-world applications, there is a domain shift between the distributions of the train (source) and test (target) domains, causing a significant drop in performance at inference time. To tackle this problem, we propose an Unsupervised Domain Adaptation scheme for video violence detection based on single image classification that mitigates the domain gap between the two domains. We conduct experiments considering as the labeled source domain some datasets containing violent/non-violent clips in general contexts and, as the target domain, a collection of videos specific to detecting violent actions in public transport, showing that our proposed solution can improve the performance of the considered models.
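The sketch below illustrates the general idea of unsupervised domain adaptation applied to frame-level (single image) violence classification. It assumes a DANN-style gradient-reversal setup with a toy CNN backbone; the class names, backbone, and loss weighting `lambda_da` are hypothetical stand-ins and do not reproduce the authors' actual architecture or training scheme.

```python
# Minimal, illustrative UDA sketch for frame-level violence classification.
# Assumption: a DANN-style domain-adversarial objective; this is NOT the
# paper's exact method, only a generic example of mitigating domain shift
# between labeled source frames and unlabeled target frames.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, reverses gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None


class FrameUDAModel(nn.Module):
    def __init__(self, lambda_da: float = 1.0):
        super().__init__()
        self.lambda_da = lambda_da
        # Tiny CNN backbone (stand-in for any image encoder).
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Violent / non-violent classifier, trained on labeled source frames.
        self.classifier = nn.Linear(64, 2)
        # Domain discriminator (source vs. target); gradient reversal pushes
        # the backbone toward domain-invariant features.
        self.domain_head = nn.Linear(64, 2)

    def forward(self, x):
        feats = self.features(x)
        cls_logits = self.classifier(feats)
        dom_logits = self.domain_head(GradReverse.apply(feats, self.lambda_da))
        return cls_logits, dom_logits


def train_step(model, optimizer, src_frames, src_labels, tgt_frames):
    """One step: supervised loss on source, adversarial domain loss on both."""
    optimizer.zero_grad()
    src_cls, src_dom = model(src_frames)
    _, tgt_dom = model(tgt_frames)          # target frames have no class labels
    cls_loss = F.cross_entropy(src_cls, src_labels)
    dom_logits = torch.cat([src_dom, tgt_dom])
    dom_labels = torch.cat([
        torch.zeros(len(src_frames), dtype=torch.long),  # source domain = 0
        torch.ones(len(tgt_frames), dtype=torch.long),   # target domain = 1
    ])
    loss = cls_loss + F.cross_entropy(dom_logits, dom_labels)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    model = FrameUDAModel()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    # Random tensors stand in for source frames (labeled) and target frames (unlabeled).
    src = torch.randn(8, 3, 112, 112)
    lbl = torch.randint(0, 2, (8,))
    tgt = torch.randn(8, 3, 112, 112)
    print(train_step(model, opt, src, lbl, tgt))
```

At inference time, clip-level predictions could be obtained by aggregating the frame-level scores (e.g., averaging over sampled frames); the aggregation strategy here is likewise an assumption, not a detail taken from the abstract.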