Authors:
Carlos Caetano; Jefersson A. dos Santos and William Robson Schwartz
Affiliation:
Universidade Federal de Minas Gerais, Brazil
Keyword(s):
Spatiotemporal Features, Co-occurrence, Action Recognition.
Related Ontology Subjects/Areas/Topics:
Computer Vision, Visualization and Computer Graphics; Features Extraction; Image and Video Analysis; Motion, Tracking and Stereo Vision; Video Surveillance and Event Detection
Abstract:
In this paper, we propose a novel spatiotemporal feature representation based on co-occurrence matrices of codewords, called Co-occurrence of Codewords (CCW), to tackle human action recognition, a significant problem for many real-world applications such as surveillance, video retrieval and health care. The method captures local relationships among densely sampled codewords through the computation of a set of statistical measures known as Haralick textural features. We apply a classical visual recognition pipeline that involves the extraction of spatiotemporal features and SVM classification. We evaluate the proposed representation on three well-known, publicly available datasets for action recognition (KTH, UCF Sports and HMDB51) and show that it outperforms the results achieved by several widely employed spatiotemporal features from the literature encoded by a Bag-of-Words model, while using a more compact representation.
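The general idea of a codeword co-occurrence matrix summarized by Haralick-style statistics can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single 2-D grid of codeword indices rather than a spatiotemporal volume, one neighbor offset, and only four of the Haralick measures. The function names `cooccurrence_matrix` and `haralick_features` are hypothetical. Contrast and homogeneity depend on the index difference, so they are only meaningful here under the further assumption that the codeword indices carry some ordering.

```python
import numpy as np

def cooccurrence_matrix(codes, n_codewords, offset=(0, 1)):
    """Count pairs (codeword at a cell, codeword at the cell shifted by offset)."""
    dy, dx = offset
    h, w = codes.shape
    # Two aligned windows: src holds each cell, dst holds its shifted neighbor.
    src = codes[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    dst = codes[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    m = np.zeros((n_codewords, n_codewords), dtype=np.float64)
    # np.add.at accumulates correctly when the same (i, j) pair repeats.
    np.add.at(m, (src.ravel(), dst.ravel()), 1.0)
    return m

def haralick_features(m):
    """Compute a few Haralick-style statistics from a co-occurrence matrix."""
    total = m.sum()
    if total == 0:
        raise ValueError("co-occurrence matrix is empty")
    p = m / total  # normalize counts to joint probabilities
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),  # angular second moment
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }

# Toy grid of codeword indices (illustrative data, not taken from the paper).
codes = np.array([[0, 0, 1],
                  [1, 2, 2],
                  [0, 1, 2]])
matrix = cooccurrence_matrix(codes, n_codewords=3, offset=(0, 1))
features = haralick_features(matrix)
```

In such a sketch, the resulting feature vector has a length tied to the number of statistics and offsets rather than to the codebook size, which is one way a co-occurrence-based descriptor can end up more compact than a Bag-of-Words histogram.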