Authors:
Jordi Bautista-Ballester 1; Jaume Vergés-Llahí 2 and Domenec Puig 3
Affiliations:
1 ATEKNEA Solutions and Universitat Rovira i Virgili, Spain; 2 ATEKNEA Solutions, Spain; 3 Universitat Rovira i Virgili, Spain
Keyword(s):
Multimodal Learning, Action Recognition, Bag of Visual Words, Multikernel Support Vector Machines.
Related Ontology Subjects/Areas/Topics:
Applications; Applications and Services; Computer Vision, Visualization and Computer Graphics; Enterprise Information Systems; Human and Computer Interaction; Human-Computer Interaction; Pattern Recognition; Robotics; Software Engineering
Abstract:
Understanding human activities is one of the most challenging current topics in robotics. Whether for imitation or anticipation, robots operating in a human environment must recognize which action a human is performing. Action classification using a Bag of Words (BoW) representation has shown computational simplicity and good performance. However, the growing number of categories, including actions that are easily confused, and the addition of significant contextual and multimodal information, especially in human-robot interaction, have led most authors to focus their efforts on combining image descriptors. In this field, we propose the Contextual and Modal MultiKernel Learning Support Vector Machine (CMMKL-SVM). We introduce contextual information (objects directly related to the performed action, obtained by calculating the codebook from a set of points belonging to objects) and multimodal information (features from depth and 3D images, yielding two extra modalities of information in addition to RGB images). We encode the action videos using a BoW representation with both contextual and modal information and feed them to the optimal SVM kernel, formed as a linear combination of single kernels whose weights are learned. Experiments were carried out on two action databases, CAD-120 and HMDB. Our approach matches the results of similar state-of-the-art approaches on highly constrained databases and improves on them more as the database becomes more realistic, reaching a performance improvement of 14.27 % for HMDB.
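The pipeline outlined in the abstract (one BoW histogram per modality or context, one kernel per histogram, and a learned linear combination of those kernels) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the base kernel or the weight-learning procedure, so a histogram intersection kernel and a simple kernel-target alignment heuristic stand in for them, and all function names are hypothetical. It is not the authors' CMMKL-SVM implementation.

```python
import math


def bow_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword and
    return the L1-normalized histogram of codeword counts."""
    counts = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])),
        )
        counts[nearest] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def intersection_kernel(h1, h2):
    """Histogram intersection kernel (assumed base kernel)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram(histograms, kernel):
    """Gram matrix of one modality: kernel value for every pair of videos."""
    return [[kernel(a, b) for b in histograms] for a in histograms]


def alignment_weights(grams, labels):
    """Stand-in for weight learning: score each Gram matrix by its
    alignment with the ideal kernel y y^T (labels in {-1, +1}), clip
    negative scores to zero, and normalize so the weights sum to 1."""
    n = len(labels)
    scores = []
    for g in grams:
        inner = sum(g[i][j] * labels[i] * labels[j]
                    for i in range(n) for j in range(n))
        norm = math.sqrt(sum(v * v for row in g for v in row))
        # The Frobenius norm of y y^T is n when labels are +/-1.
        scores.append(max(inner / (norm * n), 0.0) if norm else 0.0)
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(grams)] * len(grams)  # fall back to uniform
    return [s / total for s in scores]


def combine_kernels(grams, weights):
    """Combined kernel K = sum_m beta_m * K_m with non-negative weights."""
    n = len(grams[0])
    return [
        [sum(w * g[i][j] for w, g in zip(weights, grams)) for j in range(n)]
        for i in range(n)
    ]
```

In such a setup, one Gram matrix would be built per modality (for example RGB, depth, 3D, and object context), and the combined matrix could then be passed to any SVM solver that accepts a precomputed kernel.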