Recognising Actions for Instructional Training using Pose Information: A Comparative Evaluation

Seán Bruton, Gerard Lacey

Abstract

Humans perform many complex tasks involving the manipulation of multiple objects. Recognition of the constituent actions of these tasks can be used to drive instructional training systems. The identities and poses of the objects used during such tasks are salient for the purposes of recognition. In this work, 3D object detection and registration techniques are used to identify and track the objects involved in an everyday task: preparing a cup of tea. The pose information serves as input to an action classification system that uses Long Short-Term Memory (LSTM) recurrent neural networks as part of a deep architecture. An advantage of this approach is that it can represent the complex dynamics of object and human poses at hierarchical levels without the need to design specific spatio-temporal features. By using such compact features, we demonstrate the feasibility of using the hyperparameter optimisation technique of Tree-structured Parzen Estimators to identify optimal hyperparameters as well as network architectures. The resulting recognition rate of 83% shows that this approach is viable for similar pervasive computing scenarios where prior scene knowledge exists.
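As a rough illustration of what "compact pose features" could look like as input to a recurrent classifier, the sketch below packs per-object poses into a fixed-length vector per frame and cuts a recording into overlapping windows. It is not taken from the paper: the object names, the 7-value pose layout (position plus quaternion), the visibility flag, and the helper names `frame_features` and `sliding_windows` are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical object set for a tea-making scene; names are illustrative only.
OBJECTS = ("cup", "kettle", "teabag_box", "milk")

# Assumed pose layout: x, y, z position followed by a unit quaternion (qw, qx, qy, qz).
Pose = Tuple[float, float, float, float, float, float, float]
POSE_DIM = 7


def frame_features(poses: Dict[str, Optional[Pose]]) -> List[float]:
    """Build one fixed-length feature vector for a single frame.

    Each object contributes a visibility flag followed by its pose;
    an undetected object contributes a 0.0 flag and a zeroed pose.
    """
    features: List[float] = []
    for name in OBJECTS:
        pose = poses.get(name)
        if pose is None:
            features.append(0.0)
            features.extend([0.0] * POSE_DIM)
        else:
            features.append(1.0)
            features.extend(float(v) for v in pose)
    return features


def sliding_windows(
    frames: Sequence[List[float]], length: int, stride: int
) -> List[List[List[float]]]:
    """Split a sequence of frame vectors into fixed-length, possibly overlapping windows."""
    return [
        list(frames[start:start + length])
        for start in range(0, len(frames) - length + 1, stride)
    ]
```

Each window produced this way has shape (length, features per frame) and could be fed to a sequence model such as an LSTM; the window length and stride would themselves be candidates for hyperparameter search.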


Paper Citation


in Harvard Style

Bruton S. and Lacey G. (2019). Recognising Actions for Instructional Training using Pose Information: A Comparative Evaluation. In Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, ISBN 978-989-758-354-4, pages 482-489. DOI: 10.5220/0007395304820489


in Bibtex Style

@conference{visapp19,
author={Seán Bruton and Gerard Lacey},
title={Recognising Actions for Instructional Training using Pose Information: A Comparative Evaluation},
booktitle={Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP},
year={2019},
pages={482--489},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007395304820489},
isbn={978-989-758-354-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP
TI - Recognising Actions for Instructional Training using Pose Information: A Comparative Evaluation
SN - 978-989-758-354-4
AU - Bruton S.
AU - Lacey G.
PY - 2019
SP - 482
EP - 489
DO - 10.5220/0007395304820489