Authors:
Nassim Mokhtari; Alexis Nédélec and Pierre De Loor
Affiliation:
Lab-STICC (CNRS UMR 6285), ENIB, Centre Européen de Réalité Virtuelle, Brest, France
Keyword(s):
3D Skeleton Data, Spatio-temporal Image Encoding, Sliding Window, Online Action Recognition, Human Activity Recognition, Deep Learning.
Abstract:
Human activity recognition (HAR) based on skeleton data, which can be extracted from videos or provided by a depth camera (Kinect, for example), is a time series classification problem in which handling both spatial and temporal dependencies is crucial to achieving good recognition. In online human activity recognition, identifying the beginning and end of an action is important but can be difficult in a continuous data flow. In this work, we present a 3D skeleton data encoding method that generates an image preserving the spatial and temporal dependencies between the skeletal joints. To allow online action detection, we combine this encoding with a sliding window over the continuous data stream. In this way, no start or stop timestamp is needed, and recognition can be performed at any moment. A deep learning CNN is used to perform the online action detection.
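The combination of a sliding window with an image encoding can be illustrated with a minimal sketch. The abstract does not specify the encoding, so the layout used here (one row per joint, one column per frame, x/y/z rescaled into three colour channels) is an assumption chosen for illustration and not necessarily the authors' method. The names `encode_window` and `sliding_windows` are hypothetical, and the CNN classifier is left out.

```python
from collections import deque


def encode_window(frames):
    """Encode a window of skeleton frames as an image (illustrative layout).

    `frames` is a list of frames; each frame is a list of (x, y, z) joints.
    Assumed layout: rows index joints, columns index time, and the three
    channels hold x, y, z rescaled to 0..255 over the window.
    """
    # Per-channel minimum and maximum over the whole window.
    lo = [min(joint[c] for frame in frames for joint in frame) for c in range(3)]
    hi = [max(joint[c] for frame in frames for joint in frame) for c in range(3)]
    # Avoid division by zero when a channel is constant in the window.
    span = [(hi[c] - lo[c]) or 1.0 for c in range(3)]
    n_joints = len(frames[0])
    return [
        [
            tuple(round(255 * (frame[j][c] - lo[c]) / span[c]) for c in range(3))
            for frame in frames
        ]
        for j in range(n_joints)
    ]


def sliding_windows(stream, size, stride=1):
    """Yield one encoded image per window over a continuous frame stream.

    No start or stop timestamp is used: once `size` frames are buffered,
    an image is produced every `stride` new frames.
    """
    buffer = deque(maxlen=size)  # oldest frame is dropped automatically
    since_last = 0
    for frame in stream:
        buffer.append(frame)
        since_last += 1
        if len(buffer) == size and since_last >= stride:
            since_last = 0
            yield encode_window(list(buffer))


# Toy stream: 6 frames, 2 joints per frame (synthetic values, not real data).
stream = [[(float(t), 0.0, 1.0), (float(t + 1), 0.0, 2.0)] for t in range(6)]
images = list(sliding_windows(stream, size=4))
```

In an online setting, each yielded image would be passed to the trained classifier as soon as it is produced, so a prediction is available at every step of the stream. The window size and stride are free parameters in this sketch; the abstract does not state the values the authors use.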