Authors:
Victoria Manousaki
1
;
2
and
Antonis Argyros
1
;
2
Affiliations:
1
Computer Science Department, University of Crete, Heraklion, Greece
;
2
Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece
Keyword(s):
Soft Dynamic Time Warping, Segregational, Action Prediction, Alignment, Duration Prediction.
Abstract:
Aligning the execution of complete actions captured in segmented videos has been a problem explored by Dynamic Time Warping (DTW) and Soft Dynamic Time Warping (S-DTW) algorithms. The limitation of these algorithms is that they cannot align unsegmented actions, i.e., actions that appear between other actions. This limitation is mitigated by the use of two existing DTW variants, namely the Open-End DTW (OE-DTW) and the Open-Begin-End DTW (OBE-DTW). OE-DTW is designed for aligning actions of known begin point but unknown end point, while OBE-DTW handles continuous, completely unsegmented actions with unknown begin and end points. In this paper, we combine the merits of S-DTW with those of OE-DTW and OBE-DTW. In that direction, we propose two new DTW variants, the Open-End Soft DTW (OE-S-DTW) and the Open-Begin-End Soft DTW (OBE-S-DTW). The superiority of the proposed algorithms lies in the combination of the soft-minimum operator and the relaxation of the boundary constraints of S-DTW,
with the segregational capabilities of OE-DTW and OBE-DTW, resulting in better and differentiable action alignment in the case of continuous, unsegmented videos. We evaluate the proposed algorithms on the task of action prediction on standard datasets such as MHAD, MHAD101-v/-s, MSR Daily Activities and CAD-120. Our experimental results show the superiority of the proposed algorithms to existing video alignment methods.
(More)