Skeleton-based Online Sign Language Recognition using Monotonic Attention
Natsuki Takayama, Gibran Benitez-Garcia, Hiroki Takahashi
2022
Abstract
Sequence-to-sequence models have been successfully applied to improve continuous sign language word recognition in recent years. Although various methods for continuous sign language word recognition have been proposed, these methods assume offline recognition and lack further investigation in online and streaming situations. In this study, skeleton-based continuous sign language word recognition for online situations was investigated. A combination of spatial-temporal graph convolutional networks and recurrent neural networks with soft attention was employed as the base model. Further, three types of monotonic attention techniques were applied to extend the base model for online recognition: hard monotonic attention, monotonic chunkwise attention, and monotonic infinite lookback attention. The performance of the proposed models was evaluated in offline and online recognition settings. A conventional Japanese sign language video dataset, including 275 types of isolated word videos and 113 types of sentence videos, was utilized to evaluate the proposed models. The results demonstrated the effectiveness of monotonic attention for online continuous sign language word recognition.
Paper Citation
in Harvard Style
Takayama N., Benitez-Garcia G. and Takahashi H. (2022). Skeleton-based Online Sign Language Recognition using Monotonic Attention. In Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 5: VISAPP; ISBN 978-989-758-555-5, SciTePress, pages 601-608. DOI: 10.5220/0010899400003124
in Bibtex Style
@conference{visapp22,
author={Natsuki Takayama and Gibran Benitez-Garcia and Hiroki Takahashi},
title={Skeleton-based Online Sign Language Recognition using Monotonic Attention},
booktitle={Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 5: VISAPP},
year={2022},
pages={601-608},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010899400003124},
isbn={978-989-758-555-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 5: VISAPP
TI - Skeleton-based Online Sign Language Recognition using Monotonic Attention
SN - 978-989-758-555-5
AU - Takayama N.
AU - Benitez-Garcia G.
AU - Takahashi H.
PY - 2022
SP - 601
EP - 608
DO - 10.5220/0010899400003124
PB - SciTePress