Emotion Transformer: Attention Model for Pose-Based Emotion Recognition
Pedro V. V. Paiva, Josué Ramos, Marina Gavrilova, Marco A. G. Carvalho
2023
Abstract
Capturing humans’ emotional states from images in real-world scenarios is a key problem in affective computing, with a range of practical applications. Emotion recognition methods can enhance video games to increase engagement, help students stay motivated during e-learning sessions, or make interaction more natural in social robotics. Body movements, a crucial component of non-verbal communication, remain less explored in emotion recognition, whereas facial expression-based methods are widely investigated. Transformer networks have been successfully applied across several domains, bringing significant breakthroughs. The Transformer self-attention mechanism captures relationships between features at different spatial locations, allowing contextual information to be extracted. In this work, we introduce Emotion Transformer, a self-attention architecture that leverages the spatial configuration of body joints for Body Emotion Recognition. Our approach is based on the visual transformer linear projection function, which converts 2D joint coordinates into a regular matrix representation. The projected matrix then feeds a standard transformer multi-head attention architecture. By learning contextual information, the method correlates joint movements over time more robustly to recognize emotions. We present an evaluation benchmark for acted emotional sequences extracted from movie scenes using the BoLD dataset. The proposed method outperforms several state-of-the-art architectures, demonstrating its effectiveness.
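The two steps the abstract describes, a linear projection of 2D joint coordinates followed by multi-head self-attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the joint count, embedding size, head count, random weights, and the function names `project_joints` and `multi_head_self_attention` are all assumptions chosen for illustration. The sketch handles a single frame and omits positional encoding, layer normalization, the feed-forward layers, and the emotion classifier.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def project_joints(joints, w_embed, b_embed):
    """Linearly project each joint's (x, y) pair into a d_model-dim token.

    joints: (num_joints, 2) -> returns (num_joints, d_model).
    """
    return joints @ w_embed + b_embed

def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over joint tokens, split into heads."""
    n, d_model = tokens.shape
    d_head = d_model // num_heads

    def split(x):
        # (n, d_model) -> (num_heads, n, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(tokens @ w_q), split(tokens @ w_k), split(tokens @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n, n)
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    context = weights @ v                                # (heads, n, d_head)
    merged = context.transpose(1, 0, 2).reshape(n, d_model)
    return merged @ w_o, weights

# Hypothetical sizes, not taken from the paper.
NUM_JOINTS, D_MODEL, NUM_HEADS = 18, 32, 4

rng = np.random.default_rng(0)
joints = rng.uniform(0.0, 1.0, size=(NUM_JOINTS, 2))  # stand-in for normalized 2D pose
w_embed = rng.normal(scale=0.1, size=(2, D_MODEL))
b_embed = np.zeros(D_MODEL)
w_q, w_k, w_v, w_o = (rng.normal(scale=0.1, size=(D_MODEL, D_MODEL)) for _ in range(4))

tokens = project_joints(joints, w_embed, b_embed)
output, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, NUM_HEADS)
```

In this sketch each joint becomes one token, so every attention matrix in `attn` is joint-by-joint: row i holds the weights joint i assigns to all joints in the frame. Extending the idea to a sequence would require stacking frames and adding some form of temporal encoding, which the sketch leaves out.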
Paper Citation
in Harvard Style
Paiva P. V. V., Ramos J., Gavrilova M. and Carvalho M. A. G. (2023). Emotion Transformer: Attention Model for Pose-Based Emotion Recognition. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP; ISBN 978-989-758-634-7, SciTePress, pages 274-281. DOI: 10.5220/0011791700003417
in Bibtex Style
@conference{visapp23,
author={Pedro V. V. Paiva and Josué Ramos and Marina Gavrilova and Marco A. G. Carvalho},
title={Emotion Transformer: Attention Model for Pose-Based Emotion Recognition},
booktitle={Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP},
year={2023},
pages={274-281},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011791700003417},
isbn={978-989-758-634-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP
TI - Emotion Transformer: Attention Model for Pose-Based Emotion Recognition
SN - 978-989-758-634-7
AU - Paiva, P. V. V.
AU - Ramos, J.
AU - Gavrilova, M.
AU - Carvalho, M. A. G.
PY - 2023
SP - 274
EP - 281
DO - 10.5220/0011791700003417
PB - SciTePress