Hybrid Time Distributed CNN-transformer for Speech Emotion Recognition

Anwer Slimi, Henri Nicolas, Mounir Zrigui

2022

Abstract

Following the success of transformers in recent years, a growing number of researchers are applying them across a variety of disciplines. Thanks to its attention mechanism, this architecture has overcome some of the limitations of classic deep learning models. Nonetheless, despite their advantageous structure, transformers have drawbacks. In this article, we introduce a novel hybrid architecture for Speech Emotion Recognition (SER) systems that combines the benefits of transformers and other deep learning models.


Paper Citation


in Harvard Style

Slimi A., Nicolas H. and Zrigui M. (2022). Hybrid Time Distributed CNN-transformer for Speech Emotion Recognition. In Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT, ISBN 978-989-758-588-3, pages 602-611. DOI: 10.5220/0011314900003266


in Bibtex Style

@conference{icsoft22,
author={Anwer Slimi and Henri Nicolas and Mounir Zrigui},
title={Hybrid Time Distributed CNN-transformer for Speech Emotion Recognition},
booktitle={Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2022},
pages={602-611},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011314900003266},
isbn={978-989-758-588-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT
TI - Hybrid Time Distributed CNN-transformer for Speech Emotion Recognition
SN - 978-989-758-588-3
AU - Slimi A.
AU - Nicolas H.
AU - Zrigui M.
PY - 2022
SP - 602
EP - 611
DO - 10.5220/0011314900003266