A Data Augmentation Approach for Improving the Performance of Speech Emotion Recognition

Georgia Paraskevopoulou, Evaggelos Spyrou, Stavros Perantonis

2022

Abstract

The recognition of human emotions is crucial for various applications related to human-computer interaction and for understanding users’ mood in several tasks. Typical machine learning approaches to this goal first extract a set of linguistic features from raw data, which are then used to train supervised learning models. Recently, Convolutional Neural Networks (CNNs), which, unlike traditional approaches, learn to extract the appropriate features from their inputs, have also been applied as emotion recognition classifiers. In this work, we adopt a CNN architecture that takes spectrograms extracted from audio signals as inputs, and we propose data augmentation techniques to boost classification performance. The proposed data augmentation approach includes noise addition, shifting of the audio signal, and changing its pitch or its speed. Experimental results indicate that the presented approach outperforms previous work that does not use augmented data.
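The augmentations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, default parameters, and the use of plain Python lists for the waveform are assumptions made for illustration. The speed change here is a naive linear-interpolation resampling, which alters pitch along with speed; changing pitch or speed independently needs a time-stretching step (for example a phase vocoder), which this sketch omits.

```python
import math
import random


def add_noise(signal, noise_level=0.005, seed=0):
    """Add zero-mean Gaussian noise scaled by noise_level (hypothetical default)."""
    rng = random.Random(seed)
    return [s + noise_level * rng.gauss(0.0, 1.0) for s in signal]


def shift(signal, n_samples):
    """Circularly shift the waveform to the right by n_samples."""
    if not signal:
        return []
    k = n_samples % len(signal)
    if k == 0:
        return list(signal)
    return signal[-k:] + signal[:-k]


def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 gives a shorter, faster signal.

    Played back at the original sampling rate, the result is also higher in
    pitch, so this couples speed and pitch rather than changing one alone.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not signal:
        return []
    last = len(signal) - 1
    n_out = max(1, int(round(len(signal) / rate)))
    out = []
    for i in range(n_out):
        pos = min(i * rate, last)      # fractional read position in the input
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


# Example: a 0.1 s, 440 Hz tone sampled at 16 kHz (assumed values).
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1600)]

augmented = [
    add_noise(tone),
    shift(tone, 160),         # shift by 10 ms
    change_speed(tone, 1.1),  # 10% faster
]
```

Each augmented waveform would then be converted to a spectrogram in the same way as the original recording before being passed to the CNN.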

Paper Citation


in Harvard Style

Paraskevopoulou G., Spyrou E. and Perantonis S. (2022). A Data Augmentation Approach for Improving the Performance of Speech Emotion Recognition. In Proceedings of the 19th International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, ISBN 978-989-758-591-3, pages 61-69. DOI: 10.5220/0011148000003289


in Bibtex Style

@conference{sigmap22,
author={Georgia Paraskevopoulou and Evaggelos Spyrou and Stavros Perantonis},
title={A Data Augmentation Approach for Improving the Performance of Speech Emotion Recognition},
booktitle={Proceedings of the 19th International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP},
year={2022},
pages={61-69},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011148000003289},
isbn={978-989-758-591-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 19th International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP
TI - A Data Augmentation Approach for Improving the Performance of Speech Emotion Recognition
SN - 978-989-758-591-3
AU - Paraskevopoulou G.
AU - Spyrou E.
AU - Perantonis S.
PY - 2022
SP - 61
EP - 69
DO - 10.5220/0011148000003289