Speech Detection of Real-Time MRI Vocal Tract Data
Jasmin Menges, Johannes Walter, Jasmin Bächle, Klemens Schnattinger
2023
Abstract
This paper investigates the potential of Deep Learning in the area of speech production. The purpose is to study whether algorithms can classify spoken content based only on images of the oral region. With the real-time MRI data of Lim et al., more detailed insights into speech production in the vocal tract can be obtained. In this project, the data was used to recognize spoken letters from tongue movements using a vector-based image detection approach. In addition, randomization was applied to generate more data. The pixel vectors of a video clip during which a certain letter was spoken could then be passed into a Deep Learning model. For this purpose, the neural networks LSTM and 3D-CNN were used. The results show that letters can be classified with an accuracy of 93% using a 3D-CNN model.
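As a rough illustration of the data preparation the abstract describes, the sketch below turns a short video clip into per-frame pixel vectors and applies one possible form of randomization. The abstract does not say which randomization was used, so the random temporal crop here is purely an assumption, and the function names `clip_to_pixel_vectors` and `random_temporal_crop` as well as the toy frame size are hypothetical.

```python
import random


def clip_to_pixel_vectors(clip):
    """Flatten each frame (a list of pixel rows) into one flat pixel vector."""
    return [[px for row in frame for px in row] for frame in clip]


def random_temporal_crop(clip, length, rng):
    """Return a random contiguous window of `length` frames.

    Assumed augmentation for illustration only; not taken from the paper.
    """
    if length > len(clip):
        raise ValueError("crop length exceeds clip length")
    start = rng.randrange(len(clip) - length + 1)
    return clip[start:start + length]


# Toy clip: 6 frames of 2x3 pixels; real MRI frames are far larger.
clip = [[[t * 10 + r * 3 + c for c in range(3)] for r in range(2)]
        for t in range(6)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
augmented = random_temporal_crop(clip, 4, rng)
vectors = clip_to_pixel_vectors(augmented)
```

A sequence model such as an LSTM would typically consume `vectors` as a frames-by-pixels sequence, whereas a 3D-CNN would usually take the unflattened frames so that spatial structure is preserved.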
Paper Citation
in Harvard Style
Menges J., Walter J., Bächle J. and Schnattinger K. (2023). Speech Detection of Real-Time MRI Vocal Tract Data. In Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR; ISBN 978-989-758-671-2, SciTePress, pages 182-187. DOI: 10.5220/0012155600003598
in Bibtex Style
@conference{kdir23,
author={Jasmin Menges and Johannes Walter and Jasmin Bächle and Klemens Schnattinger},
title={Speech Detection of Real-Time MRI Vocal Tract Data},
booktitle={Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2023},
pages={182-187},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012155600003598},
isbn={978-989-758-671-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Speech Detection of Real-Time MRI Vocal Tract Data
SN - 978-989-758-671-2
AU - Menges J.
AU - Walter J.
AU - Bächle J.
AU - Schnattinger K.
PY - 2023
SP - 182
EP - 187
DO - 10.5220/0012155600003598
PB - SciTePress