Speech Detection of Real-Time MRI Vocal Tract Data

Jasmin Menges, Johannes Walter, Jasmin Bächle, Klemens Schnattinger

2023

Abstract

This paper investigates the potential of Deep Learning in the area of speech production. The aim is to study whether algorithms can classify spoken content based only on images of the oral region. The real-time MRI data of Lim et al. provide more detailed insights into speech production in the vocal tract. In this project, the data was used to recognize spoken letters from tongue movements using a vector-based image detection approach. In addition, randomization was applied to generate more data. The pixel vectors of a video clip during which a certain letter was spoken could then be passed into a Deep Learning model. For this purpose, LSTM and 3D-CNN neural networks were used. The results show that letters can be classified with an accuracy of 93% using a 3D-CNN model.

Paper Citation


in Harvard Style

Menges J., Walter J., Bächle J. and Schnattinger K. (2023). Speech Detection of Real-Time MRI Vocal Tract Data. In Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR; ISBN 978-989-758-671-2, SciTePress, pages 182-187. DOI: 10.5220/0012155600003598


in Bibtex Style

@conference{kdir23,
author={Jasmin Menges and Johannes Walter and Jasmin Bächle and Klemens Schnattinger},
title={Speech Detection of Real-Time MRI Vocal Tract Data},
booktitle={Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2023},
pages={182-187},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012155600003598},
isbn={978-989-758-671-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Speech Detection of Real-Time MRI Vocal Tract Data
SN - 978-989-758-671-2
AU - Menges J.
AU - Walter J.
AU - Bächle J.
AU - Schnattinger K.
PY - 2023
SP - 182
EP - 187
DO - 10.5220/0012155600003598
PB - SciTePress