Improving Dysarthric Speech Intelligibility using Cycle-consistent Adversarial Training
Seung Hee Yang, Minhwa Chung
2020
Abstract
Dysarthria is a motor speech impairment affecting millions of people. Dysarthric speech can be far less intelligible than that of non-dysarthric speakers, causing significant communication difficulties. The goal of our work is to develop a model for dysarthric-to-healthy speech conversion using a Cycle-consistent GAN. Using 18,700 dysarthric and 8,610 healthy Korean utterances that were recorded in a previous study for the purpose of automatic recognition of voice keyboard, the generator is trained to transform dysarthric speech into healthy speech in the spectral domain, and the result is then converted back to speech. Objective evaluation using automatic speech recognition of the generated utterances on a held-out test set shows that, after adversarial training, recognition performance improves over the original dysarthric speech, with the SER lowered by 33.4% absolute. This demonstrates that the proposed GAN-based conversion method is useful for improving dysarthric speech intelligibility.
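The cycle-consistency idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss formulation, so the L1 reconstruction distance, the toy list-based "spectral frames", and the names `g_d2h`, `g_h2d`, `toy_d2h`, and `toy_h2d` are all assumptions made for illustration. The sketch only shows the general principle that mapping a frame to the other domain and back should reproduce the original frame.

```python
from typing import Callable, List

Frame = List[float]  # hypothetical stand-in for one spectral frame
Generator = Callable[[Frame], Frame]


def l1_distance(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two equal-length frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    g_d2h: Generator,  # assumed dysarthric -> healthy generator
    g_h2d: Generator,  # assumed healthy -> dysarthric generator
    dysarthric: Frame,
    healthy: Frame,
) -> float:
    """Sum of the forward and backward reconstruction errors."""
    # Forward cycle: dysarthric -> healthy -> dysarthric.
    forward = l1_distance(g_h2d(g_d2h(dysarthric)), dysarthric)
    # Backward cycle: healthy -> dysarthric -> healthy.
    backward = l1_distance(g_d2h(g_h2d(healthy)), healthy)
    return forward + backward


# Toy generators (assumptions): exact inverses, so both cycles reconstruct perfectly.
def toy_d2h(frame: Frame) -> Frame:
    return [2.0 * v for v in frame]


def toy_h2d(frame: Frame) -> Frame:
    return [0.5 * v for v in frame]


loss = cycle_consistency_loss(toy_d2h, toy_h2d, [1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(loss)  # 0.0, because the toy generators invert each other exactly
```

In an actual Cycle-consistent GAN, a term of this kind is typically combined with adversarial losses from one discriminator per domain, and the generators are neural networks rather than fixed functions.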
Paper Citation
in Harvard Style
Yang S. and Chung M. (2020). Improving Dysarthric Speech Intelligibility using Cycle-consistent Adversarial Training. In Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2020) - Volume 4: BIOSIGNALS; ISBN 978-989-758-398-8, SciTePress, pages 308-313. DOI: 10.5220/0009163003080313
in Bibtex Style
@conference{biosignals20,
author={Seung Hee Yang and Minhwa Chung},
title={Improving Dysarthric Speech Intelligibility using Cycle-consistent Adversarial Training},
booktitle={Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2020) - Volume 4: BIOSIGNALS},
year={2020},
pages={308-313},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009163003080313},
isbn={978-989-758-398-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2020) - Volume 4: BIOSIGNALS
TI - Improving Dysarthric Speech Intelligibility using Cycle-consistent Adversarial Training
SN - 978-989-758-398-8
AU - Yang S.
AU - Chung M.
PY - 2020
SP - 308
EP - 313
DO - 10.5220/0009163003080313
PB - SciTePress