Cross-lingual Detection of Dysphonic Speech for Dutch and Hungarian Datasets
Dávid Sztahó, Miklós Gábriel Tulics, Jinzi Qi, Hugo Van Hamme, Klára Vicsi
2022
Abstract
Dysphonic voices can be detected using features derived from speech samples. Work on this topic usually deals with mono-lingual experiments using a speech dataset in a single language. The present paper targets extension to a cross-lingual scenario, using a Hungarian and a Dutch speech dataset. Automatic binary separation of normal and dysphonic speech and dysphonia severity level estimation are performed and evaluated by various metrics. Speech features are calculated both over an entire speech sample and over a given phoneme. Feature selection and model training are done on the Hungarian dataset, and evaluation is done on the Dutch dataset. The results show that cross-lingual detection of dysphonic speech is possible on the applied corpora with acceptable generalization ability, and that features calculated on phoneme-level parts of speech can improve the results. On the cross-lingual classification test sets, the highest F1-scores are 0.86 and 0.81 for feature sets with the vowel /E/ included and excluded, respectively; for severity prediction, the highest Pearson correlations are 0.72 and 0.65 for feature sets with the vowel /E/ included and excluded, respectively.
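As a reading aid, the two metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration only: the label and severity values are made-up toy data, not results from the paper, and the function names are hypothetical. The abstract does not state whether the reported F1-score is for the dysphonic class or an average over classes; the sketch assumes the positive (dysphonic) class.

```python
import math

def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def pearson_r(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)

# Toy target-language test set (hypothetical values): 1 = dysphonic, 0 = normal.
true_labels = [1, 1, 1, 0, 0, 1, 0, 1]
pred_labels = [1, 1, 0, 0, 1, 1, 0, 1]

# Toy severity ratings and model predictions (hypothetical scale).
true_severity = [0, 1, 2, 3, 1, 2]
pred_severity = [0.2, 0.9, 1.7, 2.6, 1.4, 2.1]

print("F1:", f1_score(true_labels, pred_labels))
print("Pearson r:", pearson_r(true_severity, pred_severity))
```

In a cross-lingual setting such as the one described, both functions would be applied to predictions on the target-language (Dutch) samples from a model fitted only on the source-language (Hungarian) samples.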
Paper Citation
in Harvard Style
Sztahó D., Tulics M., Qi J., Van Hamme H. and Vicsi K. (2022). Cross-lingual Detection of Dysphonic Speech for Dutch and Hungarian Datasets. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 4: BIOSIGNALS; ISBN 978-989-758-552-4, SciTePress, pages 215-220. DOI: 10.5220/0010890200003123
in Bibtex Style
@conference{biosignals22,
author={Dávid Sztahó and Miklós Gábriel Tulics and Jinzi Qi and Hugo Van Hamme and Klára Vicsi},
title={Cross-lingual Detection of Dysphonic Speech for Dutch and Hungarian Datasets},
booktitle={Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 4: BIOSIGNALS},
year={2022},
pages={215-220},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010890200003123},
isbn={978-989-758-552-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 4: BIOSIGNALS
TI - Cross-lingual Detection of Dysphonic Speech for Dutch and Hungarian Datasets
SN - 978-989-758-552-4
AU - Sztahó D.
AU - Tulics M.
AU - Qi J.
AU - Van Hamme H.
AU - Vicsi K.
PY - 2022
SP - 215
EP - 220
DO - 10.5220/0010890200003123
PB - SciTePress