Speech Recognition for Minority Languages Using HuBERT and Model Adaptation
Tomohiro Hattori, Satoshi Tamura
2023
Abstract
In the field of speech recognition, models and datasets keep growing larger. However, it is difficult to create large datasets for minority languages, which is an obstacle to improving recognition accuracy. In this study, we attempt to improve recognition accuracy for minority languages by taking a model trained on a large dataset of a major language and then adapting its language processing part to the target language. Deep-learning speech recognition models are believed to learn both an acoustic processing part and a language processing part. The acoustic part may be largely common across languages and is expected to differ less between languages than the language part. Therefore, we investigate whether a recognizer can be built by keeping the acoustic processing learned from other languages and adapting the language processing to the minority language.
Paper Citation
in Harvard Style
Hattori T. and Tamura S. (2023). Speech Recognition for Minority Languages Using HuBERT and Model Adaptation. In Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-626-2, pages 350-355. DOI: 10.5220/0011682700003411
in Bibtex Style
@conference{icpram23,
author={Tomohiro Hattori and Satoshi Tamura},
title={Speech Recognition for Minority Languages Using HuBERT and Model Adaptation},
booktitle={Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2023},
pages={350-355},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011682700003411},
isbn={978-989-758-626-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Speech Recognition for Minority Languages Using HuBERT and Model Adaptation
SN - 978-989-758-626-2
AU - Hattori T.
AU - Tamura S.
PY - 2023
SP - 350
EP - 355
DO - 10.5220/0011682700003411