Speech Recognition for Minority Languages Using HuBERT and Model Adaptation

Tomohiro Hattori, Satoshi Tamura

2023

Abstract

In speech recognition, models and datasets keep growing larger. However, large datasets are difficult to create for minority languages, which is an obstacle to improving recognition accuracy. In this study, we attempt to improve recognition accuracy for minority languages by using models trained on large datasets of major languages and then adapting their language model part to the target language. Deep-learning speech recognition models are believed to learn both an acoustic processing part and a language processing part. The acoustic part may be largely shared across languages and differ less between them than the language part does. We therefore investigate whether a recognizer can be built by keeping the acoustic processing learned from other languages and adapting the language processing to the minority language.

Paper Citation


in Harvard Style

Hattori T. and Tamura S. (2023). Speech Recognition for Minority Languages Using HuBERT and Model Adaptation. In Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-626-2, pages 350-355. DOI: 10.5220/0011682700003411


in Bibtex Style

@conference{icpram23,
author={Tomohiro Hattori and Satoshi Tamura},
title={Speech Recognition for Minority Languages Using HuBERT and Model Adaptation},
booktitle={Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2023},
pages={350-355},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011682700003411},
isbn={978-989-758-626-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Speech Recognition for Minority Languages Using HuBERT and Model Adaptation
SN - 978-989-758-626-2
AU - Hattori T.
AU - Tamura S.
PY - 2023
SP - 350
EP - 355
DO - 10.5220/0011682700003411