Masry: A Text-to-Speech System for the Egyptian Arabic

Ahmed Azab, Ahmed Zaky, Tetsuji Ogawa, Walid Gomaa

2023

Abstract

This paper presents the development and evaluation of Masry, an end-to-end system designed to synthesize Egyptian Arabic speech. The proposed approach builds on the Tacotron speech synthesis models, Tacotron1 and Tacotron2, each integrated with a vocoder: Griffin-Lim for Tacotron1 and HiFi-GAN for Tacotron2. By synthesizing waveforms from mel-spectrograms, Masry offers a complete solution for generating natural and expressive Egyptian Arabic speech. To train and validate our system, we construct a dataset of a male speaker reading standard written pieces and news content in Egyptian Arabic. The data are recorded at a sampling rate of 44100 Hz, ensuring consistency and richness in the synthesized speech output. The performance of our system was rigorously evaluated using several metrics, with a particular focus on the Mean Opinion Score (MOS). The experimental results demonstrated the superiority of Tacotron2 over Tacotron1, with a MOS of 4.48 compared to 3.64. This underscores the system's capacity to capture and reproduce the nuances of Egyptian Arabic speech more effectively. In addition, the evaluation included word and character error rates (WER and CER), which give a quantitative measure of the accuracy of the synthesized speech.
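
For readers unfamiliar with the WER and CER metrics mentioned above, the following is a minimal sketch of how they are commonly computed from a reference text and a transcript of the synthesized audio. It is not the authors' evaluation code: the abstract does not say how the transcripts were obtained or how text was normalized, so the function names, the whitespace tokenization, and the choice to ignore spaces in CER are all assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words.
    Assumption: words are separated by whitespace."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.
    Assumption: whitespace is ignored when counting characters."""
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

Both functions return a ratio where 0.0 means the transcript matches the reference exactly; values above 1.0 are possible when the transcript contains many insertions.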

Paper Citation


in Harvard Style

Azab A., Zaky A., Ogawa T. and Gomaa W. (2023). Masry: A Text-to-Speech System for the Egyptian Arabic. In Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO; ISBN 978-989-758-670-5, SciTePress, pages 219-226. DOI: 10.5220/0012244300003543


in Bibtex Style

@conference{icinco23,
author={Ahmed Azab and Ahmed Zaky and Tetsuji Ogawa and Walid Gomaa},
title={Masry: A Text-to-Speech System for the Egyptian Arabic},
booktitle={Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2023},
pages={219-226},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012244300003543},
isbn={978-989-758-670-5},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - Masry: A Text-to-Speech System for the Egyptian Arabic
SN - 978-989-758-670-5
AU - Azab A.
AU - Zaky A.
AU - Ogawa T.
AU - Gomaa W.
PY - 2023
SP - 219
EP - 226
DO - 10.5220/0012244300003543
PB - SciTePress