A NOVEL WI DECODER FOR THE SEGMENTED FRAME DECODING IN THE TEXT-TO-SPEECH SYNTHESIZER

Kyungjin Byun, Nak-Woong Eum, Hee-Bum Jung

2010

Abstract

Implementing a high-quality text-to-speech (TTS) system requires a large amount of storage for its many speech segments, because current TTS synthesizers are mostly based on a technique known as synthesis by concatenation. Speech coders offer an efficient way to compress the database in a TTS system. Waveform interpolation (WI) has been shown to be an efficient speech coding algorithm that provides high-quality speech at low bit rates. However, a speech coder used in a TTS system has to differ from one used in communication applications, because the decoder in a TTS system must be able to decode segmented frames. In this paper, we propose a novel WI decoder scheme that can handle segmented frame decoding. The decoder can reconstruct good-quality speech even at concatenation boundaries, which makes it effective for TTS systems based on synthesis by concatenation.

References

  1. Kleijn, W. B., Haagen, L. J., 1995. Waveform interpolation for coding and synthesis. In Speech Coding and Synthesis. Elsevier Science B. V.
  2. Kleijn, W. B., 1993. Encoding Speech Using Prototype Waveforms. IEEE Trans. on Speech and Audio Processing, 1(4), pp. 386-399.
  3. Gottesman, O., Gersho, A., 2001. Enhanced Waveform Interpolative Coding at Low Bit-Rate. IEEE Trans. on Speech and Audio Processing, 9(8), pp. 786-798.
  4. Ritz, C. H., Burnett, I. S., Lukasiak, J., 2002. Extending waveform interpolation to wideband speech coding. Proc. IEEE workshop on Speech Coding, pp. 32-34.
  5. Ritz, C. H., Burnett, I. S., Lukasiak, J., 2003. Low bit rate wideband WI speech coding. Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 804-807.
  6. Vercken, O., Pierret, N., Dutoit, T., Pagel, V., Malfrere, F., 1997. New Techniques for the Compression of Synthesizer Databases. Proc. IEEE Int. Symp. on Circuits and Systems, pp. 2641-2644.


Paper Citation


in Harvard Style

Byun K., Eum N. and Jung H. (2010). A NOVEL WI DECODER FOR THE SEGMENTED FRAME DECODING IN THE TEXT-TO-SPEECH SYNTHESIZER. In Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010) ISBN 978-989-8425-19-5, pages 151-154. DOI: 10.5220/0002978901510154


in Bibtex Style

@conference{sigmap10,
author={Kyungjin Byun and Nak-Woong Eum and Hee-Bum Jung},
title={A NOVEL WI DECODER FOR THE SEGMENTED FRAME DECODING IN THE TEXT-TO-SPEECH SYNTHESIZER},
booktitle={Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010)},
year={2010},
pages={151-154},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002978901510154},
isbn={978-989-8425-19-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010)
TI - A NOVEL WI DECODER FOR THE SEGMENTED FRAME DECODING IN THE TEXT-TO-SPEECH SYNTHESIZER
SN - 978-989-8425-19-5
AU - Byun K.
AU - Eum N.
AU - Jung H.
PY - 2010
SP - 151
EP - 154
DO - 10.5220/0002978901510154