GENERATING PHONEMES FROM WRITTEN THAI USING LEXICAL ANALYSIS BASED ON REGULAR EXPRESSIONS

Leo van Moergestel, John-Jules Meyer

Abstract

This paper describes the approach and techniques used in software developed to generate phonemes from written Thai. The software has been used to generate the phonetic transcriptions of Thai words in a Thai-Dutch dictionary. Its most important component is a lexical analyzer based on regular expressions, which matches patterns in the Thai writing system. Because most software tools that use regular expressions are still based on the 7-bit ASCII character set, a mapping of Thai characters to ASCII characters is used.
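
The idea of mapping Thai characters to ASCII and then matching regular expressions on the mapped text can be illustrated with a minimal sketch. It is not the authors' implementation: the mapping table, the phoneme labels, the single syllable pattern, and the function names `to_ascii` and `transcribe` are all assumptions made for illustration. The sketch covers only five characters and ignores tones.

```python
import re

# Hypothetical Thai-to-ASCII mapping for a handful of characters only;
# the mapping actually used by the authors is not given in the abstract.
THAI_TO_ASCII = {
    "\u0e01": "k",  # ko kai
    "\u0e07": "G",  # ngo ngu
    "\u0e19": "n",  # no nu
    "\u0e21": "m",  # mo ma
    "\u0e32": "A",  # sara aa (long a)
}

# Illustrative phoneme labels for the mapped consonants (tones are ignored).
PHONEME = {"k": "k", "G": "ng", "n": "n", "m": "m"}

# One syllable shape: initial consonant, long vowel, optional final consonant.
# The negative lookahead keeps a final consonant from swallowing the
# initial consonant of a following syllable.
SYLLABLE = re.compile(r"([kGnm])A([kGnm](?!A))?")


def to_ascii(text):
    """Map each Thai character to its ASCII stand-in (KeyError if unmapped)."""
    return "".join(THAI_TO_ASCII[ch] for ch in text)


def transcribe(text):
    """Return a hyphen-separated phoneme string for the mapped syllables."""
    mapped = to_ascii(text)
    syllables = []
    pos = 0
    while pos < len(mapped):
        match = SYLLABLE.match(mapped, pos)
        if match is None:
            raise ValueError(f"no syllable pattern matches at position {pos}")
        initial, final = match.group(1), match.group(2)
        syllables.append(
            PHONEME[initial] + "aa" + (PHONEME[final] if final else "")
        )
        pos = match.end()
    return "-".join(syllables)


if __name__ == "__main__":
    print(transcribe("\u0e21\u0e32"))        # Thai word for "come"
    print(transcribe("\u0e07\u0e32\u0e19"))  # Thai word for "work"
```

A real lexical analyzer for Thai would need many more patterns, for example for vowels written before, above, or below the consonant, for consonant clusters, and for tone rules. The sketch only shows how matching on an ASCII stand-in lets ordinary regular-expression tools be used.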


Paper Citation


in Harvard Style

van Moergestel L. and Meyer J. (2012). GENERATING PHONEMES FROM WRITTEN THAI USING LEXICAL ANALYSIS BASED ON REGULAR EXPRESSIONS. In Proceedings of the 4th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-95-9, pages 306-311. DOI: 10.5220/0003735603060311


in Bibtex Style

@conference{icaart12,
author={Leo van Moergestel and John-Jules Meyer},
title={GENERATING PHONEMES FROM WRITTEN THAI USING LEXICAL ANALYSIS BASED ON REGULAR EXPRESSIONS},
booktitle={Proceedings of the 4th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2012},
pages={306-311},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003735603060311},
isbn={978-989-8425-95-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - GENERATING PHONEMES FROM WRITTEN THAI USING LEXICAL ANALYSIS BASED ON REGULAR EXPRESSIONS
SN - 978-989-8425-95-9
AU - van Moergestel L.
AU - Meyer J.
PY - 2012
SP - 306
EP - 311
DO - 10.5220/0003735603060311