PRE-PROCESSING TASKS FOR RULE-BASED ENGLISH-KOREAN MACHINE TRANSLATION SYSTEM

Sung-Dong Kim

Abstract

This paper presents the pre-processing tasks needed for practical English-Korean machine translation. Each pre-processing task consists of a problem that requires pre-processing and a solution to that problem. English and Korean differ in many ways, and these differences are difficult to resolve with parsing and transfer rules. In addition, source sentences often include non-word elements such as parentheses, quotation marks, and list markers. To resolve the differences efficiently and to arrange source sentences into a form suitable for the translation system, we propose pre-processing the source sentences. We study various pre-processing tasks and classify them into several groups according to when they are performed in an English-Korean machine translation system. In experiments, we show that the defined pre-processing tasks are useful for generating better translation results.
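As a rough illustration of what handling non-word elements could look like, the following Python sketch separates list markers and parenthetical text from a source sentence and normalizes typographic quotation marks before the sentence would be passed to a parser. The function name `preprocess` and all patterns are hypothetical choices made for this illustration; the abstract does not specify the rules actually used in the system.

```python
import re

# Hypothetical patterns for illustration only; not the paper's actual rules.
# A leading list marker such as "1.", "(2)", "a)", "-", "*" or a bullet.
LIST_MARKER = re.compile(r"^\s*(?:\(?\d+[.)]|[a-zA-Z][.)]|[-*\u2022])\s+")
# A non-nested parenthetical, together with any whitespace before it.
PAREN = re.compile(r"\s*\(([^()]*)\)")
# Map typographic quotation marks to plain ASCII quotes.
QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})


def preprocess(sentence):
    """Return (cleaned sentence, list marker or None, parenthetical texts)."""
    text = sentence.translate(QUOTES)

    # Detach a leading list marker so the parser sees only the sentence body.
    marker = None
    match = LIST_MARKER.match(text)
    if match:
        marker = match.group(0).strip()
        text = text[match.end():]

    # Pull out parenthetical asides; they could be translated separately.
    asides = PAREN.findall(text)
    text = PAREN.sub("", text)

    # Collapse any whitespace left behind by the removals.
    text = re.sub(r"\s+", " ", text).strip()
    return text, marker, asides
```

Returning the marker and the parenthetical texts alongside the cleaned sentence, rather than discarding them, would let a later step reattach them to the translated output; whether the described system works this way is an assumption of this sketch.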


Paper Citation


in Harvard Style

Kim S. (2011). PRE-PROCESSING TASKS FOR RULE-BASED ENGLISH-KOREAN MACHINE TRANSLATION SYSTEM. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-40-9, pages 257-262. DOI: 10.5220/0003151702570262


in Bibtex Style

@conference{icaart11,
author={Sung-Dong Kim},
title={PRE-PROCESSING TASKS FOR RULE-BASED ENGLISH-KOREAN MACHINE TRANSLATION SYSTEM},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2011},
pages={257-262},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003151702570262},
isbn={978-989-8425-40-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - PRE-PROCESSING TASKS FOR RULE-BASED ENGLISH-KOREAN MACHINE TRANSLATION SYSTEM
SN - 978-989-8425-40-9
AU - Kim S.
PY - 2011
SP - 257
EP - 262
DO - 10.5220/0003151702570262