NEW CONFIDENCE MEASURES FOR STATISTICAL MACHINE TRANSLATION

Sylvain Raybaud, Caroline Lavecchia, David Langlois, Kamel Smaïli

Abstract

A confidence measure estimates the reliability of a hypothesis produced by a machine translation system. Confidence estimation can be seen as a testing process: we want to decide whether the most probable sequence of words provided by the machine translation system is correct or not. In the following we describe several original word-level confidence measures for machine translation, based on mutual information, an n-gram language model and a lexical features language model. We evaluate how well they perform individually and in combination, and show that a combination of confidence measures based on mutual information yields a classification error rate as low as 25.1% with an F-measure of 0.708.
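As an illustration of the evaluation described in the abstract, the sketch below thresholds word-level confidence scores and computes a classification error rate and an F-measure. It is a minimal example under stated assumptions, not the authors' implementation: the function name `evaluate_threshold`, the choice of "correct" as the positive class, the threshold value and the toy scores are all hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def evaluate_threshold(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (classification error rate, F-measure) for one threshold.

    Assumptions (not taken from the paper): a word is predicted correct
    when its confidence score is >= threshold, labels[i] is True when
    word i is actually correct, and "correct" is the positive class.
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equally long")

    predictions = [s >= threshold for s in scores]
    pairs = list(zip(predictions, labels))

    true_pos = sum(1 for p, y in pairs if p and y)
    false_pos = sum(1 for p, y in pairs if p and not y)
    false_neg = sum(1 for p, y in pairs if not p and y)

    # Classification error rate: fraction of words whose predicted
    # label differs from the reference label.
    cer = sum(1 for p, y in pairs if p != y) / len(pairs)

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return cer, f_measure


# Made-up toy data for illustration only.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.7]
toy_labels = [True, True, False, True, False]
cer, f_measure = evaluate_threshold(toy_scores, toy_labels, threshold=0.5)
print(f"CER = {cer:.3f}, F-measure = {f_measure:.3f}")
```

In practice the threshold would be tuned on held-out data, and several confidence scores could be combined into one before thresholding.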


Paper Citation


in Harvard Style

Raybaud S., Lavecchia C., Langlois D. and Smaïli K. (2009). NEW CONFIDENCE MEASURES FOR STATISTICAL MACHINE TRANSLATION. In Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8111-66-1, pages 61-68. DOI: 10.5220/0001660600610068


in Bibtex Style

@conference{icaart09,
author={Sylvain Raybaud and Caroline Lavecchia and David Langlois and Kamel Smaïli},
title={NEW CONFIDENCE MEASURES FOR STATISTICAL MACHINE TRANSLATION},
booktitle={Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2009},
pages={61-68},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001660600610068},
isbn={978-989-8111-66-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - NEW CONFIDENCE MEASURES FOR STATISTICAL MACHINE TRANSLATION
SN - 978-989-8111-66-1
AU - Raybaud S.
AU - Lavecchia C.
AU - Langlois D.
AU - Smaïli K.
PY - 2009
SP - 61
EP - 68
DO - 10.5220/0001660600610068