Authors:
Sylvain Raybaud; Caroline Lavecchia; David Langlois and Kamel Smaïli
Affiliation:
LORIA, France
Keyword(s):
Confidence measure, Statistical machine translation, Mutual information, Linguistic features language model, n-gram language model.
Related Ontology Subjects/Areas/Topics:
Applications; Artificial Intelligence; Computational Intelligence; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Evolutionary Computing; Knowledge Discovery and Information Retrieval; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Machine Learning; Natural Language Processing; Pattern Recognition; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems
Abstract:
A confidence measure estimates the reliability of a hypothesis produced by a machine translation system. Confidence estimation can be viewed as a testing problem: we want to decide whether the most probable sequence of words produced by the machine translation system is correct or not. We describe several original word-level confidence measures for machine translation, based on mutual information, an n-gram language model, and a lexical features language model. We evaluate how well they perform individually and in combination, and show that a combination of confidence measures based on mutual information yields a classification error rate as low as 25.1% with an F-measure of 0.708.
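
As an illustration of the two evaluation figures quoted in the abstract, the following minimal sketch shows how a word-level confidence score could be turned into a correct/incorrect decision and then scored. It is not the authors' implementation. The function name `evaluate_confidence`, the fixed decision threshold, the toy scores, and the choice of "correct" as the positive class for the F-measure are all assumptions made for this example; the abstract does not specify them.

```python
def evaluate_confidence(scores, labels, threshold):
    """Score a thresholded word-level confidence measure.

    scores    -- one confidence value per translated word (hypothetical values)
    labels    -- True if the word is actually correct, False otherwise
    threshold -- words scoring at or above it are predicted correct (assumption)
    """
    predictions = [s >= threshold for s in scores]

    # Count outcomes, treating "correct" as the positive class (assumption).
    tp = sum(1 for p, l in zip(predictions, labels) if p and l)
    fp = sum(1 for p, l in zip(predictions, labels) if p and not l)
    fn = sum(1 for p, l in zip(predictions, labels) if not p and l)

    # Classification error rate: fraction of words that were misclassified.
    cer = (fp + fn) / len(labels)

    # F-measure: harmonic mean of precision and recall.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return cer, f_measure


# Toy data: four words, two of them actually correct.
cer, f_measure = evaluate_confidence(
    scores=[0.9, 0.8, 0.3, 0.6],
    labels=[True, False, False, True],
    threshold=0.5,
)
print(f"CER = {cer:.3f}, F-measure = {f_measure:.3f}")
```

In practice the threshold would be tuned on held-out data, and several confidence measures could be combined into a single score before thresholding.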