Unsupervised Evaluation of Human Translation Quality

Yi Zhou, Danushka Bollegala


Even though machine translation (MT) systems have achieved impressive performance on cross-lingual translation tasks, the quality of MT output still falls far behind that of professional human translations (HTs), owing to the complexity of natural language, especially domain-specific terminology. Therefore, HTs are still in wide demand in practice. However, the quality of HT is also imperfect and varies significantly with the experience and knowledge of the translator. Evaluating the quality of HT automatically poses many challenges. Although bilingual speakers can assess translation quality, manually checking the accuracy of translations is expensive and time-consuming. In this paper, we propose an unsupervised method to evaluate the quality of HT without requiring any labelled data. We compare a range of methods for automatically grading HTs and observe that the Bidirectional Minimum Word Mover's Distance (BiMWMD) produces gradings that correlate well with human judgements.
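The BiMWMD named above is a variant of Word Mover's Distance. As an illustrative sketch only (not the authors' implementation), a bidirectional minimum-matching distance between two sentences can be computed by matching each word vector in one sentence to its nearest vector in the other, averaging the costs, and symmetrising over both directions; the toy vectors below stand in for real word embeddings:

```python
import numpy as np

def min_wmd(src, tgt):
    # Pairwise Euclidean distances between all source and target word vectors.
    dists = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=-1)
    # Each source word is matched to its cheapest target word; average the costs.
    return dists.min(axis=1).mean()

def bi_min_wmd(src, tgt):
    # Bidirectional: average the minimum-matching cost in both directions,
    # so the score is symmetric in the two sentences.
    return 0.5 * (min_wmd(src, tgt) + min_wmd(tgt, src))

# Toy 3-dimensional "embeddings" for two short sentences (hypothetical values).
sent_a = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
sent_b = np.array([[1.0, 0.0, 0.0], [0.0, 0.9, 0.1]])

score = bi_min_wmd(sent_a, sent_b)  # lower score = closer sentences
```

A lower score indicates that the translation's word vectors lie closer to the source's, which is the intuition behind using such a distance as an unsupervised quality grade; identical sentences score exactly zero.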
