In contrast, the hierarchical phrase translation model [7, 8], a promising formal
syntax approach, needs no manual annotation and is not affected by parsing precision:
it uses a single nonterminal X to represent all possible syntactic structures of
a sentence. The nonterminal X is a formal variable generalized from the phrases extracted
by a phrase-based SMT model, and it may express non-linguistic structures that are useful
for translation between two languages. In effect, the set of structures expressed by
the hierarchical phrase model is an SMT-style syntactic formalism. For its phrase
structures, this formalism carries frequency information rather than linguistic
information. Studying the respective effects of syntactic structures and phrases on
translation quality will benefit from a close exploration of this formalism.
The scarcity of research on the relation between the details of syntactic structures and
translation quality is partly due to the lack of automatic translation evaluation at the
sentence level. BLEU [9], the most popular method for automatic evaluation of machine
translation systems, does not provide a sentence-level evaluation that identifies which
sentence is better or worse; it only gives an overall evaluation of the translation quality
of a system. Many efforts have been made in recent years to improve approaches to automatic
evaluation of machine translation [10]. Some work automatically generates not only a
quality score for a translated sentence but also check-points for diagnostic
evaluation [11]. We think that displaying and analyzing the syntactic structures of
translation output is one alternative for diagnostic evaluation of SMT system performance,
since it can reveal the reasons for translation errors.
2 Hierarchical Phrase-based Translation (HPBT) Model [7, 8]
Formally, the HPBT model is a weighted synchronous context-free grammar that is learned
from a parallel text without any syntactic annotations. Rules have the form
$X \Rightarrow \langle \bar{e} \,\|\, \bar{f} \rangle$, where $\bar{e}$ and $\bar{f}$ are
phrases consisting of terminal words and the nonterminal symbol X, which represents phrases
hierarchically. The HPBT model is thus a generalization of the conventional phrase-based
translation model, which does not allow hierarchical phrases. Briefly, decoding with the
HPBT model is a CKY-style parsing process: given a French sentence f, it finds the English
yield of the single best derivation that has French yield f.
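To make the decoding idea concrete, the following is a minimal sketch of CKY-style decoding with a toy synchronous grammar. It is an illustration only and not the decoder of [7, 8]: the rule table, the weights, the placeholder names X1 and X2, and the functions match and decode are all hypothetical, there is no language model or pruning, and a derivation is scored simply by the product of its rule weights.

```python
# Toy CKY-style decoder for a synchronous grammar with one nonterminal X.
# Everything here (rules, weights, names) is invented for illustration.

X1, X2 = "X1", "X2"  # indexed occurrences of the single nonterminal X

# Each rule: (French side, English side, weight).
RULES = [
    (("je",), ("i",), 1.0),
    (("mange",), ("eat",), 1.0),
    (("ne", X1, "pas"), ("do", "not", X1), 0.8),  # hierarchical phrase
    ((X1, X2), (X1, X2), 0.5),                    # glue-like concatenation
]


def match(src, i, j, words, chart):
    """Yield (score, bindings) for each way `src` can cover words[i:j].

    `bindings` maps each nonterminal in `src` to the English words of the
    best derivation already stored in `chart` for the sub-span it covers.
    """
    if not src:
        if i == j:
            yield 1.0, {}
        return
    head, rest = src[0], src[1:]
    if head in (X1, X2):
        # The nonterminal covers words[i:k]; every remaining symbol needs
        # at least one word, which bounds k from above.
        for k in range(i + 1, j - len(rest) + 1):
            sub = chart.get((i, k))
            if sub is None:
                continue
            sub_score, sub_english = sub
            for rest_score, rest_bind in match(rest, k, j, words, chart):
                yield sub_score * rest_score, {head: sub_english, **rest_bind}
    elif i < j and words[i] == head:
        yield from match(rest, i + 1, j, words, chart)


def decode(words):
    """Return (score, English string) of the best derivation, or None."""
    n = len(words)
    chart = {}  # (i, j) -> (best score, English words as a tuple)
    for length in range(1, n + 1):          # shorter spans are filled first
        for i in range(n - length + 1):
            j = i + length
            best = None
            for src, tgt, weight in RULES:
                for score, bind in match(src, i, j, words, chart):
                    english = []
                    for sym in tgt:
                        if sym in bind:
                            english.extend(bind[sym])
                        else:
                            english.append(sym)
                    cand = (weight * score, tuple(english))
                    if best is None or cand[0] > best[0]:
                        best = cand
            if best is not None:
                chart[(i, j)] = best
    result = chart.get((0, n))
    return None if result is None else (result[0], " ".join(result[1]))


if __name__ == "__main__":
    score, english = decode("je ne mange pas".split())
    print(english)  # prints "i do not eat"
```

In this sketch the span "ne mange pas" is covered by the hierarchical rule with X1 bound to the translation of "mange", and the glue-like rule then joins it with the translation of "je". A chart filled bottom-up over spans is what makes the process CKY-style. A real system would add features such as a language model and would prune the chart.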
3 The Framework of Our Study
In our work, we try to tackle the following problems:
– What is the decoding process of hierarchical phrase translation? How does the process
affect the output of the decoder (the translation system)?
– What differences are there between hierarchical phrase structures and linguistic
phrase structures, especially the frequent phrase structures used in the decoding
process?
– Do these differences cause mistakes in the translation output? If so, what are
the key positions of those mistakes in a translated sentence?