with more than 2000 classes and from the
syntactic-semantic dictionary. Since a dictionary word
is generally represented as an n-ary function with
semantic constraints, its final semantics in a
sentence depends on the exact words in its context
within that sentence. Tuzov's syntactic-semantic
dictionary contains more than 150,000 semantic
formulas.
4 TRANSLATIONAL DICTIONARY
To create an MT system based on the functional
theory of Tuzov (2004), we need to translate the
semantic dictionary into the target language. To
automate this process we chose the method
described in Section 2. As the parallel corpus we
used the list of parallel Russian and English
sentences from the UMC package (Klyueva and
Bojar, 2008). GIZA++ generated 1.3 million phrase
pairs, including duplicates. By applying the semantic
analyzer to each Russian sentence, we obtained the
semantic alternatives of each word in the built pairs
that correspond to its local context. As a result,
each word in the original translational phrase
dictionary was substituted with its semantic formula,
which resolves ambiguity at the semantic level.
After removing duplicates from the modified
dictionary, we obtained the final version of the
semantic-level contextual translational dictionary,
with about 18,000 word pairs. The dictionary is
subject to further clean-up and enrichment. Here is
an extract from the final dictionary:
В Y1>HabU(Y1:,ПРЕД:Z1) \\ <149>--->Within
В Y1>Loc(Y1:,ВНУТРИ$12/313/05(ПРЕД:Z1)) \\ <146>--->at
В Y1>Loc(Y1:,Oper01(#,ПРЕД:Z1)) \\ <208>--->In
В Y1>Loc(Y1:,ПРЕД:Z1) \\ <224>--->Throughout
...
НА Y1>Direkt(Y1:,ВЕРХ$12/141/05(ВИН:Z1)) \\ <67>--->at
НА Y1>Direkt(Y1:,РОД:Z1) \\ <100>--->on
НА Y1>Direkt(Y1:,РОД:Z1) \\ <69>--->for
...
ОБРАЗ (РОД:Z1) \\ <2>--->a way
ОБЩЕМИРОВОЙ A1>Rel(A1:НЕЧТО$1,ПОЛНЫЙ$12/207/05(МИР$1227)) \\ <1>--->global
...
Each dictionary entry contains the semantic formula
corresponding to the original Russian word and its
English analogue. One important property of the
dictionary is that its entries are context dependent.
This follows from two circumstances: 1) each
Russian sentence has an expert translation into
English, and 2) each Russian word has been
assigned a semantic formula that results from the
semantic assembling of the corresponding
Russian sentence.
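The substitution and duplicate-removal steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes single-word alignments rather than full GIZA++ phrase pairs, and the names build_semantic_dictionary, word_pairs and formula_of, as well as the sentence identifiers, are hypothetical. The formulas are taken from the extract above.

```python
from collections import Counter

def build_semantic_dictionary(word_pairs, formula_of):
    """Replace each aligned Russian word by its context-specific formula.

    word_pairs: iterable of (sentence_id, russian_word, english_text).
    formula_of: mapping (sentence_id, russian_word) -> semantic formula
                chosen by the semantic analyzer for that sentence
                (hypothetical interface, assumed for this sketch).
    Returns a Counter keyed by (russian_word, formula, english_text), so
    duplicates collapse into one entry with an occurrence count.
    """
    counts = Counter()
    for sentence_id, ru_word, en_text in word_pairs:
        formula = formula_of.get((sentence_id, ru_word))
        if formula is None:
            continue  # the analyzer produced no formula for this word
        counts[(ru_word, formula, en_text)] += 1
    return counts

# Illustrative input: the same preposition aligned in three sentences.
word_pairs = [
    (1, "В", "Within"),
    (2, "В", "Within"),
    (3, "В", "at"),
]
formula_of = {
    (1, "В"): "Y1>HabU(Y1:,ПРЕД:Z1)",
    (2, "В"): "Y1>HabU(Y1:,ПРЕД:Z1)",
    (3, "В"): "Y1>Loc(Y1:,ВНУТРИ$12/313/05(ПРЕД:Z1))",
}

dictionary = build_semantic_dictionary(word_pairs, formula_of)
for (ru_word, formula, en_text), n in dictionary.items():
    print(ru_word, formula, "--->", en_text, f"(seen {n}x)")
```

Keying the entries on the formula rather than on the surface word is what makes the two readings of the same preposition end up as separate dictionary entries.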
Consider one entry of the above extract in more
detail:
В Y1>HabU(Y1:,ПРЕД:Z1) \\ <149>--->Within
The semantic formula to the left of the ---> sign has
several components: the word “В” (a Russian
preposition with many meanings, roughly
corresponding to the English prepositions in, at,
within, into, to, of, etc.); the basis function HabU(x,y)
with its arguments (the function states that x
possesses y), which in the case of HabU are Y1 and
Z1 (prepositional case); and the \\ sign followed by
the order number of the semantic alternative in the
syntactic-semantic dictionary.
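To make the entry layout concrete, the sketch below splits one line of the extract into the components just listed. It is an illustration only: the pattern is inferred from the printed entries rather than from a format specification, and the names Entry, parse_entry and basis_function are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    word: str          # Russian headword, e.g. "В"
    formula: str       # semantic formula, e.g. "Y1>HabU(Y1:,ПРЕД:Z1)"
    alternative: int   # order number of the semantic alternative
    translation: str   # English analogue

# Layout assumed from the extract: WORD FORMULA \\ <N>--->TRANSLATION
ENTRY_RE = re.compile(r"""
    ^(?P<word>\S+)\s+            # headword
    (?P<formula>.+?)\s*          # formula, up to the separator
    \\\\\s*<(?P<alt>\d+)>        # two backslashes, then the number
    -+>\s*(?P<translation>.+)$   # arrow and English text
""", re.VERBOSE)

def parse_entry(line: str) -> Entry:
    """Split one dictionary line into its components."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    return Entry(m["word"], m["formula"], int(m["alt"]), m["translation"])

def basis_function(formula: str) -> Optional[str]:
    """Return the function name after '>' (e.g. 'HabU'), if present."""
    m = re.search(r">(\w+)\(", formula)
    return m.group(1) if m else None

entry = parse_entry(r"В Y1>HabU(Y1:,ПРЕД:Z1) \\ <149>--->Within")
print(entry)
print(basis_function(entry.formula))
```

Entries without a basis function, such as the one for “ОБРАЗ”, still parse; basis_function simply returns None for them.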
Another example:
Y1>Loc(Y1:,ВНУТРИ$12/313/05(ПРЕД:Z1)) \\ <146>--->at
The second argument of the Loc(x,y) basis function
(which states that x is located in or at y) is itself a
function-preposition that takes one argument in the
prepositional case. This second argument has the
name “ВНУТРИ”, to which the ontological class
number $12/313/05 is appended. In this class
number, 12 refers to physical objects, 313 refers to
an inhabited locality, and 05 refers to the physical
position of an object in the prepositional case, which
is expressed as an adverb in Russian (“ВНУТРИ”
answers both “where?” and “how?”).
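The class-number notation can be illustrated by a small sketch that extracts the name and the three parts of the number from a formula. It assumes the three-part form $a/b/c seen in the example; shorter numbers such as $1 or $1227 from the extract are deliberately not handled. The names ClassRef and find_class_ref and the field names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class ClassRef(NamedTuple):
    name: str   # word carrying the class number, e.g. "ВНУТРИ"
    first: str  # "12" in the example (physical objects)
    second: str # "313" in the example (inhabited locality)
    third: str  # "05" in the example; kept as text to preserve the zero

# A name followed by "$" and three slash-separated numbers.
CLASS_RE = re.compile(r"(\w+)\$(\d+)/(\d+)/(\d+)")

def find_class_ref(formula: str) -> Optional[ClassRef]:
    """Return the first three-part class reference in a formula, if any."""
    m = CLASS_RE.search(formula)
    return ClassRef(*m.groups()) if m else None

ref = find_class_ref("Y1>Loc(Y1:,ВНУТРИ$12/313/05(ПРЕД:Z1))")
print(ref)
```

The parts are kept as strings because the leading zero in 05 would be lost in an integer conversion.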
5 EXPERIMENTAL MT SYSTEM
The semantic-level translational dictionary obtained
in Section 4 forms the basis for the experimental
Russian-to-English MT system. To achieve fluency
on the target-language side and to reduce the noise
in the automatically generated semantic-level
translation dictionary, we have devised the Semantic
Machine Translation Model (SMTM) for translating
a sentence P into the target language L2:
METHOD FOR AN AUTOMATIC GENERATION OF A SEMANTIC-LEVEL CONTEXTUAL TRANSLATIONAL
DICTIONARY