the strongest; by default, it means direct adjacency. Our lexicon contains morphemes
instead of words, and each morpheme can search for any other morpheme inside or
outside the word it belongs to. In our approach, the only difference between
morphology and syntax is that in the syntactic subsystem the program searches for
morphemes that stand in a certain grammatical relation but belong to two different
words. Because of this, however, morphological and syntactic rank parameters are
handled separately.
The lexicon is extendable in several ways. First, more words and morphemes can
be added. Second, new features can be added to the data structure, thus improving
the precision of the analysis. Third, the core lexicon can be extended by generating
new lexical elements via rules of generation, thus forming the extended lexicon. The core
lexicon contains all the basic properties of a morpheme including its default behavior
(e.g. the argument structure of a verb). This applies primarily to intonation: the core
lexicon stores a “stress” property (some words can never be stressed, while others
are stressed in a neutral sentence), but it can be overridden if the sentence is not
neutral. For instance, an entry with the property value “focus-stressed” is generated
automatically and inserted into the extended lexicon.
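The core/extended lexicon split can be sketched as follows. This is a minimal Python sketch (the actual system is written in Visual Prolog 7), and the field names, forms, and the `#focus` key suffix are hypothetical:

```python
# Core lexicon: default properties of each entry (toy data, hypothetical fields).
CORE_LEXICON = {
    "a":    {"category": "article", "stress": "never"},    # articles are never stressed
    "Mari": {"category": "noun",    "stress": "neutral"},  # stressed in a neutral sentence
}

def generate_focus_variants(core):
    """Rule of generation: for every stressable entry, derive a
    'focus-stressed' variant to be inserted into the extended lexicon."""
    extended = {}
    for form, entry in core.items():
        if entry["stress"] != "never":
            extended[form + "#focus"] = dict(entry, stress="focus-stressed")
    return extended

# The extended lexicon holds the core entries plus the generated variants.
EXTENDED_LEXICON = {**CORE_LEXICON, **generate_focus_variants(CORE_LEXICON)}
```

Note that entries marked as never stressable (such as the article) produce no focus-stressed variant, which is exactly the restriction exploited below to prune the search.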
Obviously, it is not efficient to try out every possible intonation scheme on a long
sentence. Two constraints, however, can be exploited: some morphemes (such as
articles) can never be stressed or focus-stressed, and the intonation of arguments
before and after the verb is constrained.
By default, the normal Hungarian argument position is after the verb. Some
arguments, however, prefer the verbal modifier or the topic position. These
preferences should be stored in the core lexicon along with the arguments, as
illustrated in (10) in Section 3. If a sentence does not fit the preferred schema,
heuristics are used to create the appropriate generator (11-12).
Prolog (a logic programming language), including Visual Prolog 7 (probably the
most elaborate version of Prolog, and the one we use), generally allows writing
predicates that can be invoked “backwards” (generation (12) vs. acceptance (14)).
We assume that the evaluation of predicates can be reversed; on this principle, the
future machine translator can be symmetrical. With the keyword anyflow, every
argument of a predicate can serve as both input and output, allowing even entire
programs to be executed “backwards”. The keywords procedure, determ,
multi, nondeterm etc. declare whether a predicate can fail (a procedure
must always succeed) and whether it has multiple backtrack points (nondeterm)
or not (determ).
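The idea of a reversible predicate can be imitated outside Prolog. Below is a rough Python sketch (not the paper's Visual Prolog code): a relational string concatenation in which any argument may be left unbound (`None`), loosely mimicking an `anyflow` predicate. Called "forwards" it generates a word from morphemes; called "backwards" it segments a word. The Hungarian forms are illustrative only:

```python
def concat_rel(x, y, z):
    """Relational concatenation: any of x, y, z may be None (unbound).
    Yields all (x, y, z) triples consistent with z == x + y."""
    if x is not None and y is not None:
        # Generation, flow (i,i,o): both parts known, build the whole.
        cand = x + y
        if z is None or z == cand:
            yield (x, y, cand)
    elif z is not None:
        # Acceptance, flow (o,o,i) etc.: whole known, enumerate splits
        # consistent with whatever parts are already bound.
        for i in range(len(z) + 1):
            a, b = z[:i], z[i:]
            if (x is None or x == a) and (y is None or y == b):
                yield (a, b, z)

# "Forward": generate a word from stem and suffix.
list(concat_rel("ház", "ban", None))     # → [('ház', 'ban', 'házban')]
# "Backward": recover the stem given the suffix and the word.
list(concat_rel(None, "ban", "házban"))  # → [('ház', 'ban', 'házban')]
```

In real Prolog this reversibility comes for free from unification; the sketch only shows why a single relation can serve both generation and acceptance, which is what makes a symmetrical translator conceivable.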
Let us turn to the particular phases of parsing (acceptance).
Phase 0. Before intonation etc. is taken into account, every word is segmented and
analyzed phonologically and morphologically, which determines the class of each
word. In practice, the last class-changing morpheme (derivational affix) supplies the
relevant “class” output feature; the input and output classes are stored in the core-
lexicon entry of every class-changing morpheme.
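Phase 0's class computation can be sketched as follows, again as a hedged Python illustration with toy lexicon entries (the field names `class_in`/`class_out` are hypothetical):

```python
# Toy morpheme lexicon: each entry stores its input and output word class.
MORPHEMES = {
    "szép": {"class_in": None,   "class_out": "adj"},   # 'beautiful' (stem)
    "ség":  {"class_in": "adj",  "class_out": "noun"},  # nominalizer: adj -> noun
    "ben":  {"class_in": "noun", "class_out": "noun"},  # inessive case, not class-changing
}

def word_class(segmentation):
    """Phase 0: the class of the word is the output class of the
    last class-changing morpheme in the segmentation."""
    cls = None
    for m in segmentation:
        entry = MORPHEMES[m]
        if entry["class_in"] not in (None, cls):
            return None  # ill-formed: input class does not match current class
        if entry["class_out"] != entry["class_in"]:
            cls = entry["class_out"]  # class-changing morpheme determines the class
    return cls

word_class(["szép", "ség"])         # → 'noun'  (szépség 'beauty')
word_class(["szép", "ség", "ben"])  # → 'noun'  (case suffix changes nothing)
```

The case suffix carries matching input and output classes, so it constrains what it may attach to without changing the class, while the derivational affix actually rewrites it.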
Phase 1. During the actual syntactic analysis, the arguments of a verb are searched
for with rank 7 (weak). This is a bidirectional search: every non-predicative nominal
expression also looks for its predicate, and if a match is appropriate, the result is
stored in memory. Adverbial adjuncts likewise search for the verb with rank 7, but
when the verb is found, the extending generator must be invoked to modify the
verb's default argument structure and insert the new form into the extended lexicon
(11a-b).
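The two sides of Phase 1 can be sketched in the same hedged Python style (all structures and field names here are simplified inventions, not the system's actual representation):

```python
RANK_WEAK = 7  # rank 7: a weak relation, so the search ignores adjacency

def find_arguments(verb_index, words):
    """Verb-side search: collect non-predicative nominal expressions
    anywhere in the clause (a weak rank allows non-adjacent matches)."""
    return [w for i, w in enumerate(words)
            if i != verb_index and w["cat"] == "noun"]

def extend_argument_frame(verb_entry, adjunct, extended_lexicon):
    """Extending generator: when an adverbial adjunct finds the verb,
    a variant with the enlarged argument frame is inserted into the
    extended lexicon (cf. (11a-b))."""
    variant = dict(verb_entry, args=verb_entry["args"] + [adjunct["cat"]])
    extended_lexicon.append(variant)
    return variant
```

The key point is that the adjunct does not match an existing slot of the verb; instead it triggers the generation of a new lexical variant whose frame accommodates it, keeping the core lexicon untouched.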