(Hackett, 2005, p. 6). By inchoative usages he
understands the occurrence of transitive verbs in passive
constructions, for example spro ‘to elaborate upon,’
but spro ba ‘elaborated (topic).’
An original semantic classification is provided
by B. Zeisler in “Relative Tense and Aspectual
Values in Tibetan Languages”. She divides all
Tibetan verbs into two groups: control action verbs
(which she also links to the concept of the Tibetan
traditional grammar rang-dbang-can gyi bya-tshig
‘self-powered action words’) and accidental event
verbs (or gzhan-dbang-can gyi bya-tshig
‘other-powered action words’) (Zeisler, 2004, p. 250). She
further divides these categories into subgroups
according to the dynamism, durativity and telicity of
the verbs.
In his classification, N. W. Hill rejects the notion
of transitivity for Tibetan verbs altogether: by his
reasoning, “accusative case has no meaning in
Tibetan”, and the category of transitivity itself is not
sufficiently separated from valence, rection and
volition (Hill, 2010, p. xxii). Thus volition, or
control of the action by the agent, becomes one of
the major verbal categories in his system of
classification. The volition of a verb is deduced
from the presence or absence of an imperative stem.
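The stem-based criterion can be illustrated with a minimal sketch. The record type `VerbEntry` and the function `is_volitional` are hypothetical names introduced here for illustration and are not taken from Hill's lexicon; the second entry is a made-up placeholder rather than an attested verb.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VerbEntry:
    """Hypothetical lexicon record; the field names are illustrative only."""
    present: str
    past: Optional[str] = None
    future: Optional[str] = None
    imperative: Optional[str] = None  # None: no imperative stem is recorded


def is_volitional(entry: VerbEntry) -> bool:
    """Treat a verb as volitional if an imperative stem is recorded for it."""
    return entry.imperative is not None


# byed 'to do' with its four stems.
do = VerbEntry(present="byed", past="byas", future="bya", imperative="byos")
# Placeholder entry without an imperative stem (not an attested verb).
placeholder = VerbEntry(present="stem_x", past="stem_y")

print(is_volitional(do))           # True
print(is_volitional(placeholder))  # False
```

Under this criterion the classification depends entirely on how completely the stems are recorded in the lexicon.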
For the current study we focus mainly on the
category of transitivity, a complex graded
phenomenon that is manifested grammatically in
the Tibetan language in the form of verb valencies.
Volition yielded no significant data for the current
research, other than the difference in the number of
verb stems, but it may become an interesting point
for later analysis.
Although the current research was influenced by
some of the ideas mentioned above, we could not use
any of the original classifications because of the
limited number of verbs they describe and their
structural inconsistencies. Most importantly, the
main purpose of the present study has been to create
a practice-oriented model that is based on corpus
data and works as part of Tibetan natural
language processing (NLP).
We consider semantic analysis to be an essential
part of Tibetan NLP due to the ambiguity of both the
segmentation of Tibetan texts into morphemes (since
there are no word delimiters between word forms in
Tibetan writing) and the syntactic parsing. To
resolve the problem of morphosyntactic ambiguity, a
computer-based linguistic ontology was developed.
In our project, the term “linguistic ontology” is
understood as a consistent classification of concepts
and relations between them that unite the meanings
of Tibetan linguistic units, including morphemes and
idiomatic morphemic complexes (Dobrov et al.,
2018-1, p. 340).
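A minimal sketch of such a structure is given below, assuming that concepts are linked by a single is-a relation and that each linguistic unit is mapped to the concepts it can express. The class names, the concept names and the hierarchy are illustrative assumptions and do not reproduce the ontology described in this paper; only the gloss of spro is taken from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    name: str
    parent: Optional[str] = None  # is-a link to a more general concept


@dataclass
class Ontology:
    concepts: Dict[str, Concept] = field(default_factory=dict)
    # linguistic unit (morpheme or idiomatic complex) -> concepts it expresses
    expressions: Dict[str, List[str]] = field(default_factory=dict)

    def add_concept(self, name: str, parent: Optional[str] = None) -> None:
        self.concepts[name] = Concept(name, parent)

    def add_expression(self, unit: str, concept: str) -> None:
        self.expressions.setdefault(unit, []).append(concept)

    def is_a(self, name: str, ancestor: str) -> bool:
        """Walk the is-a chain upwards from `name` looking for `ancestor`."""
        current: Optional[str] = name
        while current is not None:
            if current == ancestor:
                return True
            current = self.concepts[current].parent
        return False

    def meanings(self, unit: str) -> List[str]:
        return self.expressions.get(unit, [])


ont = Ontology()
ont.add_concept("entity")
ont.add_concept("action", parent="entity")          # assumed hierarchy
ont.add_concept("to_elaborate_upon", parent="action")
ont.add_expression("spro", "to_elaborate_upon")

print(ont.meanings("spro"))                       # ['to_elaborate_upon']
print(ont.is_a("to_elaborate_upon", "action"))    # True
```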
In the first generations of natural language
understanding systems (NLU systems), ontologies
were used as semantic dictionaries. In the early
1990s, several scholars already used the term
“ontology” in the most general sense, which allowed
linguistic thesauri to be considered as types of
ontologies. The WordNet computer thesaurus has
come to be called an “ontology,” and this trend has
only been growing in the majority of modern works.
Thesauri, including WordNet, reflect more or
less specified semantic relations between lexical
units (words): synonymy, hyponymy, hypernymy,
antonymy, meronymy, holonymy, logical
entailment, the relation of an adjective to a noun,
etc. (for more information see Miller, 1995;
Fellbaum, 1998). These relations can be used to
perform lexical disambiguation. Unfortunately, these
relations alone are not enough to solve the problem
of lexical or morphosyntactic ambiguity, especially
in Tibetan, since they do not reflect semantic
valencies (Dobrov, 2014, p. 114).
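The difference can be shown with a toy example. A hypernymy hierarchy alone only states which class a concept belongs to; choosing between two readings of an ambiguous verb form additionally requires per-reading restrictions on the classes of its arguments. All concept names, reading names and restrictions below are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, List, Optional

# Toy is-a hierarchy (hypothetical class names).
PARENT: Dict[str, Optional[str]] = {
    "person": "animate",
    "animate": "entity",
    "topic": "abstract",
    "abstract": "entity",
    "entity": None,
}


def is_a(concept: str, ancestor: str) -> bool:
    """Return True if `ancestor` is `concept` itself or one of its hypernyms."""
    current: Optional[str] = concept
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


# Two hypothetical readings of one verb form, each restricting the
# semantic class of the argument that fills a given role.
READINGS: Dict[str, Dict[str, str]] = {
    "reading_1": {"object": "abstract"},
    "reading_2": {"object": "animate"},
}


def compatible_readings(readings: Dict[str, Dict[str, str]],
                        arguments: Dict[str, str]) -> List[str]:
    """Keep the readings whose every valency is filled by a fitting argument."""
    return [
        name
        for name, valencies in readings.items()
        if all(
            role in arguments and is_a(arguments[role], required)
            for role, required in valencies.items()
        )
    ]


print(compatible_readings(READINGS, {"object": "topic"}))   # ['reading_1']
print(compatible_readings(READINGS, {"object": "person"}))  # ['reading_2']
```

Without the `READINGS` table, the hierarchy in `PARENT` gives no basis for preferring one reading over the other.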
The FrameNet database initiated by Charles J.
Fillmore covers most of the English verbal vocabulary
(“FrameNet,” 2020). The verb lexicon VerbNet
(“VerbNet,” 2020) also contains syntactic
descriptions and semantic restrictions for English
verbs. Neither of them, however, can be considered
a linguistic ontology. Moreover, these resources do
not model the meanings of nouns in relation to
the semantic classes created to describe verbal
valencies. The format for presenting information in
both resources is not universal and cannot be used to
model the meanings of lexical units belonging to other
parts of speech.
PropBank (“PropBank,” 2020) is another verb-
oriented resource that also remains close to the
syntactic level. Although it contains manually
created semantic role annotations, it cannot
be used directly to perform semantic analysis.
There are few other resources in the world,
mainly for English and a few other widely used
languages, that could be classified as linguistic
ontologies and could in principle be used for the
semantic interpretation of syntactic structures,
such as SUMO (Dobrov, 2014, p. 149)
and OpenCyc (Matuszek et al., 2006). Both
ontologies are universal and provide detailed
classifications of the concepts behind lexical meanings;
however, neither is in any way oriented to verbs,
nor does either claim to contain all the
information about verb
valencies necessary for resolving ambiguity.