Formal Grammatical and Ontological Modeling of Corpus Data on Tibetan Compounds

Aleksei Dobrov, Anastasia Dobrova, Maria Smirnova, Nikolay Soms


This article provides a consistent formal grammatical and ontological description of a model of the Tibetan compound system, developed and used for automatic syntactic and semantic analysis of Tibetan texts and drawing on a hand-verified corpus. The model covers all types of Tibetan compounds previously introduced by other authors and introduces a number of new classes of compounds, taking into account their derivation, structure, and semantics. The article describes the tools used for ontological modeling of Tibetan compounds; special attention is paid to the problem of modeling the semantics of verbs and verbal compounds. Nominal and verbal compounds are considered separately, and it is noted that verbal compounds are no less important for the Tibetan language system than nominal compounds. Statistical data are given on the absolute frequency distribution of compounds of different types in the current version of the corpus annotation and on the number of ontology concepts associated with each class of compounds.


