Authors:
Aleksei Dobrov 1; Anastasia Dobrova 2; Maria Smirnova 1 and Nikolay Soms 2
Affiliations:
1 Saint-Petersburg State University, Saint-Petersburg, Russia; 2 LLC “AIIRE”, Saint-Petersburg, Russia
Keyword(s):
Tibetan Language, Compounds, Computer Ontology, Tibetan Corpus, Natural Language Processing, Corpus Linguistics, Immediate Constituents.
Related Ontology Subjects/Areas/Topics:
Applications; Artificial Intelligence; Data Engineering; Enterprise Information Systems; Information Systems Analysis and Specification; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Natural Language Processing; Ontologies and the Semantic Web; Ontology Engineering; Pattern Recognition; Symbolic Systems
Abstract:
This article provides a consistent formal grammatical and ontological description of a model of the Tibetan compound system, developed and used for automatic syntactic and semantic analysis of Tibetan texts and based on a hand-verified corpus. The model covers all types of Tibetan compounds previously introduced by other authors and introduces a number of new classes of compounds, taking into account their derivation, structure and semantics. The article describes the tools used for ontological modeling of Tibetan compounds; special attention is paid to the problem of modeling the semantics of verbs and verbal compounds. Nominal and verbal compounds are considered separately; verbal compounds are noted to be no less important to the Tibetan language system than nominal compounds. Statistical data are given on the absolute frequency distribution of compounds of different types in the current version of the corpus annotation and on the number of ontology concepts associated with each class of compounds.