contains both interlanguage links and links to PWN.
Then we described the automatic extension technique
used to increase the coverage of Latvian WordNet. This
technique attempts to align the Latvian dictionary
Tēzaurs with PWN in three steps: first, translating
senses into English; second, constructing a vector
representation for each Latvian sense and PWN synset
using word embeddings and sBERT; and, third, scoring
candidate links and filtering them using a threshold.
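The scoring and filtering step can be illustrated with a minimal sketch. It assumes that each Latvian sense and each PWN synset has already been turned into a vector, and that links are scored with cosine similarity; this summary does not name the scoring function, so cosine similarity is an assumption here. The identifiers, the toy vectors and the threshold value are hypothetical and serve only as illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_senses(sense_vectors, synset_vectors, threshold):
    """Score every sense-synset pair and keep those at or above the threshold."""
    links = []
    for sense_id, sense_vec in sense_vectors.items():
        for synset_id, synset_vec in synset_vectors.items():
            score = cosine(sense_vec, synset_vec)
            if score >= threshold:
                links.append((sense_id, synset_id, score))
    # Highest-scoring links first.
    return sorted(links, key=lambda link: link[2], reverse=True)


# Hypothetical toy vectors; real vectors would come from word embeddings or sBERT.
sense_vectors = {"lv:suns_1": [0.9, 0.1, 0.0]}
synset_vectors = {
    "pwn:dog.n.01": [1.0, 0.0, 0.0],
    "pwn:car.n.01": [0.0, 0.0, 1.0],
}

# The threshold 0.8 is an arbitrary illustrative value.
links = link_senses(sense_vectors, synset_vectors, threshold=0.8)
```

Raising the threshold in such a setup keeps fewer links, which would normally favour precision over coverage.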
The results were evaluated in terms of precision
and coverage by, first, comparing them to core Lat-
vian WordNet and, second, manually checking a sam-
ple of 400 senses. Ultimately we found 57 927 new
sense-synset pairs with precision of 80% for nouns,
56% for verbs, 67% for adjectives and 66% for ad-
verbs.
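As a rough illustration of the two measures, the sketch below computes precision from a list of manual judgements and coverage as the share of dictionary senses that received at least one link. Both definitions are assumptions made for this example, since this summary does not spell them out, and the toy inputs are hypothetical rather than taken from the evaluation.

```python
def precision(judgements):
    """Share of manually checked sense-synset pairs that were judged correct."""
    if not judgements:
        return 0.0
    return sum(1 for correct in judgements if correct) / len(judgements)


def coverage(linked_senses, all_senses):
    """Share of senses in the dictionary that received at least one link."""
    all_set = set(all_senses)
    if not all_set:
        return 0.0
    return len(set(linked_senses) & all_set) / len(all_set)


# Hypothetical toy inputs, not data from the paper.
sample_judgements = [True, True, False, True]
sample_precision = precision(sample_judgements)
sample_coverage = coverage({"s1", "s2"}, {"s1", "s2", "s3", "s4"})
```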
The focus of this paper was on how to extract in-
formation from other wordnets, namely, PWN. How-
ever, as mentioned in Section 2, it is also possible
to extract interlanguage link information from a re-
source in the same language, if such a resource exists.
Tēzaurs sense glosses contain some textual information
about word formation and synonyms. However, this data
has not yet been processed and its existing errors have
not been fixed, so we have chosen to exclude it for now.
It remains a promising direction for future research.
We have shown that automatic extension of a
wordnet requiring only a target-language dictionary
and translation resources is possible. Our results will
be added to the current core Latvian WordNet and
merged into the online resource Tēzaurs, where they
will also be available to the public. Additionally, the
data will be used to develop word sense disambigua-
tion (WSD) capabilities for Latvian.
ACKNOWLEDGEMENTS
This research work was supported by the Latvian
Council of Science, project “Latvian WordNet and
word sense disambiguation”, project No. LZP-
2019/1-0464.
REFERENCES
Bond, F., Isahara, H., Kanzaki, K., and Uchimoto, K.
(2008). Boot-strapping a wordnet using multiple ex-
isting wordnets. In LREC.
Bond, F. and Paik, K. (2012). A survey of wordnets and
their licenses. In Proceedings of the 6th Global Word-
Net Conference (GWC 2012), pages 64–71.
Fellbaum, C. (1998). Wordnet: An electronic lexical
database. MIT Press.
Fišer, D. and Sagot, B. (2015). Constructing a poor man’s
wordnet in a resource-rich world. Language Resources
and Evaluation, 49(3):601–635.
Khodak, M., Risteski, A., Fellbaum, C., and Arora, S.
(2017). Automated wordnet construction using word
embeddings. In Proceedings of the 1st Workshop on
Sense, Concept and Entity Representations and their
Applications, pages 12–23.
Lam, K. N., Al Tarouti, F., and Kalita, J. (2014). Auto-
matically constructing wordnet synsets. In Proceed-
ings of the 52nd Annual Meeting of the Association for
Computational Linguistics (Volume 2: Short Papers),
pages 106–111.
Lindén, K. and Carlson, L. (2010). Finnwordnet–finnish
wordnet by translation. LexicoNordica–Nordic Journal
of Lexicography, 17:119–140.
Lokmane, I. and Rituma, L. (2021). Verba nozīmju
nošķiršana: teorija un prakse [Verb sense distinction:
theory and practice]. Valoda: Nozīme un forma 12.
Rīga: LU Akadēmiskais apgāds.
Loukachevitch, N. and Gerasimova, A. (2019). Linking rus-
sian wordnet ruwordnet to wordnet. In Proceedings of
the 10th Global Wordnet Conference, pages 64–71.
Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013).
Efficient estimation of word representations in vector
space. arXiv preprint arXiv:1301.3781.
Montazery, M. and Faili, H. (2010). Automatic persian
wordnet construction. In Coling 2010: Posters, pages
846–850.
Palmer, M., Babko-Malaya, O., and Dang, H. T. (2004).
Different sense granularities for different applica-
tions. In Proceedings of the 2nd International Work-
shop on Scalable Natural Language Understanding
(ScaNaLU 2004) at HLT-NAACL 2004, pages 49–56.
Pedersen, B. S., Nimb, S., Asmussen, J., Sørensen, N. H.,
Trap-Jensen, L., and Lorentzen, H. (2009). Dannet:
the challenge of compiling a wordnet for danish by
reusing a monolingual dictionary. Language resources
and evaluation, 43(3):269–299.
Pedersen, B. S., Nimb, S., Olsen, I. R., and Olsen, S. (2019).
Merging DanNet with Princeton Wordnet. In Proceed-
ings of the 10th Global Wordnet Conference, pages
125–134, Wroclaw, Poland. Global Wordnet Associ-
ation.
Postma, M., van Miltenburg, E., Segers, R., Schoen, A., and
Vossen, P. (2016). Open dutch wordnet. In Proceed-
ings of the 8th Global WordNet Conference (GWC),
pages 302–310.
Reimers, N. and Gurevych, I. (2019). Sentence-bert: Sen-
tence embeddings using siamese bert-networks. arXiv
preprint arXiv:1908.10084.
Sagot, B. and Fišer, D. (2012). Automatic extension of wolf.
In GWC2012-6th International Global Wordnet Conference.
Tufis, D., Cristea, D., and Stamou, S. (2004). Balkanet:
Aims, methods, results and perspectives. a general
overview. Romanian Journal of Information science
and technology, 7(1-2):9–43.
Vossen, P. (1998). Introduction to eurowordnet. In Eu-
roWordNet: A multilingual database with lexical se-
mantic networks, pages 1–17. Springer.
Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet