Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet
Laine Strankale, Madara Stāde
2022
Abstract
Latvian WordNet is a resource in which word senses are connected based on their semantic relationships. The manual construction of a high-quality core Latvian WordNet is currently underway. However, text processing tasks require broad coverage; this work therefore aims to extend the wordnet by automatically linking additional word senses in the Latvian online dictionary Tēzaurs.lv and aligning them to the English-language Princeton WordNet (PWN). Our method needs only translation data, sense definitions and usage examples, which it compares to PWN using pretrained word embeddings and sBERT. As a result, 57 927 interlanguage links were found that can potentially be added to Latvian WordNet, with an accuracy of 80% for nouns, 56% for verbs, 67% for adjectives and 66% for adverbs.
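The alignment idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the Latvian sense definition has already been translated into English, replaces pretrained word embeddings and sBERT with a tiny hand-made vector table, and uses hypothetical synset identifiers. It only shows the general shape of picking the PWN candidate whose gloss is most similar to the translated definition.

```python
import math

# Toy 3-dimensional "embeddings" invented for illustration only.
# A real system would load pretrained word vectors or an sBERT encoder.
TOY_VECTORS = {
    "tree": [0.9, 0.1, 0.0],
    "plant": [0.8, 0.2, 0.1],
    "tall": [0.7, 0.0, 0.2],
    "diagram": [0.0, 0.9, 0.3],
    "branching": [0.3, 0.7, 0.2],
    "structure": [0.1, 0.8, 0.4],
}


def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def best_synset(translated_gloss, candidates):
    """Return (synset_id, score) of the candidate gloss closest to the query."""
    query = embed(translated_gloss)
    scored = [(sid, cosine(query, embed(gloss))) for sid, gloss in candidates.items()]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical candidate synsets, e.g. retrieved via translations of the lemma.
candidates = {
    "pwn-tree-plant": "tall plant",
    "pwn-tree-diagram": "diagram branching structure",
}

# Hypothetical English translation of a Latvian sense definition.
match, score = best_synset("tall tree plant", candidates)
print(match, round(score, 3))
```

With these toy vectors the plant sense is selected. In practice, a similarity threshold would presumably be needed to reject weak matches, which this sketch omits.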
Paper Citation
in Harvard Style
Strankale L. and Stāde M. (2022). Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet. In Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI, ISBN 978-989-758-547-0, pages 478-485. DOI: 10.5220/0011006000003116
in Bibtex Style
@conference{nlpinai22,
author={Laine Strankale and Madara Stāde},
title={Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet},
booktitle={Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI},
year={2022},
pages={478-485},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011006000003116},
isbn={978-989-758-547-0},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI
TI - Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet
SN - 978-989-758-547-0
AU - Strankale L.
AU - Stāde M.
PY - 2022
SP - 478
EP - 485
DO - 10.5220/0011006000003116