Disambiguating Confusion Sets in a Language with Rich Morphology

Steinunn Friðriksdóttir, Anton Ingason

2020

Abstract

The processing of strings that are semantically distinct but easily confused with each other, often because they are pronounced identically, is a prime example of context dependency in Natural Language Processing. This problem arises when a system needs to determine whether a bank is a ‘river bank’ or a ‘financial institution’, and it also challenges systems for context-sensitive spelling and grammar correction, because pairs like their/there and I/me are a common source of issues that such systems must address. In practice, this type of context dependency can be especially prominent in languages with rich morphology, where large paradigms of inflected word forms lead to a proliferation of such confusion sets. In this paper, we present our novel confusion set corpus for Icelandic, as well as our findings from an experiment that uses well-known classification algorithms to disambiguate confusion sets that appear in our corpus.

Paper Citation


in Harvard Style

Friðriksdóttir S. and Ingason A. (2020). Disambiguating Confusion Sets in a Language with Rich Morphology. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI, ISBN 978-989-758-395-7, pages 446-451. DOI: 10.5220/0009371504460451


in Bibtex Style

@conference{nlpinai20,
author={Steinunn Friðriksdóttir and Anton Ingason},
title={Disambiguating Confusion Sets in a Language with Rich Morphology},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI},
year={2020},
pages={446-451},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009371504460451},
isbn={978-989-758-395-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI
TI - Disambiguating Confusion Sets in a Language with Rich Morphology
SN - 978-989-758-395-7
AU - Friðriksdóttir S.
AU - Ingason A.
PY - 2020
SP - 446
EP - 451
DO - 10.5220/0009371504460451