A Challenging Data Set for Evaluating Part-of-Speech Taggers
Mattias Wahde, Minerva Suvanto, Marco Vedova
2024
Abstract
We introduce a novel, challenging test set for part-of-speech (POS) tagging, consisting of sentences in which only one word is POS-tagged. Derived from Wiktionary and then manually curated, it is intended as an out-of-sample test set for POS taggers trained on larger data sets. Sentences were selected such that at least one of four standard benchmark taggers incorrectly tags the word under consideration, thus identifying challenging instances of POS tagging. Somewhat surprisingly, we find that the benchmark taggers often fail on rather straightforward instances of POS tagging, and we analyze these failures in some detail. We also evaluate a state-of-the-art DNN-based POS tagger on our set, obtaining an accuracy of around 0.87 for this out-of-sample test, far below its reported performance in the literature. This tagger, too, fails even in rather simple cases.
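The selection criterion and the single-word accuracy measure described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Item` record, the `Tagger` call interface, the toy taggers, and the `NOUN`/`VERB` labels are hypothetical and are not taken from the paper, which does not specify its data format or tag set in the abstract.

```python
from typing import Callable, Dict, List, NamedTuple


class Item(NamedTuple):
    """One test sentence with a single annotated word (hypothetical layout)."""
    tokens: List[str]  # tokenized sentence
    index: int         # position of the one POS-tagged word
    gold: str          # gold POS tag for that word


# Assumed tagger interface: token list in, one tag per token out.
Tagger = Callable[[List[str]], List[str]]


def is_challenging(item: Item, taggers: Dict[str, Tagger]) -> bool:
    """Return True if at least one tagger mis-tags the annotated word."""
    return any(
        tagger(item.tokens)[item.index] != item.gold
        for tagger in taggers.values()
    )


def accuracy(items: List[Item], tagger: Tagger) -> float:
    """Fraction of items whose annotated word receives the gold tag."""
    if not items:
        raise ValueError("empty test set")
    correct = sum(tagger(it.tokens)[it.index] == it.gold for it in items)
    return correct / len(items)


# Toy stand-ins for real taggers, for illustration only.
def always_noun(tokens: List[str]) -> List[str]:
    return ["NOUN"] * len(tokens)


def book_is_verb(tokens: List[str]) -> List[str]:
    return ["VERB" if tok.lower() == "book" else "NOUN" for tok in tokens]


items = [
    Item(["They", "book", "a", "flight"], 1, "VERB"),
    Item(["The", "book", "is", "old"], 1, "NOUN"),
]

benchmarks = {"always_noun": always_noun, "book_is_verb": book_is_verb}
challenging = [it for it in items if is_challenging(it, benchmarks)]
```

In this toy setup each sentence is mis-tagged by one of the two stand-in taggers, so both are kept as challenging, and `accuracy(items, always_noun)` scores only the annotated word in each sentence rather than every token.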
Paper Citation
in Harvard Style
Wahde M., Suvanto M. and Vedova M. (2024). A Challenging Data Set for Evaluating Part-of-Speech Taggers. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 79-86. DOI: 10.5220/0012307200003636
in Bibtex Style
@conference{icaart24,
author={Mattias Wahde and Minerva Suvanto and Marco Vedova},
title={A Challenging Data Set for Evaluating Part-of-Speech Taggers},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2024},
pages={79-86},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012307200003636},
isbn={978-989-758-680-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - A Challenging Data Set for Evaluating Part-of-Speech Taggers
SN - 978-989-758-680-4
AU - Wahde M.
AU - Suvanto M.
AU - Vedova M.
PY - 2024
SP - 79
EP - 86
DO - 10.5220/0012307200003636
PB - SciTePress