Authors:
Marek Medved'
;
Radoslav Sabol
and
Aleš Horák
Affiliation:
Natural Language Processing Centre, Faculty of Informatics, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Keyword(s):
Question Answering, Question Classification, Answer Classification, Czech, Simple Question Answering Database, SQAD.
Abstract:
Question answering systems have improved greatly during the last five years by employing architectures of deep neural networks such as attentive recurrent networks or transformer-based networks with pretrained contextual information. In this paper, we present the results and detailed analysis of experiments with the largest question answering benchmark dataset for the Czech language. The best results evaluated in the text reach the accuracy of 72 %, which is a 4 % improvement to the previous best result. We also introduce the newest version of the Czech Question Answering benchmark dataset SQAD 3.0, which was substantially extended to more than 13,000 question-answer pairs, and we report the first answer selection results on this dataset which indicate that the size of the training data is important for the task.