Authors:
Marek Medveď and Aleš Horák
Affiliation:
Faculty of Informatics, Masaryk University, Czech Republic
Keyword(s):
Question Answering, Word Embedding, Word2vec, AQA, Simple Question Answering Database, SQAD.
Related Ontology Subjects/Areas/Topics:
Applications; Artificial Intelligence; Biomedical Engineering; Biomedical Signal Processing; Computational Intelligence; Evolutionary Computing; Health Engineering and Technology Applications; Human-Computer Interaction; Knowledge Discovery and Information Retrieval; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Machine Learning; Methodologies and Methods; Natural Language Processing; Neural Networks; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems; Theory and Methods
Abstract:
The Automatic Question Answering (AQA) system is a representative of open-domain QA systems, in which
the answer selection process relies on syntactic and semantic similarities between the question and the answering
text snippets. Such an approach is specifically oriented to languages with fine-grained syntactic and
morphological features that help to guide the correct QA match. In this paper, we present the latest results
of the AQA system with a new implementation of word embedding criteria. All AQA processing steps (question
processing, answer selection and answer extraction) are syntax-based, with advanced scoring obtained by a
combination of several similarity criteria (TF-IDF, tree distance, ...). Adding the word embedding parameters
helped to resolve the QA match in cases where the answer is expressed by semantically near equivalents. We
describe the design and implementation of the whole QA process and provide a new evaluation of the AQA
system with the word embedding criteria, measured on an expanded version of the Simple Question-Answering
Database (SQAD) with more than 3,000 question-answer pairs extracted from the Czech Wikipedia.