Authors:
Ahmad Magdy Eid; Nagwa El-Makky and Khaled Nagi
Affiliation:
Computer and Systems Engineering Department, Faculty of Engineering, Alexandria University, Alexandria, Egypt
Keyword(s):
Arabic Question Answering Systems, Machine Comprehension, Deep Learning, Machine Translation, Post-editing, Semi-supervised Learning.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Computational Intelligence; Context Discovery; Evolutionary Computing; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Mining Text and Semi-Structured Data; Soft Computing; Symbolic Systems
Abstract:
Machine Comprehension (MC) is a novel task in the question answering (QA) discipline. MC tests the ability of a machine to read a text and comprehend its meaning. Deep learning approaches to MC build an end-to-end paradigm based on new neural networks that directly compute the deep semantic matching among the question, the answers, and the corresponding passage. Deep learning gives state-of-the-art performance for English MC. The MC problem has not yet been addressed for the Arabic language due to the lack of Arabic MC datasets. This paper presents the first Arabic MC dataset, produced by translating the SQuAD v1.1 dataset and applying a proposed approach that combines partial translation post-editing and semi-supervised learning. We intend to make this dataset publicly available to the research community. Furthermore, we use the resulting dataset to build end-to-end deep learning Arabic MC models, which show promising results.