Paper: Design Principles and a Software Reference Architecture for Big Data Question Answering Systems

Authors: Leonardo Moraes (1, 2); Pedro Jardim (1) and Cristina Dutra Aguiar (1)

Affiliations: (1) Department of Computer Science, University of São Paulo, São Carlos, Brazil; (2) Machine Learning & Artificial Intelligence, Sinch, Stockholm, Sweden

Keyword(s): Question Answering, Big Data, Software Reference Architecture, Design Principles.

Abstract: Companies continuously produce many documents containing valuable information for users. However, querying these documents is challenging, mainly because of the heterogeneity and volume of the documents available. In this work, we investigate the challenge of developing a Big Data Question Answering system, i.e., a system that provides a unified, reliable, and accurate way to query documents through naturally asked questions. We define a set of design principles and introduce BigQA, the first software reference architecture to meet these design principles. The architecture consists of high-level layers and is independent of programming language, technology, and querying and answering algorithms. BigQA was validated through a pharmaceutical case study managing over 18k documents from Wikipedia articles and FAQs about coronavirus. The results demonstrated the applicability of BigQA to real-world applications. In addition, we conducted 27 experiments on three open-domain datasets and compared the recall results of the well-established BM25, TF-IDF, and Dense Passage Retriever algorithms to find the most appropriate generic querying algorithm. According to the experiments, BM25 provided the highest overall performance.

CC BY-NC-ND 4.0

Paper citation in several formats:
Moraes, L.; Jardim, P. and Dutra Aguiar, C. (2023). Design Principles and a Software Reference Architecture for Big Data Question Answering Systems. In Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS; ISBN 978-989-758-648-4; ISSN 2184-4992, SciTePress, pages 57-67. DOI: 10.5220/0011842700003467

@conference{iceis23,
author={Leonardo Moraes and Pedro Jardim and Cristina {Dutra Aguiar}},
title={Design Principles and a Software Reference Architecture for Big Data Question Answering Systems},
booktitle={Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2023},
pages={57--67},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011842700003467},
isbn={978-989-758-648-4},
issn={2184-4992},
}

TY - CONF
JO - Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Design Principles and a Software Reference Architecture for Big Data Question Answering Systems
SN - 978-989-758-648-4
IS - 2184-4992
AU - Moraes, L.
AU - Jardim, P.
AU - Dutra Aguiar, C.
PY - 2023
SP - 57
EP - 67
DO - 10.5220/0011842700003467
PB - SciTePress