Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases
Fatma Abdelhedi, Rym Jemmali, Gilles Zurfluh
2021
Abstract
The exponential growth of collected data that has accompanied the digital transformation of companies has pushed databases towards Big Data. Our work belongs to this context and focuses on mechanisms for extracting datasets from a Data Lake and storing them in a single Data Warehouse. The Data Warehouse then supports decision-support analyses, facilitated by the features of NoSQL systems (rich data structures, query language, access performance). This article proposes an extraction mechanism that applies only to the relational databases of the Data Lake. The mechanism relies on an automatic approach, based on the Model Driven Architecture (MDA), that provides a set of schema transformation rules formalized in the Query/View/Transformation (QVT) language. Starting from the physical schemas that describe the relational databases, we propose transformation rules that generate the physical model of a Data Warehouse stored on a document-oriented NoSQL system (OrientDB). This paper presents the successive steps of the transformation process, from the meta-modeling of the datasets to the application of the rules and algorithms. We report an experiment based on a case study from the health care field.
Paper Citation
in Harvard Style
Abdelhedi F., Jemmali R. and Zurfluh G. (2021). Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases. In Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 3: KMIS; ISBN 978-989-758-533-3, SciTePress, pages 64-72. DOI: 10.5220/0010690600003064
in Bibtex Style
@conference{kmis21,
author={Fatma Abdelhedi and Rym Jemmali and Gilles Zurfluh},
title={Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases},
booktitle={Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 3: KMIS},
year={2021},
pages={64-72},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010690600003064},
isbn={978-989-758-533-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 3: KMIS
TI - Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases
SN - 978-989-758-533-3
AU - Abdelhedi F.
AU - Jemmali R.
AU - Zurfluh G.
PY - 2021
SP - 64
EP - 72
DO - 10.5220/0010690600003064
PB - SciTePress