Authors:
Fatma Abdelhedi 1; Rym Jemmali 1,2 and Gilles Zurfluh 2
Affiliations:
1 CBI2, Trimane, Paris, France
2 IRIT CNRS (UMR 5505), Toulouse University, Toulouse, France
Keyword(s):
Data Lake, Data Warehouse, NoSQL, Big Data, Relational Database, MDA, QVT.
Abstract:
The exponential growth of collected data, driven by the digital transformation of companies, has pushed databases to evolve towards Big Data. Our work falls within this context and focuses on the mechanisms for extracting datasets from a Data Lake and storing them in a single Data Warehouse. This Data Warehouse will subsequently support decision-making analyses, facilitated by the features of NoSQL systems (rich data structures, query language, access performance). This article proposes an extraction mechanism that applies only to the relational databases of the Data Lake. The mechanism relies on an automatic approach, based on the Model Driven Architecture (MDA), that provides a set of schema transformation rules formalized in the Query/View/Transformation (QVT) language. Starting from the physical schemas that describe the relational databases, we propose transformation rules that generate the physical model of a Data Warehouse stored on a document-oriented NoSQL system (OrientDB). The paper presents the successive steps of the transformation process, from the meta-modeling of the datasets to the application of the rules and algorithms. We report an experiment based on a case study in the health care field.