Authors:
Fatma Abdelhedi (1); Rym Jemmali (1, 2) and Gilles Zurfluh (2)
Affiliations:
(1) CBI2, Trimane, Paris, France; (2) IRIT CNRS (UMR 5505), Toulouse University, Toulouse, France
Keyword(s):
Data Lake, Data Warehouse, NoSQL Databases, Big Data.
Abstract:
Nowadays, there is a growing need to collect and analyze data from different databases. Our work is part of a medical application that must allow health professionals to analyze complex data for decision making. We propose mechanisms to extract data from a data lake and store them in a NoSQL data warehouse. This will subsequently allow decision-support analysis, facilitated by the features offered by NoSQL systems (rich data structures, query language, access performance). In this paper, we present a process to ingest data from a data lake into a warehouse. The ingestion consists of (1) transferring the NoSQL DBs extracted from the data lake into a single NoSQL DB (the warehouse), (2) merging so-called "similar" classes, and (3) converting the links into references between objects. An experiment has been performed for a medical application.
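
The three ingestion steps listed in the abstract can be illustrated with a minimal in-memory sketch. It is not the authors' implementation: plain Python dictionaries stand in for NoSQL databases, and every name (`transfer`, `merge_similar`, `links_to_references`, `ingest`, the sample collections and fields) is hypothetical. Three assumptions are made for illustration only: "similar" classes are given as an explicit mapping supplied by the user, a link is a field holding a key value of another class, and `_id` values are unique across the source databases.

```python
from collections import defaultdict


def transfer(source_dbs):
    """Step 1: copy every collection of every source DB into one warehouse.

    Collections are qualified with their DB name to avoid name clashes.
    """
    warehouse = {}
    for db_name, collections in source_dbs.items():
        for coll_name, docs in collections.items():
            warehouse[f"{db_name}.{coll_name}"] = [dict(d) for d in docs]
    return warehouse


def merge_similar(warehouse, similar):
    """Step 2: merge collections declared similar into one target class.

    `similar` maps a qualified collection name to the merged class name
    (assumption: the similarity decision is made outside this sketch).
    """
    merged = defaultdict(list)
    for name, docs in warehouse.items():
        merged[similar.get(name, name)].extend(docs)
    return dict(merged)


def links_to_references(warehouse, links):
    """Step 3: replace link values by references to the target object.

    Each link is (class, field, target_class, target_key): `field` holds a
    value of `target_key` in `target_class`. Unresolved values are left as-is.
    """
    for coll, field, target, key in links:
        index = {doc[key]: doc["_id"] for doc in warehouse[target]}
        for doc in warehouse[coll]:
            if doc.get(field) in index:
                doc[field] = {"collection": target, "id": index[doc[field]]}
    return warehouse


def ingest(source_dbs, similar, links):
    """Chain the three steps: transfer, merge, convert links."""
    return links_to_references(merge_similar(transfer(source_dbs), similar), links)


# Hypothetical sample data: two source DBs extracted from the data lake.
source_dbs = {
    "db1": {"patients": [{"_id": 1, "ssn": "A1", "name": "Ann"}]},
    "db2": {
        "patient": [{"_id": 2, "ssn": "B2", "name": "Bob"}],
        "visits": [{"_id": 10, "patient_ssn": "A1"}],
    },
}
similar = {
    "db1.patients": "Patient",
    "db2.patient": "Patient",
    "db2.visits": "Visit",
}
links = [("Visit", "patient_ssn", "Patient", "ssn")]

result = ingest(source_dbs, similar, links)
print(result["Visit"])
```

In this sketch the two patient collections end up in a single `Patient` class, and the visit's `patient_ssn` value is replaced by a small reference object pointing at the matching patient. A real warehouse would use the reference mechanism of the chosen NoSQL system instead of the ad hoc dictionary used here.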