The Apache Drill system does not display a complete
model for data stored in MongoDB: only the names of
the collections and of the first-level fields are
displayed, so nested fields do not appear. In contrast,
our process gives, for each collection, its name as well
as the names and types of all its fields, whether these
are atomic or complex.
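To illustrate the difference between a first-level listing and a full description of nested fields, the following is a minimal sketch in plain Python. It is not the QVT-based process described in this article: it works on in-memory dictionaries rather than on a live MongoDB instance, and the function names, the dotted-path notation and the sample documents are assumptions made for the example.

```python
from typing import Any, Dict, Iterable, Set

# Maps a field path (e.g. "address.city") to the set of type names observed.
Schema = Dict[str, Set[str]]


def collect_field_types(value: Any, path: str, schema: Schema) -> None:
    """Record the type of `value` under `path`, descending into nested fields."""
    if isinstance(value, dict):
        if path:  # the root document itself has no field path
            schema.setdefault(path, set()).add("document")
        for name, sub_value in value.items():
            child = f"{path}.{name}" if path else name
            collect_field_types(sub_value, child, schema)
    elif isinstance(value, list):
        schema.setdefault(path, set()).add("array")
        for item in value:
            # "[]" marks the element level of an array field
            collect_field_types(item, f"{path}[]", schema)
    else:
        schema.setdefault(path, set()).add(type(value).__name__)


def infer_collection_schema(documents: Iterable[Dict[str, Any]]) -> Schema:
    """Scan every document of one collection and merge the observed fields."""
    schema: Schema = {}
    for document in documents:
        collect_field_types(document, "", schema)
    return schema


# Hypothetical sample documents, for illustration only.
patients = [
    {"_id": 1, "name": "Alice", "address": {"city": "Toulouse", "zip": "31000"}},
    {"_id": 2, "name": "Bob", "address": {"city": "Paris"},
     "visits": [{"year": 2018}]},
]

schema = infer_collection_schema(patients)
for field_path in sorted(schema):
    print(field_path, sorted(schema[field_path]))
```

A first-level listing would stop at `_id`, `name`, `address` and `visits`; the recursive scan also reports `address.city`, `address.zip` and `visits[].year` together with their types.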
Moreover, the solutions proposed in the three research
works cited in the state of the art do not consider the
links between collections. However, in the application
presented in section 2 (see Figure 1), the links between
collections are useful for the processing and queries
carried out by doctors. Our process therefore takes the
links between collections into account and formalizes
them in the resulting data model.
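As an illustration of what taking links into account can mean in practice, the sketch below looks for sub-documents that follow the MongoDB DBRef convention (a `$ref` key naming the target collection and a `$id` key holding the referenced identifier). Whether the links in the case study are stored this way is an assumption, as are the collection names, the sample data and the function names; the sketch operates on in-memory dictionaries and is not the transformation defined in this article.

```python
from typing import Any, Dict, Iterable, Set, Tuple

# (source collection, field path, target collection)
Link = Tuple[str, str, str]


def find_links(collection: str, value: Any, path: str, links: Set[Link]) -> None:
    """Collect every DBRef-shaped sub-document found under `value`."""
    if isinstance(value, dict):
        if "$ref" in value and "$id" in value:
            # DBRef convention: "$ref" holds the name of the target collection
            links.add((collection, path, value["$ref"]))
            return
        for name, sub_value in value.items():
            child = f"{path}.{name}" if path else name
            find_links(collection, sub_value, child, links)
    elif isinstance(value, list):
        for item in value:
            find_links(collection, item, f"{path}[]", links)


def extract_links(database: Dict[str, Iterable[Dict[str, Any]]]) -> Set[Link]:
    """Scan all collections of a database given as {collection name: documents}."""
    links: Set[Link] = set()
    for collection, documents in database.items():
        for document in documents:
            find_links(collection, document, "", links)
    return links


# Hypothetical database content, for illustration only.
database = {
    "Patients": [{"_id": 1, "name": "Alice"}],
    "Doctors": [{"_id": 7, "name": "Martin"}],
    "Consultations": [
        {
            "_id": 10,
            "patient": {"$ref": "Patients", "$id": 1},
            "exams": [{"doctor": {"$ref": "Doctors", "$id": 7}}],
        }
    ],
}

for source, field_path, target in sorted(extract_links(database)):
    print(f"{source}.{field_path} -> {target}")
```

Each tuple could then be recorded in the data model as a link from the source collection to the target collection. Links stored as plain identifiers (manual references) would not be detected by this sketch.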
Finally, it should be emphasized that our process
is based on MDA. This brings both a standard formalism
for describing the transformation rules and a way of
automating the sequences of transformations.
7 CONCLUSION AND PERSPECTIVES
Our work is part of the evolution of databases towards
Big Data. Our studies currently focus on mechanisms
for extracting the data model from a NoSQL database
in order to facilitate the expression of queries.
In this article, we have proposed an automatic
process to extract the physical model from a
document-oriented NoSQL database. This process is
based on the Model Driven Architecture (MDA), which
provides a formal framework for automating model
transformations. Our process
generates a NoSQL physical model from a NoSQL
database by applying a sequence of transformations
formalized with the QVT standard. The returned
model describes the structure of the collections that
make up the database and their links. We have
tested our process on a case study in the
healthcare field. This case study concerns scientific
programs for monitoring patients with serious
diseases; the database is stored in a MongoDB system.
As future work, we plan to study how to update the
data model while the database is in use. Indeed,
since the data volume can reach several terabytes and
generating the model requires a scan of the entire
database, it is not feasible for a user to
rerun the process each time they wish to express a
new query.
REFERENCES
Angadi, A. B., Angadi, A. B., & Gull, K. C. (2013). Growth
of New Databases & Analysis of NOSQL Datastores.
International Journal of Advanced Research in
Computer Science and Software Engineering, 3, 1307-
1319.
BigIntegrator (2018). IBM BigIntegrate.
https://www.ibm.com/us-en/marketplace/ibm-
biginsights-bigintegrate; 5 December 2018.
Bondiombouy, C. (2015). Query processing in cloud
multistore systems. In BDA : Bases de Données
Avancées.
Budinsky, F., Steinberg, D., Ellersick, R., Grose, T. J., &
Merks, E. (2004). Eclipse modeling framework: a
developer's guide. Addison-Wesley Professional.
Chen, CL Philip and Zhang, Chun-Yang. Data-intensive
applications, challenges, techniques and technologies:
A survey on Big Data. Information Sciences, 2014, vol.
275, p. 314-347.
CloudMdsQL (2018). CloudMdsQL Compiler.
http://cloudmdsql.gforge.inria.fr/ Online; 5 December
2018.
Douglas, L., 2001. 3d data management: Controlling data
volume, velocity and variety. Gartner. Retrieved, 6,
2001.
Drill (2018). Apache Drill. https://drill.apache.org/ Online;
5 December 2018.
Ecore (2018). The eclipse modeling framework project.
http://www.eclipse.org/emf Online; 5 December 2018.
EMF (2018). Projets EMF. www.eclipse.org/stp/ and
www.eclipse.org/emf/ Online; 5 December 2018.
Gallinucci, E., Golfarelli, M., & Rizzi, S. (2018). Schema
profiling of document-oriented databases. Information
Systems, 75, 13-25.
Han, Jing, Haihong, E., LE, Guan, et al. Survey on NoSQL
database. Pervasive computing and applications
(ICPCA), 2011 6th international conference on. IEEE,
2011. p. 363-366.
Harrison, G. (2015). Next Generation Databases:
NoSQL and Big Data. Apress.
Hutchinson, J., Rouncefield, M., & Whittle, J. (2011, May).
Model-driven engineering practices in industry. In
Proceedings of the 33rd International Conference on
Software Engineering (pp. 633-642). ACM.
Klettke, M., U. Störl, and S. Scherzinger (2015). Schema
extraction and structural outlier detection for json-
based nosql data stores. Datenbanksysteme für
Business, Technologie und Web (BTW 2015).
MongoDB (2018). Mongodb atlas database as a service.
https://docs.mongodb.com/manual/reference/database-
references/ Online ; 5 December 2018.
OMG (2018). Object Management Group.
http://www.omg.org/ Online ; 5 December 2018.
Sevilla, Diego Ruiz, Severino Feliciano Morales, and Jesús
García Molina. "Inferring versioned schemas from
NoSQL databases and its applications." International
Conference on Conceptual Modeling. Springer, Cham,
2015.