tection and modeling of inter-model entity references
and other integrity constraints becomes significantly
more challenging.
ACKNOWLEDGEMENTS
This paper is based on Ivan Veinhardt Latták’s Master
thesis (Veinhardt Latták, 2021). This work was supported
by the GAČR project no. 20-22276S.
REFERENCES
Baazizi, M.-A., Colazzo, D., Ghelli, G., and Sartiani, C.
(2019a). https://gitlab.lip6.fr/collab/pstl2020.
(unavailable).
Baazizi, M.-A., Colazzo, D., Ghelli, G., and Sartiani, C.
(2019b). Parametric Schema Inference for Massive
JSON Datasets. The VLDB Journal.
Bex, G. J., Neven, F., Schwentick, T., and Vansummeren,
S. (2010). Inference of Concise Regular Expressions
and DTDs. ACM Trans. Database Syst., 35(2):11:1–
11:47.
Bouhamoum, R., Kellou-Menouer, K., Lopes, S., and
Kedad, Z. (2018). Scaling up Schema Discovery for
RDF Datasets. In 2018 IEEE ICDEW, pages 84–89.
IEEE.
Candel, C. J. F., Ruiz, D. S., and García-Molina, J. (2021).
A Unified Metamodel for NoSQL and Relational
Databases. CoRR.
Chillón, A. H., Morales, S. F., Sevilla, D., and Molina, J. G.
(2017). Exploring the Visualization of Schemas for
Aggregate-Oriented NoSQL Databases. In ER Forum/Demos
1979, volume 1979 of CEUR, pages 72–85.
Čontoš, P. and Svoboda, M. (2020). JSON Schema Inference
Approaches. In ER Workshops, pages 173–183.
Springer.
DiScala, M. and Abadi, D. J. (2016). Automatic Generation
of Normalized Relational Schemas from Nested Key-
Value Data. In SIGMOD ’16, pages 295–310.
Frozza, A. A., Defreyn, E. D., and dos Santos Mello,
R. (2020). A process for inference of columnar
NoSQL database schemas. In Anais do XXXV Simpósio
Brasileiro de Bancos de Dados, pages 175–180. SBC.
Frozza, A. A., dos Santos Mello, R., and da Costa, F.
d. S. (2018a). An Approach for Schema Extraction
of JSON and Extended JSON Document Collections.
In IRI 2018, pages 356–363. IEEE.
Frozza, A. A., dos Santos Mello, R., and da Costa,
F. d. S. (2018b).
https://github.com/gbd-ufsc/jsonschemadiscovery.
Fruth, M., Dauberschmidt, K., and Scherzinger, S. (2021).
Josch: Managing Schemas for NoSQL Document
Stores. In ICDE ’21, pages 2693–2696. IEEE.
Gallinucci, E., Golfarelli, M., Rizzi, S., Abelló, A., and
Romero, O. (2018). Interactive Multidimensional
Modeling of Linked Data for Exploratory OLAP. Inf.
Syst., 77:86–104.
Izquierdo, J. L. C. and Cabot, J. (2013a). Discovering Im-
plicit Schemas in JSON Data. In ICWE ’13, pages
68–83. Springer.
Izquierdo, J. L. C. and Cabot, J. (2013b).
https://github.com/som-research/jsondiscoverer.
Izquierdo, J. L. C. and Cabot, J. (2016). JSONDiscoverer:
Visualizing the Schema Lurking behind JSON Docu-
ments. Knowledge-Based Systems, 103:52–55.
Klettke, M., Awolin, H., Störl, U., Müller,
D., and Scherzinger, S. (2017a).
https://github.com/dbishagen/darwin.
Klettke, M., Awolin, H., Störl, U., Müller, D., and
Scherzinger, S. (2017b). Uncovering the Evolution
History of Data Lakes. In 2017 IEEE International
Conference on Big Data, pages 2380–2389, New
York, United States. IEEE.
Klettke, M., Störl, U., and Scherzinger, S. (2015). Schema
Extraction and Structural Outlier Detection for JSON-
based NoSQL Data Stores. In DBIS ’15, pages 425–
444.
Mlýnková, I. and Nečaský, M. (2013). Heuristic Methods
for Inference of XML Schemas: Lessons Learned and
Open Issues. Informatica, 24(4):577–602.
Möller, M. L., Berton, N., Klettke, M., Scherzinger, S., and
Störl, U. (2019). jhound: Large-scale profiling of open
JSON data. BTW 2019.
Morales, S. F. (2017). Inferring NoSQL Data Schemas with
Model-Driven Engineering Techniques. PhD thesis,
University of Murcia, Murcia, Spain.
Scherzinger, S., Klettke, M., and Störl, U. (2013). Managing
schema evolution in NoSQL data stores. In DBPL ’13.
Sevilla Ruiz, D., Morales, S. F., and García Molina,
J. (2015a).
https://github.com/catedrasaes-umu/nosqldataengineering.
Sevilla Ruiz, D., Morales, S. F., and García Molina, J.
(2015b). Inferring versioned schemas from NoSQL
databases and its applications. In Conceptual Model-
ing, pages 467–480. Springer.
Svoboda, M., Čontoš, P., and Holubová, I. (2021).
Categorical Modeling of Multi-Model Data: One Model
to Rule Them All. In MEDI ’21, pages 1–8. Springer.
Veinhardt Latták, I. (2021). Schema Inference for NoSQL
Databases. Master thesis, Charles University in
Prague, Czech Republic.
ENASE 2022 - 17th International Conference on Evaluation of Novel Approaches to Software Engineering