MfCodeGenerator: A Code Generation Tool for NoSQL Data Access
with ONM Support
Evandro Miguel Kuszera¹ ᵃ, Leticia Mara Peres² ᵇ and Marcos Didonet Del Fabro³ ᶜ
¹ Federal University of Technology, Paraná, Dois Vizinhos, Brazil
² Federal University of Paraná, Curitiba, Brazil
³ Université Paris-Saclay, CEA, List, Palaiseau, France
Keywords:
Object-NoSQL Mapper, NoSQL, Code Generation.
Abstract:
NoSQL databases are generally employed in scenarios that require horizontal scalability and flexibility in data
schema. Applications can access the NoSQL database through native APIs or through ONMs (Object-NoSQL
Mappers). The latter provides a uniform data access interface, decoupling the application from the database
and reducing vendor lock-in. However, ONM code must be written by developers, which can be
cumbersome and error prone. In this paper we propose an approach to generate ONM code based on a NoSQL
schema that describes the structure of the entities and their relationships. From the NoSQL schema, our
tool is used to generate code for three widely used Java-based ONMs. To evaluate the approach we perform
experiments to read and write data to and from an existing MongoDB database using the generated code.
Through the results obtained, it was possible to verify that the tool is capable of generating code according to
the NoSQL schema and the requirements of the target ONM. This not only streamlines developer access to
NoSQL data but also facilitates comparative evaluations of different ONMs utilizing the same schema.
1 INTRODUCTION
Relational databases (RDB) are the de facto standard
for storing data in most existing applications. How-
ever, the relational model based on tables with rows
and columns, primary and foreign keys, has its lim-
itations when there is a need to scale the application
horizontally (Stonebraker et al., 2007). In addition,
there is also the issue of the need to define a schema
before storing the data, which impacts its use in appli-
cations that handle semi-structured data and require
flexible data models.
NoSQL databases (Sadalage and Fowler, 2012)
emerged as a solution to these problems, providing
data models, architectures and query languages that
differ from the relational model. From an application
standpoint, data in a NoSQL database
can be accessed through native APIs or through
middleware. In the context of relational databases
we have ORMs (Object-Relational Mappers) (O’Neil,
2008), middleware that provides mechanisms
to map application objects to records
in the database and vice versa. For NoSQL
databases we have ONMs (Object-NoSQL Mappers),
which map application objects to the target
NoSQL data model.
a https://orcid.org/0000-0002-4040-0151
b https://orcid.org/0000-0002-8922-6975
c https://orcid.org/0000-0002-8573-6281
ONMs are generally used in the development of
new applications, but there are scenarios in which the
database already exists or comes from a migration
from RDB to NoSQL. In all these cases it is necessary to
create ONM software artifacts to manipulate the data.
This task falls to developers and can
be cumbersome and error prone, even more so if the
existing data is stored in a format other than that supported
by the ONM. One way to make the developer’s
work easier is to generate the ONM code automati-
cally.
There are different works that evaluate and com-
pare the ONMs in terms of features and the introduced
overhead to access the data against the native APIs
(Störl et al., 2015; Reniers et al., 2017; Rafique et al.,
2018), but none of them deal with ONM code generation.
In (Chillón et al., 2019), a solution was proposed
to read the data from a NoSQL database and extract
its schema. Subsequently, the approach automatically
Kuszera, E., Peres, L. and Fabro, M.
MfCodeGenerator: A Code Generation Tool for NoSQL Data Access with ONM Support.
DOI: 10.5220/0012557800003690
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 26th International Conference on Enterprise Information Systems (ICEIS 2024) - Volume 1, pages 232-239
ISBN: 978-989-758-692-7; ISSN: 2184-4992
Proceedings Copyright © 2024 by SCITEPRESS Science and Technology Publications, Lda.
generates ONM code for Morphia¹ and Mongoose².
However, it does not address the limitations of ONMs
with respect to the extracted NoSQL schema, which may
have a hierarchical structure deeper than the ONM supports,
contain unsupported relationship types, or store data in a
format that the ONM does not support.
In this paper we present MfCodeGenerator, an
approach for automatic code generation for ONMs
aimed at document-oriented NoSQL databases.
MfCodeGenerator takes a NoSQL schema as input and
currently generates code for three Java-based ONMs,
namely Impetus Kundera³, Data Nucleus⁴ and Spring
Data⁵, but the approach is platform independent and
can be extended to other ONMs and languages. Our
approach allows customizations and validations to be
added to the code generation process, making it possible
to verify whether the NoSQL schema is supported by the
ONM and to warn the developer when part of the schema
cannot be expressed in the ONM code.
To evaluate the MfCodeGenerator, we conducted
experiments to read and write data in MongoDB us-
ing the generated code. Based on the results obtained,
we verified that the tool is capable of generating code
according to the NoSQL schema and requirements of
the target ONM. This simplifies the process for de-
velopers to access NoSQL data and also facilitates
the evaluation of different ONMs based on the same
schema. The main contributions of this paper are:
• Our approach automatically generates code from a
NoSQL schema, adding the annotations required by
the target ONM;
• Developers can add customizations to MfCodeGenerator
to introduce annotations or specific
code into the generation process;
• MfCodeGenerator validates the schema
against the target ONM and warns if certain types
of relationships are not supported. This allows
the developer to either adjust the input schema accordingly
or be informed that modifications to the
generated code will be necessary.
The remainder of this paper is structured as fol-
lows. Section 2 provides the context and necessary
background of the proposed approach. Section 3
presents MfCodeGenerator, our approach to
automatically generating ONM code, describing its
execution flow and architecture. Section 4 presents the
¹ https://github.com/MorphiaOrg/morphia
² https://mongoosejs.com/
³ https://github.com/Impetus/Kundera
⁴ https://www.datanucleus.org/products/accessplatform/
⁵ https://spring.io/projects/spring-data
experiments to evaluate the approach. Section 5 dis-
cusses related work. Section 6 concludes the paper.
2 CONTEXT AND BACKGROUND
The approach proposed in this paper aims to au-
tomatically generate code to access data stored in
document-oriented NoSQL databases. As there are
different ways to model entities as collections of doc-
uments, it is interesting to have a way to define a
schema and generate the code to create, retrieve, up-
date and delete data from the NoSQL database, with
support for different middlewares in a simplified way.
The approach proposed in this paper can be applied in
different scenarios:
• Application Development: Developers provide a
data schema, and our approach generates code tailored
to the chosen ONM.
• ONM Evaluation: Our approach facilitates comparing
different ONMs by generating custom code
for each from the same input schema.
• RDB to NoSQL Migration: We can generate a
NoSQL schema based on metadata from the relational
database and use it to generate code that
reads data from the RDB and writes it into MongoDB.
• Integration with Migration Tools: Through
adapters, our tool can use data schemas from
other tools to generate code for accessing migrated
data.
In a previous study, we introduced a framework
for migrating data from RDB to NoSQL (Kuszera
et al., 2019). Our framework offers a schema that out-
lines the data structure in the NoSQL database. Sub-
sequently, this schema is employed to generate com-
mands for data migration. In this paper we extend our
schema and use it as input to generate ONM code to
access the NoSQL database. The following sections
present an overview of the schema and the Object-
NoSQL mappers supported by the approach.
2.1 NoSQL Schema
We use a set of DAGs (Directed Acyclic Graphs) to
represent a NoSQL schema. Each DAG in the schema
represents a NoSQL entity and it has a tree struc-
ture. The root vertex represents the root document and
the remaining vertices represent the embedded docu-
ments. The edges inside a DAG define the type of
embedding relationship (document or array of docu-
ment). In this paper we extend our schema notation
to allow reference relationships between DAGs (entities).
Figure 1 shows the DAG elements and an example
of an entity.
Figure 1: DAG elements (left) and an example of a DAG
entity (right) composed of three vertices and two edges.
The vertices encapsulate the metadata of the respective
document, including its name, fields, data types,
and which fields are identifiers (primary keys) or
reference other fields (foreign keys). The edges encapsulate
the metadata of the relationship between two vertices
(documents), including the primary and foreign
keys, which document is on the "one" side and which is on
the "many" side of the relationship, and the type of
relationship (embedded document, embedded array or reference).
The edge direction shows how to embed the data to create
an entity. Figure 1 (right side) shows an entity composed
of three vertices, where the gray vertex is
the root document and the other vertices are the embedded
documents.
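To make the vertex and edge metadata more concrete, the following is a minimal sketch of how such a DAG could be held in memory. All type, field and method names (DagSchemaSketch, Vertex, Edge, embeddedIn, and so on) are hypothetical and are not taken from the MfCodeGenerator code base; the direction convention of an edge and the sample key names are likewise assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all type, field and method names are hypothetical
// and are not taken from the MfCodeGenerator code base.
class DagSchemaSketch {

    // Relationship types an edge can carry (Section 2.1).
    enum EdgeType { EMBEDDED_DOCUMENT, EMBEDDED_ARRAY, REFERENCE }

    // A vertex holds the metadata of one document.
    static final class Vertex {
        final String name;
        final String primaryKey;
        final List<String> fields;

        Vertex(String name, String primaryKey, List<String> fields) {
            this.name = name;
            this.primaryKey = primaryKey;
            this.fields = fields;
        }
    }

    // An edge holds the metadata of the relationship between two documents.
    // Assumption: source is the document being embedded, target receives it.
    static final class Edge {
        final Vertex source;
        final Vertex target;
        final Vertex oneSide;     // which end is on the "one" side
        final String foreignKey;  // field on the "many" side pointing at oneSide.primaryKey
        final EdgeType type;

        Edge(Vertex source, Vertex target, Vertex oneSide, String foreignKey, EdgeType type) {
            this.source = source;
            this.target = target;
            this.oneSide = oneSide;
            this.foreignKey = foreignKey;
            this.type = type;
        }
    }

    // One DAG per entity: a root vertex plus the edges that attach other documents.
    static final class Dag {
        final Vertex root;
        final List<Edge> edges = new ArrayList<>();

        Dag(Vertex root) {
            this.root = root;
        }

        // Names of the documents embedded directly into the given parent.
        List<String> embeddedIn(Vertex parent) {
            List<String> names = new ArrayList<>();
            for (Edge e : edges) {
                if (e.target == parent && e.type != EdgeType.REFERENCE) {
                    names.add(e.source.name);
                }
            }
            return names;
        }
    }

    // Sample entity: Orders as root with Orderlines embedded as an array.
    // Field names are sample values chosen for this sketch.
    static Dag ordersEntity() {
        Vertex orders = new Vertex("Orders", "id_order", List.of("id_order"));
        Vertex orderlines = new Vertex("Orderlines", "oline_id",
                List.of("oline_id", "order_id", "prod_id", "quantity", "oline_date"));
        Dag dag = new Dag(orders);
        dag.edges.add(new Edge(orderlines, orders, orders, "order_id", EdgeType.EMBEDDED_ARRAY));
        return dag;
    }

    public static void main(String[] args) {
        Dag dag = ordersEntity();
        System.out.println(dag.root.name + " embeds " + dag.embeddedIn(dag.root));
    }
}
```

Keeping the relationship type on the edge rather than on the vertex lets the same document appear embedded in one entity and as a root in another, which is the situation the schema notation has to describe.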
From the perspective of migrating from RDB
to NoSQL, developers can create a NoSQL schema
based on the metadata of the RDB. Figure 2 illustrates
the relational schema of a database storing data about
a DVD store. From this relational database we de-
fine the schema with three DAGs of Figure 3. For the
sake of simplicity, only the fields of type document
are shown, but the entities also have fields of simple
types that are suppressed. This schema can be lever-
aged to generate code to aid in data migration or to
access migrated data from another tool. In this step,
data access middlewares can be used.
Figure 2: Relational schema of the database used to illus-
trate the proposed approach.
Figure 3: NoSQL schema represented by a set of DAGs.
2.2 Object-NoSQL Mappers (ONMs)
ONMs offer advantages to applications, including de-
coupling from the database, enhancing portability,
and simplifying NoSQL access, thereby boosting pro-
ductivity. However, despite their benefits, encoding
with ONMs remains a laborious and error-prone task.
In this work we focus on Impetus Kundera,
DataNucleus and Spring Data. All of them provide
access to relational and non-relational databases, and
support for CRUD operations (Create, Retrieve, Update,
and Delete). They also support MongoDB,
the most widely used document-oriented NoSQL database.
DataNucleus implements the JPA (Java Persistence API)
and JDO (Java Data Objects) interfaces to access the
database, whereas Kundera implements only the JPA
interface. Both implement JPQL (Java Persistence
Query Language) to query the database. Spring Data
provides its own interface to access the database, through
object-mapping abstractions based on the repository concept
and annotations. Although all three ONMs
rely on annotations to configure data access,
they differ in configuration, format and the number
of annotations available.
3 MfCodeGenerator
In this section we present our approach, named
MfCodeGenerator⁶. Figure 4 shows the execution flow
of the approach.
MfCodeGenerator takes as input a DAG schema
and an ONM config, and generates ONM
code to access data in document databases. Its main
components are MfSchemaGenerator, MfCustomization
and CodeGenerator. In the first phase (1),
MfSchemaGenerator traverses the DAG schema and
converts it into an MfSchema, which is an abstraction
that represents the DAG entities and their relationships
⁶ https://github.com/evandrokuszera/metamorfose-code-generator
Figure 4: Execution flow of the MfCodeGenerator.
using a Java class notation. At this point, we have an
MfSchema that is agnostic of any ONM. In the second
phase (2), the generated MfSchema is passed to
MfCustomization, where it is enriched with new
annotations, fields and imports according to the target
ONM config. In the last phase (3), the CodeGenerator
saves the enriched MfSchema as Java classes on
disk, for developers to use in Java projects that operate
on the NoSQL database. The target NoSQL database
is MongoDB, but the tool can be extended to support new
ONMs and databases.
Figure 5: Class diagram of MfCodeGenerator.
Figure 5 depicts the class diagram of MfCode-
Generator, where MfSchema serves as the founda-
tion. It comprises one or more MfEntity objects, that
can be interconnected via MfRelationships. MfEntity
structures can be simple, composed of a single Class-
Metadata object, or hierarchical, comprising multiple
ClassMetadata objects. ClassMetadata encapsulates
entity metadata akin to Java class structure, encom-
passing fields, imports, and annotations represented
by ClassField, ClassImport, and Annotation classes.
Furthermore, ClassMetadata retains metadata regard-
ing identifier fields, relationship fields, and relation-
ship direction.
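The role of ClassMetadata can be pictured with a small sketch that stores imports, annotations and fields and renders them as Java source text. The class names ClassMetadata, ClassField and MfEntity follow Figure 5, but their members, the render method and the sample values below are assumptions made for illustration and do not reproduce the tool's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: members and method names are hypothetical.
class ClassModelSketch {

    // One field of a class to be generated, with its own annotations.
    static final class ClassField {
        final String type;
        final String name;
        final List<String> annotations = new ArrayList<>();

        ClassField(String type, String name) {
            this.type = type;
            this.name = name;
        }
    }

    // Metadata of one Java class to be generated.
    static final class ClassMetadata {
        final String name;
        final List<String> imports = new ArrayList<>();
        final List<String> annotations = new ArrayList<>();
        final List<ClassField> fields = new ArrayList<>();

        ClassMetadata(String name) {
            this.name = name;
        }

        // Renders the collected metadata as Java source text.
        String render(String packageName) {
            StringBuilder sb = new StringBuilder();
            sb.append("package ").append(packageName).append(";\n\n");
            for (String i : imports) {
                sb.append("import ").append(i).append(";\n");
            }
            if (!imports.isEmpty()) {
                sb.append('\n');
            }
            for (String a : annotations) {
                sb.append(a).append('\n');
            }
            sb.append("public class ").append(name).append(" {\n");
            for (ClassField f : fields) {
                for (String a : f.annotations) {
                    sb.append("    ").append(a).append('\n');
                }
                sb.append("    private ").append(f.type).append(' ').append(f.name).append(";\n");
            }
            sb.append("}\n");
            return sb.toString();
        }
    }

    // An entity: one root class plus zero or more nested classes.
    static final class MfEntity {
        final ClassMetadata root;
        final List<ClassMetadata> nested = new ArrayList<>();

        MfEntity(ClassMetadata root) {
            this.root = root;
        }
    }

    // Sample entity with a single root class; values are chosen for this sketch.
    static MfEntity customersExample() {
        ClassMetadata customers = new ClassMetadata("Customers");
        customers.imports.add("javax.persistence.Entity");
        customers.imports.add("javax.persistence.Id");
        customers.annotations.add("@Entity(name = \"Customers\")");
        ClassField id = new ClassField("Integer", "id_customer");
        id.annotations.add("@Id");
        customers.fields.add(id);
        return new MfEntity(customers);
    }

    public static void main(String[] args) {
        System.out.println(customersExample().root.render("model.customers"));
    }
}
```

Because annotations and imports are kept as plain data until rendering, an ONM-specific customization only has to add entries to these lists; nothing ONM-specific is needed in the rendering step.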
MfCustomization is an abstract class that provides
an extension point for adding customizations to MfSchema
and its respective MfEntity objects, according
to the target ONM.
Figure 6: Example of ONM code generated from a NoSQL
schema with two entities (Customers and Orders).
3.1 Code Generation Process
To illustrate the process of generating ONM code, let us
consider the scenario of Figure 6.
Figure 6a shows a NoSQL schema that is com-
posed of entities Customers and Orders. This schema
must be previously created by the developer accord-
ing to the application’s requirements (for example, its
access pattern).
Figure 6b shows the pseudocode of MfCodeGenerator.
Each DAG of the input schema is traversed
to generate MfEntity objects from its vertices and
edges. The information encapsulated in the DAG is
used to create the ClassMetadata objects, which denote the
structure of the entity. After that, these objects are
customized according to the target ONM (applyCustomizationTo)
and added to the list of MfSchema entities.
Finally, MfSchema itself is customized, which makes it
possible to establish relationships between the entities created
in the previous step.
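The two-level flow described above (customize each entity, then customize the schema as a whole) can be sketched as follows. This is not the pseudocode of Figure 6b: the types Dag, Entity, Schema, Customization and JpaStyle are simplified, hypothetical stand-ins, and the customization only records annotation names as strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: simplified, hypothetical stand-ins for the tool's types.
class GenerationFlowSketch {

    // Stand-in for a schema DAG: a root document and its embedded documents.
    static final class Dag {
        final String root;
        final List<String> embedded;

        Dag(String root, List<String> embedded) {
            this.root = root;
            this.embedded = embedded;
        }
    }

    // Stand-in for an entity: class name mapped to the annotations added so far.
    static final class Entity {
        final String rootClass;
        final Map<String, List<String>> annotations = new LinkedHashMap<>();

        Entity(String rootClass) {
            this.rootClass = rootClass;
            annotations.put(rootClass, new ArrayList<>());
        }
    }

    static final class Schema {
        final List<Entity> entities = new ArrayList<>();
    }

    // Extension point with an entity-level hook and a schema-level hook.
    interface Customization {
        void applyTo(Entity entity);
        void applyTo(Schema schema);
    }

    // Traverse every DAG, build and customize its entity, then customize the schema.
    static Schema generate(List<Dag> dags, Customization customization) {
        Schema schema = new Schema();
        for (Dag dag : dags) {
            Entity entity = new Entity(dag.root);
            for (String nested : dag.embedded) {
                entity.annotations.put(nested, new ArrayList<>());
            }
            customization.applyTo(entity);
            schema.entities.add(entity);
        }
        // Runs last, once all entities exist, so cross-entity links can be added here.
        customization.applyTo(schema);
        return schema;
    }

    // JPA-style customization: root classes get @Entity, nested classes @Embeddable.
    static final class JpaStyle implements Customization {
        @Override
        public void applyTo(Entity entity) {
            for (Map.Entry<String, List<String>> e : entity.annotations.entrySet()) {
                boolean isRoot = e.getKey().equals(entity.rootClass);
                e.getValue().add(isRoot ? "@Entity" : "@Embeddable");
            }
        }

        @Override
        public void applyTo(Schema schema) {
            // Relationships between entities would be wired here; omitted in this sketch.
        }
    }

    public static void main(String[] args) {
        Schema schema = generate(
                List.of(new Dag("Customers", List.of()), new Dag("Orders", List.of("Orderlines"))),
                new JpaStyle());
        for (Entity entity : schema.entities) {
            System.out.println(entity.annotations);
        }
    }
}
```

Running the schema-level hook only after every entity has been created is what allows a relationship to refer to both of its ends, regardless of the order in which the DAGs were traversed.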
Figure 6c shows an in-memory representation of
MfSchema, with the generated ONM code. At that
point, the developer can either inspect the schema or
save the code to disk.
The code snippet in Figure 7 shows how to
call MfCodeGenerator in a Java project to create
Impetus Kundera ONM code inside a package named
model. As output, two packages are generated, named
model.customer and model.orders, with their respective
classes denoting the entities Customers and
Orders.
public class Generator {
    public static void main(String[] args)
            throws FileNotFoundException {
        // Loading the NoSQL schema from disk
        NoSQLSchema schema = loadNosqlSchema("schema.json");
        // Schema generator for Kundera ONM and MongoDB
        MfSchemaGenerator schemaCodeGenerator = new
            MfDagSchemaGenerator(schema, new
                MfKunderaMongoCustomization());
        // Generating the schema using the package name model
        schemaCodeGenerator.generate("model");
        // Saving the generated Java classes into the project
        schemaCodeGenerator.saveFiles();
    }
}
Figure 7: Code snippet to execute MfCodeGenerator in
a Java project.
3.2 ONM Config
In our approach, an ONM config is a subclass
of MfCustomization, in which the developer can
provide implementations that enrich the generated
code with new annotations, imports and fields.
Figure 8 shows a snippet of ONM customization
code implemented for Impetus Kundera. This code
is called by MfSchemaGenerator to apply customiza-
tions to all ClassMetadata objects of the MfEntities
created in the previous stage. In the first method of the
sample code, all root classes of schema are annotated
with @Entity and all nested classes with @Embed-
dable. In the second method, classes are annotated
with @OneToMany and @ManyToOne to establish
relationships between entities. Besides Impetus Kundera,
we provide default implementations for Spring
Data and Data Nucleus (see repository), and the
developer can provide new implementations.
public class MfKunderaMongoCustomization
        extends MfClassCustomization {
    // Applying customizations for each entity
    public void applyCustomizationsTo(MfEntity en) {
        en.getRootClass().addImport("javax.persistence.Entity");
        en.getRootClass().addImport("javax.persistence.Id");
        en.getRootClass().addAnnotation("@Entity");
        en.getRootClass().getId().addAnnotation("@Id");
        en.getRootClass().addField("private", "int", "cod");
        // ...
        for (ClassMetadata nested : en.getNestedClasses()) {
            nested.addImport("javax.persistence.Embeddable");
            nested.addAnnotation("@Embeddable");
            // ...
        }
    }
    // Applying customizations for schema
    public void applyCustomizationsTo(MfSchema schema) {
        for (Relationship ref : schema.relationships) {
            ref.oneEntity.addImport("javax.persistence.OneToMany");
            ref.manyEntity.addImport("javax.persistence.ManyToOne");
            ref.oneEntity.addAnnotation("@OneToMany(mappedBy = ...)");
            ref.manyEntity.addAnnotation("@ManyToOne");
            // ...
        }
    }
}
Figure 8: Code snippet with example customizations for
the ONM Impetus Kundera.
4 EXPERIMENTS
To evaluate the MfCodeGenerator let us consider the
scenario in which there is a Java application that needs
to persist data in a MongoDB database. We chose
MongoDB because it is the most popular document-oriented
NoSQL database⁷. Rather than creating the
code to access the data manually, we will define a
NoSQL schema and provide it as input to the Mf-
CodeGenerator. Three ONMs are used in the experi-
ments: Spring Data, Data Nucleus and Impetus Kun-
dera.
Figure 9: NoSQL schema that represents how the data are
stored in the MongoDB.
Figure 9 illustrates the NoSQL schema employed
in the experiments. We derived this schema from the
relational schema depicted in Figure 2. For the sake of
simplicity, only the names of collections and embed-
ded documents are presented. The schema comprises
four collections: Customers, Orders, Products, and
Categories. There are two types of relationships be-
tween entities: references and nesting, e.g. Customers
and Orders have a reference relationship whereas Or-
ders and Orderlines have a nesting relationship. Cat-
egories appears twice in the schema, as a collection
and as an embedded object in Products collection.
We chose this NoSQL schema because it
presents different types of relationships (references
and nesting) and data redundancy (Categories), which
are interesting aspects for evaluating MfCodeGenerator.
However, due to the flexibility of NoSQL databases,
it is possible to structure the data in different ways.
4.1 ONM Code Generation
To carry out the evaluation we create a Java project
for each ONM with dependencies for MongoDB and
MfCodeGenerator. Then, we execute the MfCode-
Generator, providing as input parameters the NoSQL
schema, target ONM, and the base package name for
saving the generated classes within the project.
Figure 10 shows the list of generated Java classes
for the Data Nucleus. The number and name of
generated classes are the same for the other ONMs,
since the input NoSQL schema used is the same.
There are four code packages in which each package
⁷ https://db-engines.com/en/ranking
Figure 10: Generated ONM code for data access.
stores one entity of the schema. An entity is composed
of one root class, which represents the root document
of the collection, and it may have other classes
that represent nested documents. For example, pack-
age mf.model.nosql.categories stores a single class to
represent the document structure of Categories col-
lection and package mf.model.nosql.products stores
three classes to represent the document structure of
collection Products, where Products is the root class
and Categories and Inventory classes are the nested
documents.
MfCodeGenerator generates code that conforms
to the NoSQL schema, adding attributes to estab-
lish relationships between classes. Figures 11, 12
and 13 present the code generated by MfCodeGener-
ator, highlighting the ONM annotations added in the
classes. For the sake of simplicity, we only show iden-
tifier fields and relationship fields, suppressing other
fields and getter and setter methods. The Categories
entity is not presented because the code is similar to
Customers and it does not bring new aspects to high-
light.
// SPRING DATA
@Document(collection = "Customers")
public class Customers {
    @Id
    private String id;
    private Integer id_customer;
    @ReadOnlyProperty
    @DocumentReference(lookup =
        "{'customer_id':?#{#self.id_customer}}")
    private List<Orders> orders;
}

// DATA NUCLEUS and IMPETUS KUNDERA
@Entity(name = "Customers")
public class Customers {
    @Id
    private Integer id_customer;
    @OneToMany(mappedBy = "customer_id")
    private List<Orders> orders;
}
Figure 11: Customers code generated for all ONMs.
4.2 ONM Code Customization
It is worth noting the use of the @AttributeOverride
annotation in the Products class for Data Nucleus (Figure
12). The annotation defines the attribute names of the
// DATA NUCLEUS
@Entity(name = "Products")
public class Products {
    @Id
    private Integer id_prod;
    @Embedded
    @AttributeOverride(name = "prod_id",
        column = @Column(name = "prod_id"))
    @AttributeOverride(name = "quan_in_stock",
        column = @Column(name = "quan_in_stock"))
    @AttributeOverride(name = "sales",
        column = @Column(name = "sales"))
    private Inventory inventory;
    @Embedded
    @AttributeOverride(name = "id_category",
        column = @Column(name = "id_category"))
    @AttributeOverride(name = "category_name",
        column = @Column(name = "category_name"))
    private Categories cats;
    @OneToMany(mappedBy = "prod_id")
    private List<Orderlines> olines;
}

// IMPETUS KUNDERA
@Entity(name = "Products")
public class Products {
    @Id
    private Integer id_prod;
    @Embedded
    private Inventory inventory;
    @Embedded
    private Categories cats;
    private List<Orderlines> olines;
}

// DATA NUCLEUS and IMPETUS KUNDERA
@Embeddable
public class Categories {
    Integer id_category;
    String category_name;
}
@Embeddable
public class Inventory {
    Integer prod_id;
    Integer quan_in_stock;
    Integer sales;
}

// SPRING DATA
@Document(collection = "Products")
public class Products {
    @Id
    private String id;
    private Integer id_prod;
    private Inventory inventory;
    private Categories cats;
    private List<Orderlines> olines;
}
public class Categories {
    Integer id_category;
    String category_name;
}
public class Inventory {
    Integer prod_id;
    Integer quan_in_stock;
    Integer sales;
}
Figure 12: Products, Categories and Inventory code gener-
ated for all ONMs. Getters and setters are omitted.
// DATA NUCLEUS
@Entity(name = "Orders")
public class Orders {
    @Id
    private Integer id_order;
    @Embedded
    private List<Orderlines> olines;
    @ManyToOne
    private Customers custom;
}
@Embeddable
public class Orderlines {
    private Integer oline_id;
    private Integer order_id;
    private Integer prod_id;
    private Integer quantity;
    private Date oline_date;
    @ManyToOne
    private Products products;
}

// IMPETUS KUNDERA
@Entity(name = "Orders")
public class Orders {
    @Id
    private Integer id_order;
    @ElementCollection
    private List<Orderlines> olines;
    @ManyToOne
    @JoinColumn(name = "customer_id")
    private Customers custom;
}
@Embeddable
public class Orderlines {
    private Integer oline_id;
    private Integer order_id;
    private Integer prod_id;
    private Integer quantity;
    private Date oline_date;
    private Products products;
}

// SPRING DATA
@Document(collection = "Orders")
public class Orders {
    @Id
    private String id;
    private Integer id_order;
    private List<Orderlines> olines;
    @DocumentReference(lookup =
        "{'id_customer':?#{customer_id}}")
    private Customers custom;
}
public class Orderlines {
    private Integer oline_id;
    private Integer order_id;
    private Integer prod_id;
    private Integer quantity;
    private Date oline_date;
    @DocumentReference(lookup =
        "{'id_prod':?#{prod_id}}")
    private Products products;
}
Figure 13: Orders and Orderlines code generated for all
ONMs.
Inventory and Categories classes; otherwise the ONM
uses a different strategy for naming the data fields in
MongoDB, which prevents the data from being loaded from
the database. Annotations such as @AttributeOverride
can be configured in MfCodeGenerator for a specific
ONM.
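As a rough illustration of how such annotations could be derived mechanically, the sketch below builds one @AttributeOverride line per field of an embedded class, mapping each attribute to a column of the same name. The class AttributeOverrideSketch and the method overridesFor are hypothetical helpers written for this example; they are not part of MfCodeGenerator, and the field names are sample values.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a hypothetical helper, not part of MfCodeGenerator.
class AttributeOverrideSketch {

    // Builds one @AttributeOverride annotation per embedded field, keeping the
    // stored field name identical to the attribute name.
    static List<String> overridesFor(List<String> embeddedFields) {
        List<String> annotations = new ArrayList<>();
        for (String field : embeddedFields) {
            annotations.add("@AttributeOverride(name = \"" + field
                    + "\", column = @Column(name = \"" + field + "\"))");
        }
        return annotations;
    }

    public static void main(String[] args) {
        // Sample field names for an embedded Inventory class.
        for (String line : overridesFor(List.of("prod_id", "quan_in_stock", "sales"))) {
            System.out.println(line);
        }
    }
}
```

Generating the overrides from the embedded class's own field list keeps the annotation names and the field names in step, which is the property the naming strategy depends on.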
The embedded relationships were all generated by
MfCodeGenerator according to the schema. How-
ever, it was not possible to generate all reference-type
relationships. Table 1 shows the list of reference-type
relationships supported by each ONM. We use a no-
tation entity-direction-entity to represent the relation-
ships.
Table 1: List of reference-type relationships defined in the
NoSQL schema supported by ONM and MfCodeGenerator.
Relationship                    Spring   Impetus   Nucleus
Customers → Orders              Yes      Yes       Yes
Orders → Customers              Yes      Yes       Yes
Orders.Orderlines → Products    Yes      No        Yes
Products → Orderlines           No       No        No
The relationship between Products and Orderlines
is generated only in the Orders.Orderlines →
Products direction, and only for Spring Data and Data
Nucleus. In the opposite direction, the relationship
Products → Orders.Orderlines does not work for any
ONM. For Impetus Kundera, reference-type relationships
can be established only between classes
annotated with @Entity (see the ONM documentation).
Orderlines is not a root entity, so it carries no @Entity
annotation and its documents cannot be loaded
from the database directly.
It is important to note that the generated code
conforms to the NoSQL schema and annotations are
added to the code only if they are supported by the
target ONM.
4.3 Evaluation of ONM Code
To evaluate the generated code we seek to answer two
questions: (Q1) Is the generated code able to create,
retrieve, update and delete documents in MongoDB?
(Q2) Do the generated code and ONM support the
relationships from the NoSQL schema?
4.3.1 Answer to Q1
To answer the first question, we perform CRUD op-
erations using the generated code for all the ONMs.
The main objective is to check whether the generated code can
write entities into MongoDB and then read the data
into memory. For Data Nucleus, Impetus Kundera
and Spring Data it was possible to create an entity,
read, update and persist it again in MongoDB. This
test did not include related data from other entities.
4.3.2 Answer to Q2
For the second question, we read one document from each
of the Categories, Customers, Orders and Products collections
and checked whether, after reading the root entity, the related
data could also be loaded. For all the ONMs, the embedded
entities were loaded automatically from
MongoDB without problems and, for reference-type
relationships, all related root entities were also loaded
automatically. However, relationships that involve non-root
entities work only for Spring Data and Data Nucleus,
and only in the direction from embedded entity to root entity,
as shown in Table 1.
We repeated the above two tests with a larger number
of documents (Customers: 10k, Orders: 20k, Orderlines:
40k, Products: 10k and Categories: 20). The
results were the same: the data could be written
to MongoDB and then read back correctly.
4.4 Discussion of the Results
MfCodeGenerator automates the generation of code
for NoSQL database access following the input
NoSQL schema. However, it is necessary to consider
the ONM limitations, such as support for certain types
of relationships and the supported depth of nesting.
MfCodeGenerator offers a validation method to
check if the NoSQL schema aligns with the capabil-
ities of the target ONM, notifying the developer of
any unsupported relationships. For example, when
using Impetus Kundera, MfCodeGenerator flags re-
lationships between entities lacking the @Entity an-
notation. Subsequently, developers can choose to ad-
just the schema or generate the code and implement
required modifications accordingly.
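A validation step of this kind can be sketched as a rule applied to every reference-type relationship in the schema. The example below encodes the restriction described for Impetus Kundera (both ends of a reference must be root classes) and applies it to the four relationships of Table 1. The names SchemaValidationSketch, Reference and validateRootOnly are hypothetical, and the tool's actual validation method may be organized differently.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names and structure are hypothetical.
class SchemaValidationSketch {

    // A reference-type relationship and whether each end is a root entity.
    static final class Reference {
        final String from;
        final String to;
        final boolean fromIsRoot;
        final boolean toIsRoot;

        Reference(String from, boolean fromIsRoot, String to, boolean toIsRoot) {
            this.from = from;
            this.fromIsRoot = fromIsRoot;
            this.to = to;
            this.toIsRoot = toIsRoot;
        }

        String label() {
            return from + " -> " + to;
        }
    }

    // Rule mirroring the restriction described for Impetus Kundera:
    // a reference is accepted only when both ends are root (@Entity) classes.
    static List<String> validateRootOnly(List<Reference> references) {
        List<String> warnings = new ArrayList<>();
        for (Reference r : references) {
            if (!r.fromIsRoot || !r.toIsRoot) {
                warnings.add("Unsupported reference: " + r.label());
            }
        }
        return warnings;
    }

    // The four reference-type relationships listed in Table 1.
    static List<Reference> tableOneReferences() {
        return List.of(
                new Reference("Customers", true, "Orders", true),
                new Reference("Orders", true, "Customers", true),
                new Reference("Orders.Orderlines", false, "Products", true),
                new Reference("Products", true, "Orders.Orderlines", false));
    }

    public static void main(String[] args) {
        for (String warning : validateRootOnly(tableOneReferences())) {
            System.out.println(warning);
        }
    }
}
```

Returning warnings instead of throwing leaves the decision to the developer, matching the two options described above: adjust the schema, or generate the code anyway and modify it afterwards.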
In future work, we aim to broaden the tool’s eval-
uation scope by incorporating complex schemas and
diverse NoSQL databases and ONMs.
5 RELATED WORK
There are works that evaluate and compare the ONMs
in terms of features (CRUD operations and query sup-
port) and the introduced overhead to access the data
against the native APIs (Störl et al., 2015; Reniers
et al., 2017). In (Reniers et al., 2019) the authors
presented a survey with a comprehensive comparison
of eleven ONMs, in terms of programming language,
database, query interface and mapping strategy support.
In contrast, in this work we propose an approach
for the automatic generation of ONM code, which
can be used to assist in the evaluation of ONMs.
The work presented in (Chillón et al., 2019)
introduces a solution for generating ONM code from a
NoSQL schema extracted from an existing database.
While the solution supports Morphia and Mongoose,
it does not address the limitations of ONMs, such
as differences in the supported document embedding
levels and in the annotations allowed for relationship
types, like @OneToOne and @OneToMany. Additionally,
it does not discuss the variations across ONMs in the
persisted data format used to represent reference
relationships, which must be considered when reading
data from an existing database.
Our solution is similar to (Chillón et al., 2019) in
that we also generate ONM code from a NoSQL
schema definition. However, MfCodeGenerator focuses
on Java-based ONMs and supports Impetus Kundera,
Data Nucleus and Spring Data. Furthermore, our
approach allows customizing the code generation
process and adding validations to it, alerting
developers when a schema feature cannot be expressed
in the generated code.
MongoDB introduced the Relational Migrator tool⁸,
which enables the creation of a NoSQL schema from
an existing RDB and the subsequent data migration.
Additionally, the tool generates ONM code for Java
(Spring Data). However, manual modifications are
still required for the generated code before integration
into the application, and reference-type relationships
between classes are not generated. In contrast, our
approach supports a broader range of ONMs and can be
integrated with the MongoDB tool to generate code for
other ONMs.
6 CONCLUSION
This paper introduced MfCodeGenerator, an approach
to automatically generate ONM code from an
input NoSQL schema and a set of customization rules
for the target ONM. As a result, a set of Java classes
is generated, enriched with the code needed to read
and write objects in the NoSQL database.
MfCodeGenerator generates code for Spring Data,
Data Nucleus and Impetus Kundera ONMs, but it is
worth noting that the approach is platform indepen-
dent and can be extended to other ONMs and lan-
guages. Experimental results demonstrate that the
tool generates the Java classes correctly, with imports,
annotations, fields, methods and relationships, adher-
ing to the specified NoSQL schema. The generated
code was evaluated by performing CRUD operations
on MongoDB.
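To make the shape of the generated classes concrete, the following is
a minimal Java sketch of emitting one entity class for Spring Data
MongoDB. It is an illustration under stated assumptions rather than
the tool's actual generator: EntityCodeGenSketch, Field and Entity are
hypothetical names, and only the emitted annotations (@Document, @Id,
@DBRef) are real Spring Data identifiers.

```java
import java.util.List;

/** Hypothetical sketch: emits Java source for one schema entity, targeting Spring Data MongoDB. */
class EntityCodeGenSketch {

    /** A field of a schema entity; 'reference' marks a reference-type relationship. */
    record Field(String type, String name, boolean id, boolean reference) {}

    /** A root entity of the NoSQL schema, stored in its own collection. */
    record Entity(String name, String collection, List<Field> fields) {}

    /** Builds the class source: imports, annotations, fields and getters. */
    static String generate(Entity e) {
        StringBuilder sb = new StringBuilder();
        sb.append("import org.springframework.data.annotation.Id;\n");
        // Import @DBRef only when at least one field is a reference.
        if (e.fields().stream().anyMatch(Field::reference)) {
            sb.append("import org.springframework.data.mongodb.core.mapping.DBRef;\n");
        }
        sb.append("import org.springframework.data.mongodb.core.mapping.Document;\n\n");
        sb.append("@Document(collection = \"").append(e.collection()).append("\")\n");
        sb.append("public class ").append(e.name()).append(" {\n");
        for (Field f : e.fields()) {
            if (f.id()) sb.append("    @Id\n");
            if (f.reference()) sb.append("    @DBRef\n");
            sb.append("    private ").append(f.type()).append(' ')
              .append(f.name()).append(";\n");
        }
        for (Field f : e.fields()) {
            // Capitalize the field name to form the getter name.
            String cap = Character.toUpperCase(f.name().charAt(0)) + f.name().substring(1);
            sb.append("\n    public ").append(f.type()).append(" get").append(cap)
              .append("() { return ").append(f.name()).append("; }\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative entity; names and types are assumptions.
        Entity order = new Entity("Order", "orders", List.of(
            new Field("String", "id", true, false),
            new Field("Customer", "customer", false, true)));
        System.out.println(generate(order));
    }
}
```

Supporting another ONM would, in this sketch, amount to swapping the
emitted imports and annotations, which is where ONM-specific
customization rules would apply.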
⁸ https://www.mongodb.com/products/relational-migrator
Not all features specified in the NoSQL schema
could be translated directly into the generated code,
mainly due to constraints imposed by the target ONM,
such as limitations on supported relationship types
and data storage formats required by MongoDB. To
address this, MfCodeGenerator issues warnings to de-
velopers, indicating whether the selected ONM fully
supports the provided NoSQL schema.
As future work, we intend to add support for
new ONMs and NoSQL databases and to improve the
mechanism for defining the rules that customize the
generated code, which is currently embedded in the
MfCodeGenerator source code. Furthermore, we aim to
expand the evaluation of the tool to encompass more
complex scenarios.
REFERENCES
Chillón, A. H., Ruiz, D. S., Molina, J. G., and Morales, S. F.
(2019). A model-driven approach to generate schemas
for object-document mappers. IEEE Access,
7:59126–59142.
Kuszera, E. M., Peres, L. M., and Fabro, M. D. D.
(2019). Toward rdb to nosql: Transforming data
with metamorfose framework. In Proceedings of the
34th ACM/SIGAPP Symposium on Applied Computing,
pages 456–463, New York, NY, USA.
O’Neil, E. J. (2008). Object/relational mapping 2008:
Hibernate and the entity data model (edm). In
Proceedings of the 2008 ACM SIGMOD, pages 1351–1356,
New York, NY, USA.
Rafique, A., Landuyt, D. V., Lagaisse, B., and Joosen, W.
(2018). On the performance impact of data access
middleware for nosql data stores A study of the trade-
off between performance and migration cost. IEEE
Trans. Cloud Comput., 6(3):843–856.
Reniers, V., Landuyt, D. V., Rafique, A., and Joosen, W.
(2019). Object to nosql database mappers (ONDM):
A systematic survey and comparison of frameworks.
Inf. Syst., 85:1–20.
Reniers, V., Rafique, A., Landuyt, D. V., and Joosen, W.
(2017). Object-nosql database mappers: a benchmark
study on the performance overhead. J. Internet Serv.
Appl., 8(1):1:1–1:16.
Sadalage, P. J. and Fowler, M. (2012). NoSQL Distilled: A
Brief Guide to the Emerging World of Polyglot Persis-
tence. Addison-Wesley Professional, 1st edition.
Stonebraker, M., Madden, S., Abadi, D. J., Harizopoulos,
S., Hachem, N., and Helland, P. (2007). The end of an
architectural era (it’s time for a complete rewrite). In
Proceedings of the 33rd VLDB, University of Vienna,
Austria, 2007, pages 1150–1160.
Störl, U., Hauf, T., Klettke, M., and Scherzinger, S. (2015).
Schemaless nosql data stores - object-nosql mappers
to the rescue? In Datenbanksysteme für Business,
Technologie und Web (BTW), volume P-241 of LNI,
pages 579–599. GI.