A Framework Model for Supporting Transparent Polyglot Persistence

with a Uniﬁed API and Extensible for Different Database Types

Fernando de Oliveira Pereira

1,3 a

, Eduardo Martins Guerra

2 b

and Reinaldo Roberto Rosa

1 c

National Institute for Space Research (INPE), S

ao Jos

e dos Campos/SP, Brazil

Free University of Bozen-Bolzano - UNIBZ, Bozen-Bolzano, Italy

National Centre for Monitoring and Early Warning of Natural Disasters (CEMADEN), S

ao Jos

e dos Campos/SP, Brazil

Keywords:

Polyglot Persistence, Information System Integration.

Abstract:

This work introduces the Transparent Polyglot Persistence Framework Model (TPPFM) for supporting poly-

glot persistence through a uniﬁed API for extension. The framework employs the Esﬁnge Query Builder as its

basis, restructuring it to provide polyglot functionality in alignment with the proposed framework model. A

real database case study is conducted to demonstrate the viability of the proposed framework and its reference

implementation. The ease of implementation for the developer and the transparency concerning the utilization

of several databases within the same domain model are demonstrated.

1 INTRODUCTION

Today, applications must handle diverse datasets from

the same domain (Khine and Wang, 2019; Eisenhuth

and Jablonski, 2022). For example, in e-Commerce,

multiple structures, including relational databases

for ﬁnancial data and inventory, document-oriented

databases for product catalogs, graphs for consumer

recommendations, key-value pairs for shopping cart

items, and columnar databases for activity logs, can

coexist. Different data manipulation paradigms exist

for each format.

Each storage format works well in speciﬁc sce-

narios. Some are better for short, frequent reads,

while others favor efﬁcient records. Using each type’s

strengths in their ideal situations makes sense. How-

ever, connecting different data types within an ap-

plication is challenging. Thus, modern applications

may require more than classic relational databases for

functional and non-functional requirements (Wiese,

2015; de Ara

ujo et al., 2016).

Polyglot persistence is a strategy that utilizes mul-

tiple database systems within a single application do-

main to handle diverse datasets. This approach opti-

mizes the performance of various components of an

application by capitalizing on the unique capabilities

https://orcid.org/0009-0006-2360-1281

https://orcid.org/0009-0006-2894-9076

https://orcid.org/0000-0002-2962-4322

of their respective database technologies. Using mul-

tiple database technologies in a single application re-

quires careful consideration of the most suitable per-

sistence model for each speciﬁc purpose. This strate-

gic approach allows ﬁrms to effectively utilize the

most appropriate technologies for their speciﬁc re-

quirements, such as SQL databases for structured data

or NoSQL databases for more adaptable or extensive

data storage (Schaarschmidt et al., 2015; Villac¸a et al.,

2018).

However, implementing polyglot persistence

presents signiﬁcant questions, particularly in accu-

rately linking data across multiple storage formats

within a single application (Wiese, 2015; Lajam and

Mohammed, 2022). Polyglot persistence presents

challenges in terms of integration and data manage-

ment. Proﬁciency in multiple database paradigms and

APIs is necessary to access and modify data stored

in different formats, raising concerns about interoper-

ability, referential integrity, and security across multi-

ple storage systems (Srivastava and Shekokar, 2016).

These issues have prompted the proposal of var-

ious solutions, including uniﬁed query languages,

middlewares, and multi-model databases (Jim

enez-

Peris et al., 2016) (Schaarschmidt et al., 2015) (Lu

et al., 2018). These solutions are examined in section

2 from several developments. These developments

aim to provide the use of several database systems

within a single application. However, developers of-

ten face architectural inefﬁciencies and an increased

Pereira, F. O., Guerra, E. M. and Rosa, R. R.

A Framework Model for Supporting Transparent Polyglot Persistence with a Uniﬁed API and Extensible for Different Database Types.

DOI: 10.5220/0013369300003929

In Proceedings of the 27th International Conference on Enterprise Information Systems (ICEIS 2025) - Volume 1, pages 109-120

ISBN: 978-989-758-749-8; ISSN: 2184-4992

109

development of maintenance. By implementing more

cohesive and clear data access methods, these ad-

vancements have the potential to result in software

solutions that are more resilient and adaptable, highly

suitable for contemporary applications (Villac¸a et al.,

2018).

There is an increasing need for tools and frame-

works that improve the robustness of polyglot per-

sistence and reduce developer workload. No exist-

ing work offers an extendable approach that employs

a uniﬁed extension API for several database types,

remains transparent to the developer, and leverages

the special characteristics of each API without mix-

ing them.

The present research introduces the Transpar-

ent Polyglot Persistence Framework Model, abbre-

viated as TPPFM. This is a conceptual framework

that delineates the approach for establishing polyglot

persistence via a uniﬁed and declarative metadata-

driven API. The model’s contributions include deﬁn-

ing an architecture that facilitates a cohesive ex-

tension API and orchestrates polyglot persistence

through database-speciﬁc libraries, all while remain-

ing transparent to the developer.

A reference implementation of the model was cre-

ated in a framework called Esﬁnge Query Builder

(Guerra, 2014, p. 12), which provides support for im-

plementing a transparent persistence layer for classes

in the same domain model that should be persisted in

different types of databases. The framework has an

extensible structure that allows the addition of new

drivers that can support polyglot persistence for dif-

ferent types of databases. A set of tests demonstrated

that the solution worked in four databases of different

types, namely PostgreSQL, MongoDB, Neo4J, and

Cassandra. As a result, the data could persisted in

any combination of them by changing only the meta-

data conﬁguration. Moreover, the framework was in-

tegrated into a virtual lab platform and used to com-

bine data from a relational database of alerts provided

by CEMADEN with data in JSON format provided

by IBGE stored in a MongoDB. This implementation

was used to evaluate the applicability of the solution

for a real-practical case. Modularity analysis was per-

formed on the classes in the evaluation to investigate

the coupling of the classes with the APIs of the spe-

ciﬁc databases.

The work is organized as follows: Section 2 shows

related works, Section 3 offers a succinct overview of

the key concepts pertinent to this study, Section 4 con-

ceptually delineates the PPFM, Section 5 illustrates

and explains the model’s implementation, Section 6

introduces the case study, Section 7 analyzes the re-

sults, and Section 8 provides the conclusions.

2 RELATED WORKS

The studies by (Prasad and Avinash, 2014) and (Kaur

and Rani, 2015) delineate systems exhibiting poly-

glot characteristics, where data manipulation occurs

in databases of different types. These works feature

a combination of data stored in relational and non-

relational databases, approached architecturally such

that these data are managed separately in the service

layer, meaning the databases are accessed in isolation

without any abstraction for their correlation.

Another approach is presented in the study by

(Schaarschmidt et al., 2015), which delineates the es-

tablishment of a mediating component between the

service layer and the persistence layer. The term

polyglot persistence mediator (PPM) is deﬁned in the

work in question. It functions as a middleware, serv-

ing as a conduit between the two layers and directing

data to different query mechanisms. The work dis-

cusses polyglot persistence, but does not establish a

correlation between different types of databases; in-

stead, it focuses on a decoupled routing mechanism

among different types of database based on the SLA

rating, one at a time.

The study CloudMdsQL (Kolev et al., 2016) in-

vestigates the creation of a universal language for

querying different types of databases. The authors

propose that the use of a cohesive language serves as

a strategy for polyglot persistence. The operational

procedure entails employing translation components

to transform one query language into another, subse-

quently routing them to the relevant databases. These

studies are in their initial stages and have yielded only

a restricted number of prototypes for speciﬁc database

types. The recent study by (El Ahdab et al., 2024)

presents a signiﬁcant advancement and draws a com-

parison with CloudMdsQL, maintaining a similar ap-

proach utilizing a transformer coupled with multiple

APIs.

Apache Drill (Givre and Rogers, 2018) possesses

the characteristics of a Database Management Sys-

tem (DBMS), but is recognized as a query layer

that allows simultaneous access to multiple database

types. It functions as a tool, enabling polyglot queries

through the integration of different databases using

SQL language. Apache Drill lacks a high-level API

or a designated ORM. Its philosophy is to conceptu-

alize every type of storage as a table, regardless of its

actual nature, by executing queries on the raw data.

Thus, it operates not as an abstract framework but as

a practical tool for concrete implementations.

The research carried out by (Lu et al., 2018) de-

lineates a multi-model database known as the uni-

ﬁed database management system (UDMS), provid-

ICEIS 2025 - 27th International Conference on Enterprise Information Systems

110

ing guidance for the development of an integrated sys-

tem for querying and storing data in a multi-model

framework, including relational, document, colum-

nar, graph, and JSON formats, among others, within

a singular management system.

Recent studies by (Holubov

a et al., 2021) and

(Van Landuyt et al., 2023) illustrate multimodel

database solutions and their operational challenges,

together with comparisons with separate database ap-

proaches, concluding that the optimal choice largely

depends on the type of application.

From the related works above, two horizons be-

come evident. First, when different databases are ac-

cessed simultaneously in the same application, but

separately, directly from the service layer and later

combined. Second, when the data are already com-

bined from different databases in the persistence

layer. Three approaches are identiﬁed as exempliﬁed

in the aforementioned work and according to the re-

view article by (Khine and Wang, 2019). They are as

follows: Applied to the Domain; Uniﬁed Query Lan-

guage; Framework/Middleware or Multimodel.

In light of these horizons and approaches, there

are no existing works that offer solutions to estab-

lish an API characterized by low coupling, unifor-

mity, and transparency for the developer to access and

correlate different types of databases. The Transpar-

ent Polyglot Persistence Framework Model (TPPFM)

is thus proposed.

3 BACKGROUND

The purpose of this section is to provide a concise

overview of the major concepts that are relevant to

understanding this work. This is an overview of issues

that are directly connected to the topic’s matter, and

these are included in the proposal for the TPPFM.

3.1 Polyglot Persistence Patterns

A pattern is a way of documenting the experience

by capturing successful solutions to recurring prob-

lems. Currently, standards are carefully documented

in many ﬁelds of computing and in several areas of

academia and industry. For the documentation of a

pattern, the context of the problem, the forces act-

ing on the problem solver, the logic, and the con-

text that results when applying the solution must be

recorded (Rising, 1998).

Regarding patterns of polyglot persistence, the

patterns identiﬁed in the work of Pereira et al., 2023

stand out (Pereira et al., 2023b). In the most ba-

sic identiﬁed pattern named ”Independent DAO,” the

client service accesses several independent persis-

tence modules, and the manipulation for data persis-

tence is carried out at the business layer. In the named

pattern ”Integrated Polyglot DAO,” data manipulation

is carried out in an integrated module that combines

various APIs. In the pattern named “DAO Compound

Mediator”, applied in this work, the APIs are inde-

pendent and all persistence manipulation is abstracted

and transparent to the client module. See Figure 1.

Figure 1: Polyglot persistence pattern. Adapted from

(Pereira et al., 2023b).

3.2 Metadata-Based Frameworks

In software engineering, a framework is a fundamen-

tal structure consisting of generic code that offers sup-

port and standard functionalities to create speciﬁc ap-

plications. A framework’s architecture is often event-

driven, allowing developers to insert their code within

extension points provided by the framework, often re-

ferred to as ”hooks” or ”customization points”. This

differs from traditional libraries where the developer

calls functions as needed. In a framework, overall

control of the ﬂow of execution is often inverted,

which is known as the ”Inversion of Control” (IoC)

principle or ”Hollywood Principle” (”don’t call us,

we’ll call you”) (Larman, 2012).

Metadata-based frameworks can be deﬁned as

frameworks that utilize metadata to affect the soft-

ware’s behavior. This metadata can be described us-

ing XML conﬁguration ﬁles, annotations, or other

kinds of description. The primary concept is that,

rather than explicitly customizing or altering the

source code, the developer provides information

A Framework Model for Supporting Transparent Polyglot Persistence with a Uniﬁed API and Extensible for Different Database Types

111

that the framework utilizes to determine the sys-

tem’s behavior (Lima et al., 2023). Some exam-

ples of metadata-based frameworks include Spring-

Data (Java), EsﬁngeQueryBuilder (Java), and Entity

Framework (C#/.NET).

Regarding the resources used in metadata-based

frameworks, an example in the Java programming

language is the concept of annotation. Annotations

are a programming tool that allows you to associate

metadata with speciﬁc elements of source code, such

as classes, methods, or attributes. The compiler can

use annotations to incorporate data during the compi-

lation process, or a framework can utilize them during

execution (Lima et al., 2023).

Annotations allow for more detailed information

about the behavior of source code elements with-

out directly modifying them. Furthermore, they im-

prove the architecture of the code by reducing dupli-

cations, centralizing the logic in one place, and auto-

matically implementing it throughout multiple parts

(Lima et al., 2023).

3.3 Framework Models

Framework models refer to a conceptual structure that

organizes and guides the development of solutions

within a given domain. Abstractly, the framework

model embodies a theoretical structure that dictates

the interaction of software components, the adherence

to standards, and the structuring of solutions to en-

hance efﬁciency, reuse, and consistency in develop-

ment. In this sense, the framework model establishes

the bridge between theory and practice.

The framework model encapsulates a standard ref-

erence architecture that speciﬁes the organization and

interaction of several components of the system. It

is not a concrete implementation, but a set of prin-

ciples that guides solution design within the speciﬁc

domain. Moreover, the framework model needs to im-

part practical knowledge about a speciﬁc domain and

codify it in a way that allows its reuse in any con-

text (De Souza et al., 2022).

4 TRANSPARENT POLYGLOT

PERSISTENCE FRAMEWORK

MODEL (TPPFM)

The concepts of separation of concerns, single re-

sponsibility, and dependency inversion provide devel-

opers with guidance on structuring modular, scalable,

and adaptable code. Taking into account those de-

sign principles, this section describes the proposed

TPPFM in this work.

4.1 Main Structure

Figure 2: TPPFM - Main strucuture.

The TPPFM has the structure shown in Figure 2

and is inspired by the DAO CompoundMediator poly-

glot persistence pattern, which can be seen in Figure

1. The service object of the business layer is man-

aged by a dynamic proxy that contains an instance of

QueryExecutor, tasked with executing queries in the

persistence layer, and MethodParser, which interprets

the method names to convert them into queries. In

the polyglot scenario, the persistence layer contains

the MediatorQueryExecutor, which encapsulates dis-

tinct instances of QueryExecutor for several types of

databases.

The proposed framework model assigns the re-

sponsibility of aggregating polyglot queries to Me-

diatorQueryExecutor. Starting with a uniﬁed lan-

guage, speciﬁcally ORM (object relation mapping),

ODM (object document mapping), other Object Map-

pings, and Internal DSL (Mangal, 2024), while as-

signing the particularities of each database to des-

ignated QueryExecutor’s, denoted in the model as

PriExecutor and SecExecutor instances.

4.2 Dynamic Behavior

Figure 3 provides a detailed description of the primary

ﬂow described by the TPPFM. A dynamic proxy is

used to pass an instance of the service across, which

enables the methods of the service to be invoked in a

natural manner without requiring any modiﬁcations

to the business layer. With regard to QueryExecu-

tor and MethodParser, it is the responsibility of the

proxy to generate instances of the various implemen-

tations that are dynamically accessible. During the

ICEIS 2025 - 27th International Conference on Enterprise Information Systems

112

Figure 3: TPPFM - Main ﬂow behavior.

process of invoking a method, the MediatorQueryEx-

ecutor not only delegates the execution of the method

to the available implementations, but also the aggre-

gation. Then it returns the polyglot query result to the

requesting service. The aggregation originates from

the primary entity, for whom the dynamic proxy was

created, and subsequently extends to the secondary

entity, which is the one being associated.

5 REFERENCE

IMPLEMENTATION

The purpose of the reference implementation is to

demonstrate the feasibility of TPPFM. This work uti-

lizes the Esﬁnge Query Builder as a reference imple-

mentation. The Esﬁnge Query Builder is currently

in development, initially described in (Guerra, 2014),

and provides a fast and efﬁcient means to create a per-

sistent layer.

5.1 About Esﬁnge Query Builder

The Esﬁnge Query Builder is a framework that accel-

erates the development of persistence layers by utiliz-

ing method names to deduce database queries. The

primary advantage lies in the simplicity of the inter-

face, as it speciﬁes methods in alignment with the

query parameters, enabling automatic query genera-

tion at runtime (GUERRA et al., 2017).

The framework operates independently of persis-

tent technology and currently includes extensions for

JDBC, JPA, Cassandra, MongoDB, and Neo4J. In ad-

dition to fundamental searches, it facilitates the ex-

ecution of CRUD capabilities, including the saving,

deletion, and listing of records. The framework uti-

lizes internal DSL (domain-speciﬁc language) (Man-

gal, 2024), which enables the use of ”domain terms”

in methods to express queries, thereby improving ex-

pressiveness and readability.

This set of features aligns with the TPPFM by es-

tablishing a basis for implementation. The Esﬁnge

Query Builder is capable of operating with a single

database at a time. The proposed work improves

the Esﬁnge Query Builder and redesigns it to oper-

ate concurrently with several databases in a polyglot

approach.

5.2 TPPFM Implementation - General

View

The Esﬁnge Query Builder was restructured to meet

the TPPFM, taking advantage of its characteristics al-

ready adherent to the model, being readjusted to op-

erate with multiple databases in order to retrieve in-

formation from the same domain that is in databases

of different types.

Prior to this restructure, Esﬁnge Query Builder

permitted the utilization of only a single database at

once, rendering the simultaneous usage of multiple

database types unfeasible. The rewriting of the frame-

work to incorporate TPPFM enabled its ability to in-

terface with several diverse databases concurrently

through a standardized API among entities within the

same domain.

For illustration, consider a basic user registration

process in which users are stored in a PostgreSQL

database. Additionally, note that there exists a Mon-

goDB database that stores user proﬁles, which are

semi-structured data retaining access permissions and

A Framework Model for Supporting Transparent Polyglot Persistence with a Uniﬁed API and Extensible for Different Database Types

113

speciﬁc conﬁgurations for application functionalities

accessible to the user.

To transparently obtain this information from a

uniﬁed API, we can utilize the framework demon-

strated in the code in Listing 1, Listing 2, Listing 3,

and Listing 4.

The User class in Listing 1 is a mapped en-

tity following the JPA speciﬁcation using Hibernate

ORM (Tudose et al., 2023). It contains some spe-

ciﬁc TPPFM annotations, which are @Persistence-

Type, @PolyglotOneToOne, and @PolyglotJoin. The

class Proﬁle in the Listing 2 was mapped by Morphia

ODM (Scherzinger and Sidortschuck, 2020), and con-

tains the annotation @PersistenceType. The interface

ExampleDAO in the Listing 3 contains the annotation

@TargetEntity. Finally, the Main class in Listing 4

that invokes and obtains a dynamic proxy for an in-

stance of ExampleDAO does not contain speciﬁc an-

notations. All the speciﬁc annotations used here and

the others in the framework are detailed in section 5.3.

1 //omitted code

2 import ja v ax . p e r s is t e n ce . En t it y ;

3 import ja v ax . p e r s is t e n ce . Id ;

4 import ja v ax . p e r s is t e n ce . Tr a n si e n t ;

5 @ E n ti t y

6 @P e r s i s te n c e T y pe ( v a lu e = " JP A1 " , se c o nd a r y = " MO N GO D B "

→)

7 public class Us er {

8 @I d

9 private I n te g e r id ;

10 private S t ri n g l o gi n ;

11 private S t ri n g p a s sw o r d ;

12 private O b j ec t I d p r o fi l e I d ;

13 @ T ra n s i en t

14 @ P o ly g l o t O neT o O n e ( r e f er e n c e d En t i t y = P ro f ile .class)

15 @ P ol y g l o tJ o i n ( nam e = " pr o f i le I d ",

→r efe r e n c e d A tt r i b u t e N am e = " id " )

16 private P r of i l e p r of i l e ;

17 //getters and setters

18 }

Listing 1: User class.

1 //omitted code

2 import d ev . m o rp h i a . an n o t at i o n s . En t ity ;

3 import d ev . m o rp h i a . an n o t at i o n s . Id ;

4 @ E n ti t y

5 @P e r s i s te n c e T y pe ( " M O N GO D B ")

6 public class Pr o f il e {

7 @I d

8 private O b j ec t I d id ;

9 private S t ri n g c o nfi g s ;

10 private S t ri n g p e r m is s i o ns ;

11 //getters and setters

12 }

Listing 2: Proﬁle class.

1 //omitted code

2 @T a r g e tE n t i t y ( Us er .class)

3 public interface E xa m p l eD A O extends Re pos ito ry < Use r > {

4 Lis t < Us er > g e t Use r B y L og i n ( S t ri n g l og i n ) ;

5 //omitted code

6 }

Listing 3: ExampleDAO interface.

1 //omitted code

2 public class Ma in {

3 public static void ma in ( S t ri n g [] ar gs ) {

4 va r e xam p le = Q u e r yB u i l d er . c r eat e ( E xa m p l eD A O .class

→) ;

5 va r us e r = ex a mp l e . g et U s e r By N a m e (" John ") ;

6 S ys t em . o ut . pr i n tl n ( u se r ) ;

7 }

8 }

Listing 4: Main class - business layer.

In this example case, when executing exam-

ple.getUserByLogin(”john”), the user object will

contain data from both the PostgreSQL database and

the MongoDB database ﬁltered by the user’s login

(Listing 4, lines 5-6). The access is transparent to the

business layer, and the delegation of queries and ag-

gregation is performed in the persistence layer with-

out strong coupling.

5.3 TPPFM Implementation - Internal

Structure

The Esﬁnge Query Builder framework is divided into

modules and has been updated to support the poly-

glot approach of TPPFM. It contains extension points

that utilize the Java Service Provider Interface (SPI)

functionality, allowing the discovery and use of ser-

vice implementations at runtime. The main module

is called Esﬁnge Query Builder Core. This module

contains the main machinery of the framework. Other

modules extend and use the main module to invoke

concrete implementations for each type of database.

The evolution, to the TPPFM, demanded the cre-

ation of annotations, as cited in subsection 5.2. Next,

an explanation about the usefulness of all the notes

available for the polyglot context.

• @TargetEntity - deﬁnes the main entity for the

interface for which the dynamic proxy is created;

• @PersistenceType - deﬁnes the type of persis-

tence assigned to each entity being manipulated;

• @QueryExecutorType - extension annotation

that deﬁnes with which type of persistence a spe-

ciﬁc implementation of QueryExecutor is operat-

ing;

• @PolyglotJoin - speciﬁes a polyglot relation-

ship within the primary entity, indicating the ﬁeld

name that references the secondary entity, or oth-

erwise;

• @PolyglotOneToOne - denotes the one-to-one

relationship type, signifying that for each instance

of the primary entity, there exists a singular corre-

sponding instance of the secondary entity. It can

be delineated from left to right or from right to

left; that is, there may be a primary ﬁeld that links

an identiﬁer to the secondary, or vice versa;

ICEIS 2025 - 27th International Conference on Enterprise Information Systems

114

• @PolglotOneToMany - establishes the one-to-

many relationship, indicating that for each in-

stance of the main entity, there are several corre-

sponding instances of the secondary object;

The Esﬁnge Query Builder Core, which maps

to the TPPFM, creates the dynamic proxy from the

QueryBuilder class and provides some hotspots, such

as the QueryExecutor and MethodParser interfaces.

The QueryBuilder class sets up a list of available im-

plementations for each interface and creates an in-

stance of MediatorQueryExecutor, which, in turn,

contains 2 attributes that deﬁne the PriExecutor (im-

plementation of persistence for the primary entity -

e.g.: JPA) and SecExecutor (implementation of per-

sistence for the secondary, e.g.: MongoDB).

The identiﬁcation of the primary entity is deter-

mined by the attribute of named “value” obtained

from the @TargetEntity annotation of the class for

which the dynamic proxy is generated. We retrieve

the persistence type from @PersistenceType. So, the

operations are ﬁrst done on the primary entity, then on

the secondary one, and ﬁnally on both entities simul-

taneously, based on the deﬁned polyglot relationship,

as shown in the model’s Figure 3.

The modules are conﬁgured with SPI and can be

seen from Listing 5 with the module-info ﬁle of the

Esﬁnge Query Builder JPA, and Listing 6 with the

module-info.java ﬁle of the Esﬁnge Query Builder

MongoDB. In this way, the main module obtains spe-

ciﬁc implementations for the interfaces at run-time

according to the dependencies injected into the client

project. As seen in the module-info.java ﬁles, all con-

tain implementations available for the special Repos-

itory interface, which has also become polyglot by

providing polyglot operations for CRUD.

1 m o du l e qu e r y b ui l d e r . jp a o ne {

2 r e q ui r e s t r a n si t i v e q u e r ybu i l d er . c o re ;

3 r e q ui r e s j av a . p e rs i s t en c e ;

4 e x p or t s ef . qb . jpa1 ;

5 o p en s ef . qb . jp a1 ;

6 u se s ef . qb . jp a1 . E n t i t yM a n a g e r P ro v i d e r ;

7 p r o vi d e s ef .qb . core . R e po s i t or y wi t h

8 ef . q b . jpa 1 . J PAR e p o s it o r y ;

9 p r o vi d e s ef .qb . core . e xe c u to r . Q u e r yEx e c u tor w it h

10 ef . q b . jpa 1 . J P AQ u e r y E xe c u t o r ;

11 }

Listing 5: Esﬁnge Query Builder JPA - module-info.

1 m o du l e qu e r y b ui l d e r . mo n g od b {

2 r e q ui r e s t r a n si t i v e q u e r ybu i l d er . c o re ;

3 r e q ui r e s m o r ph i a ;

4 //omitted code

5 e x p or t s ef . qb . mo n go d b ;

6 o p en s ef . qb . mo n go d b ;

7 u se s ef . qb . mo n god b . D a t ast o r e P r ov i d e r ;

8 p r o vi d e s ef .qb . core . R e po s i t or y wi t h

9 ef . q b . mo ngo d b . M ong o D B R e po s i t o r y ;

10 p r o vi d e s ef .qb . core . e xe c u to r . Q u e r yEx e c u tor w it h

11 ef . q b . mo ngo d b . M o ng o D B Q u e ry E x e c u t or ;

12 //omitted code

13 }

Listing 6: Esﬁnge Query Builder MongoDB - module-info.

This approach allows for the integration of addi-

tional database implementations as dependencies with

little coupling, improving maintainability, and align-

ing with the pattern depicted in Figure 1 in DAO Com-

pound Mediator.

5.4 TPPFM Tests

The examples presented in section 5.2 and the

module-info ﬁles in section 5.3 have been taken from

MongoDB and JPA implementations. It is crucial to

emphasize that TPPFM underwent implementation

testing with various cases, including PostgreSQL

and MongoDB, PostgreSQL and Cassandra, as

well as MongoDB and Cassandra. These tests

demonstrate that the proposed mechanism func-

tions effectively in various database types and can

seamlessly correlate them, thereby improving the

capabilities of Esﬁnge Query Builder CORE. The

tests are accessible at the following link: https:

//github.com/EsﬁngeFramework/querybuilder/tree/

master/PolyglotDemo/PolyglotDemoDataGenerator.

6 CASE STUDY

We developed a case study to implement the TPPFM

in a polyglot setting using real data. To achieve

this, it was crucial to evaluate the utilization of an-

notations, illustrate the relationship between classes

and utilized frameworks through the Design Structure

Matrix (DSM) (Eppinger, 2012), and conﬁrm the fea-

sibility of the TPPFM.

6.1 Method

The selection of case studies involved the identiﬁca-

tion of databases of different types that contain data

within the same application domain. The authors

used two databases provided by CEMADEN (Na-

tional Centre for Monitoring and Early Warning of

Natural Disasters). It is a Brazilian government or-

ganization that monitors in real time and distributes

alerts about the probability of natural disasters, such

as landslides, ﬂash ﬂoods, and ﬂoods.

To validate the proposed framework, we se-

lected databases from CEMADEN for their ability to

demonstrate a heterogeneous environment compris-

ing both relational and NoSQL databases. This choice

was made because we need to recreate a real-life sit-

uation where multiple persistence models exist at the

same time. This will allow us to test how well the

framework works with others and in real-life situa-

tions.

A Framework Model for Supporting Transparent Polyglot Persistence with a Uniﬁed API and Extensible for Different Database Types

115

The authors developed the case study’s source

code in Java and integrated it into the CEMADEN

software ecosystem, allowing access to real-time-

generated data. We added Esﬁnge Virtual Lab (ex-

plained in section 6.2) to the CEMADEN software

ecosystem and uploaded JAR modules to make it eas-

ier for the integration of study materials and pro-

duce insightful results for CEMADEN scientists us-

ing databases.

6.2 Esﬁnge Virtual Lab

The Esﬁnge Virtual Lab tool has been used for the

front-end of the case study. It is open-source and uses

substantially the Esﬁnge query builder. It has been

updated to take into account TPPFM, enabling a poly-

glot approach for the present work.

The Esﬁnge Virtual Lab is detailed in a com-

prehensive way in the publication by (Pereira et al.,

2023a) and can be found at https://github.com/

EsﬁngeFramework/virtual-lab/releases. It is a virtual

laboratory featuring a metadata-driven API for the

development of dynamic software components. The

platform, developed in Java, seeks to streamline sys-

tem prototype, enabling developers to focus on busi-

ness concepts rather than information retrieval or in-

terface programming. The Esﬁnge Virtual Lab em-

ploys declarative programming techniques to facili-

tate the fast creation of data visualizations, includ-

ing tables, charts, and maps, with minimal coding re-

quirements. The components, encapsulated as JAR

ﬁles, can be dynamically loaded and retrieved, facili-

tating code reuse. For the case study presented here, it

is important to understand the following annotations

that were used.

• @ServiceDAO - deﬁnes the access conﬁgurations

for the database of the main entities and provides

data access services;

• @PolyglotConﬁg - deﬁnes the access conﬁgura-

tions for the secondary entities’ database (devel-

oped in the Esﬁng VirtualLab framework due to

the case study);

• @ServiceClass - deﬁnes a control class that can

contain available service methods;

• @ServiceMethod - deﬁnes that a method will be

made available as a service that can be invoked;

• @Inject - allows injecting a service or data class

into another service class. Example: injected data

access class;

• @BarChartReturn - returns the result of the ser-

vice method as a bar chart;

• @TableReturn - returns the result of the service

method as a data table.

6.3 Context

The case study was formulated with regard to the

realm of natural disasters. Two databases were pro-

vided at CEMADEN. A relational database utilizing

PostgreSQL and a NoSQL database utilizing Mon-

goDB.

The relational database stored alert data. An alert

includes a creation date, a termination date, details on

the probable natural disaster, the severity rating, and

the city location. In the case of CEMADEN, notiﬁca-

tions are sent by cities, which include a unique identi-

ﬁcation code established by IBGE (Brazilian Institute

of Geography and Statistics).

The NoSql database comprises data from the

BATER (Statistical Territorial Base of Risk Areas),

in verbatim transcription, Statistical Territorial Base

of Risk Areas (Instituto Brasileiro de Geograﬁa e Es-

tat

ıstica (IBGE), 2018). The BATER technique is

comprehensive and encompasses numerous speciﬁc

topics relevant to censitary statistics and natural disas-

ters. For the purposes of this case study, it is adequate

to recognize that a BATER delineates a geographical

area within a city, intersecting risk area delimitation

data with census sectors to estimate, with varying de-

grees of precision, the probable number of individuals

exposed to natural disasters.

The BATER data do not cover all Brazilian cities

or all risk regions, comprising 183 variables about

residents, predominantly quantitative, categorized by

gender, age, and other factors. It encompasses 135

attributes of households, including property owner-

ship status and urban services obtained, among others.

Furthermore, not all variables possess values for ev-

ery existing BATER. This illustrates a versatile struc-

ture of the data and the utilization of a non-relational

database.

The alert data are operational, which means that

they are integral to the daily operations of CE-

MADEN. The BATER data are ﬁnely organized

datasets that offer signiﬁcant insights about Brazilian

cities within the researched context, utilized by CE-

MADEN predominantly for research endeavors and,

to a lesser degree, for operational applications. How-

ever, there is no mechanism for correlated data within

the same domain in a manner that is feasible for de-

velopers and capable of generating conﬁgurable and

dynamic services that address both the research and

operational requirements of end users.

Consequently, the suggested case study seeks to

fulﬁll this identiﬁed requirement. In this context, an

ICEIS 2025 - 27th International Conference on Enterprise Information Systems

116

Alert is the primary entity, and an Alert possesses a

list of Bater’s. The variable investigated was the to-

tal number of individuals present in a Bater. Conse-

quently, it may be feasible to determine the number

of individuals possibly at risk within the city by ag-

gregating the population ﬁgures from all the Bater’s

in that area.

6.4 Implementations

The implementation of the case study required the

creation of four Java classes, as illustrated in the di-

agram in Figure 4. The class AlertService utilizes

a dynamic proxy to inject an instance of AlertDAO.

AlertDAO includes service methods that access data

purely through method declarations, using a declara-

tive API. In AlertDAO, the annotation @TargetEntity

designates Alert as the primary entity. Alert is an-

notated with @PersistenceType, indicating the persis-

tence type for the primary and secondary entities. In

the scenario of the case, Bater is utilized as the sec-

ondary entity, associated with Alert through @Poly-

glotOneToMany. The ibgecode attribute in Alert con-

tains the same value as the citycode in Bater, allowing

for the association to be made.

Figure 4: Class diagram from case study.

6.5 Results

6.5.1 Feasibility Results

The feasibility results are illustrated in Figure 5.

The focus is on the visualization in tabular and bar

chart formats corresponding to the two service meth-

ods provided in AlertService, namely listData and

getChart. A list of alerts is presented that indicates

the probable number of vulnerable individuals in Rio

Grande do Sul, Brazil, during the period from April

1, 2024, to July 1, 2024. A bar graph illustrates the

progression of alerts in relation to the probable num-

ber of vulnerable individuals, taking into account the

alert creation date and its expiration.

This case study speciﬁcally examined the period

during which numerous alerts spread to the popula-

tion, which ultimately resulted in signiﬁcant catastro-

phes in the Rio Grande do Sul region due to excessive

rainfall over several days.

6.5.2 Implementation Analysis

Concerning the association of classes with frame-

works, Table 1 delineates the number of annotations

assigned to each class, with EQB representing Esﬁnge

Query Builder, JPA denoting Java Persistence API,

MOR indicating Morphia ODM, EVL referring to Es-

ﬁnge Virtual Lab, and POL for classes using poly-

glot annotations. This shows high modularity and low

coupling of the solution.

Table 1: Number of annotations used from each framework.

JPA MOR EQB POL EVL

AlertService 0 0 0 0 6

AlertDAO 0 0 2 1 3

Alert 5 0 0 3 0

Bater 0 3 0 1 0

From the TPPFM perspective, observe that the

AlertService class, which represents the business

layer, lacks any polyglot annotations, allowing the de-

veloper to concentrate solely on the business logic.

In the persistence layer, the majority of polyglot an-

notations are found in the Alert class, which rep-

resent the primary entity, denoting the persistence

type (@PersistenceType) and the association anno-

tations (@PolyglotOneToMany and @PolyglotJoin).

In the AlertaDAO class, it is necessary to specify

only the target primary entity (@TargetEntity). In the

Bater class, only persistence type requires declaration

(@PersistenceType).

Figure 6 shows the DSM corresponding to the four

classes created. We constructed the DSM using the

Esﬁnge Virtual Lab packages, Esﬁnge Query Builder

CORE packages, Esﬁnge Query Builder JPA pack-

ages, and Esﬁnge Query Builder MongoDB packages.

From the perspective of coupling, focus on the col-

orful rectangles highlighted in the ﬁgure. The pur-

ple area delineates the dependencies for Esﬁnge Vir-

tual Lab, speciﬁcally, the service speciﬁcation and the

DAO annotations for the front-end. However, the pri-

mary goal of this work is to focus on the elements in-

A Framework Model for Supporting Transparent Polyglot Persistence with a Uniﬁed API and Extensible for Different Database Types

117

Figure 5: Table with a list of alerts and probable number of vulnerable individuals

(*left)

. Bar chart showing the evolution of

alerts versus probable number of vulnerable individuals

(*right)

Figure 6: Case Study - Dependency Structure Matrix.

dicated by the green rectangles labeled A to D. These

are, in reality, the dependencies of the TPPFM imple-

mentation. In A, there is a dependency on AlertDAO

for the Repository interface. From B to D, we have

only annotations. In B, we have @TargetEntity for

AlertDAO. In C, utilize @PersistenceType, @Poly-

glotOneToMany, and @PolyglotJoin for Alert. In D

@PersistenceType for Bater. The framework exhibits

high modularity, since applications rely just on meta-

data rather than concrete implementations.

7 DISCUSSIONS

7.1 Case Study Conclusions

The case study effectively illustrated the operation of

the reference implementation for TPPFM. The imple-

mentation used real data. Some difﬁculties merit con-

sideration. The CEMADEN relational data model is

extensively normalized and includes multiple tables,

signiﬁcantly increasing the difﬁculty of mapping. For

ICEIS 2025 - 27th International Conference on Enterprise Information Systems

118

the case study, the authors opted to develop an Alert

View, which allowed the study to be carried out with-

out compromising the coherence of the data.

The Esﬁnge VirtualLab facilitated the fast gener-

ation of visualizations for front-end development. It

internally incorporates the Esﬁnge Query Builder, and

the utilization of the @ServiceDAO and @Polyglot-

Conﬁg annotations facilitates fast setting of database

access.

In general, the case study may be developed with

minimal source code, which is advantageous for the

framework given its signiﬁcant internal complexity.

The advantages of this design can enhance the devel-

oper experience due to minimal integration between

frameworks, a uniﬁed API, enhanced maintainability,

and autonomy.

7.2 Limitations

The case study has limitations, including the testing

of only two types of databases and the absence of

performance analysis. A study involving two distinct

database types is adequate to validate the framework’s

operation, as the particular implementations do not di-

rectly inﬂuence the Esﬁnge Query Builder CORE; in-

stead, they serve as extensions. However, the frame-

work was designed to allow polyglot operation with

a maximum of two, independent of their type, simul-

taneously. Nevertheless, instances involving three or

more databases within the same topic are infrequent

in the literature, prompting the authors to use this ap-

proach.

8 CONCLUSIONS

The research work delineated the deﬁnition of the

TPPFM, establishing a conceptual framework as a

foundation for the transparent implementation of

polyglot persistence for developers. Using the Esﬁnge

Query Builder as a reference framework, it was feasi-

ble to evolve it for polyglot functionality alongside its

existing capabilities that correspond to the requested

work.

A case study was created using real databases that

illustrated the operation of the framework. The in-

corporation of Esﬁnge VirtualLab facilitated the fast

acquisition of data visualization that connected infor-

mation across PostgreSQL and MongoDB databases

within a uniﬁed domain model, enabling seamless

polyglot operations through various ORM mappings.

The framework features a cohesive declarative API

that abstracts the utilization of diverse APIs from sev-

eral databases.

The results indicated that the framework promotes

development with minimal coding and substantial

modularity. Future endeavors will focus on creat-

ing performance assessments and, crucially, execut-

ing experiments to evaluate the developer’s experi-

ence, yielding suggestions for enhancing the frame-

work and broadening its capabilities.

REFERENCES

de Ara

ujo, A. M. C., Times, V. C., and da Silva, M. U.

(2016). Polyehr: A framework for polyglot persis-

tence of the electronic health record. In Proceed-

ings on the International Conference on Internet Com-

puting (ICOMP), page 71. The Steering Committee

of The World Congress in Computer Science, Com-

puter . . . .

De Souza, W. S., Pereira, F. O., Albuquerque, V. G., Mel-

egati, J., and Guerra, E. (2022). A framework model

to support a/b tests at the class and component level.

In 2022 IEEE 46th Annual Computers, Software, and

Applications Conference (COMPSAC), pages 860–

865, Los Alamitos, CA, USA. IEEE.

Eisenhuth, P. and Jablonski, S. (2022). Knowledge-based

recommendation for polyglot persistence. In CDMS@

VLDB.

El Ahdab, L., Megdiche, I., P

eninou, A., and Teste, O.

(2024). Uniﬁed models and framework for querying

distributed data across polystores. In International

Conference on Research Challenges in Information

Science, pages 3–18. Springer.

Eppinger, S. (2012). Design Structure Matrix Methods and

Applications. MIT Press.

Givre, C. and Rogers, P. (2018). Learning Apache Drill:

Query and Analyze Distributed Data Sources with

SQL. ” O’Reilly Media, Inc.”.

Guerra, E. (2014). Designing a framework with test-driven

development: A journey. IEEE software, 31(1):9–14.

GUERRA, E. M., BATISTA, J. A., and NASCIMENTO,

L. W. T. (2017). Esﬁnge query builder - frame-

work de acesso a dados para diferentes paradigmas

de banco. In CBSoft VIII Congresso de Software

Brasileiro, pages 65–72, Porto Alegre, RS, Brazil. So-

ciedade Brasileira de Computac¸

ao (SBC).

Holubov

a, I., Contos, P., and Svoboda, M. (2021). Multi-

model data modeling and representation: State of the

art and research challenges. In Proceedings of the 25th

International Database Engineering & Applications

Symposium, pages 242–251.

Instituto Brasileiro de Geograﬁa e Estat

ıstica (IBGE)

(2018). Populac¸

ao em

Areas de Risco no Brasil.

IBGE, Rio de Janeiro.

Jim

enez-Peris, R., Patino-Martinez, M., Brondino, I., and

Vianello, V. (2016). Transactional processing for

polyglot persistence. In 2016 30th International Con-

ference on Advanced Information Networking and Ap-

plications Workshops (WAINA), pages 150–152, Pis-

cataway, NJ, USA. IEEE, IEEE.

A Framework Model for Supporting Transparent Polyglot Persistence with a Uniﬁed API and Extensible for Different Database Types

119

Kaur, K. and Rani, R. (2015). Managing data in health-

care information systems: many models, one solution.

Computer, 48(3):52–59.

Khine, P. P. and Wang, Z. (2019). A review of polyglot per-

sistence in the big data world. Information, 10(4):141.

Kolev, B., Valduriez, P., Bondiombouy, C., Jim

enez-Peris,

R., Pau, R., and Pereira, J. (2016). Cloudmdsql:

querying heterogeneous cloud data stores with a com-

mon language. Distributed and parallel databases,

34(4):463–503.

Lajam, O. and Mohammed, S. (2022). Revisiting polyglot

persistence: From principles to practice. International

Journal of Advanced Computer Science and Applica-

tions, 13(5).

Larman, C. (2012). Applying UML and patterns: an intro-

duction to object oriented analysis and design and in-

terative development. Pearson Education India, Delhi,

India.

Lima, P., Pereira, N. S., Gomes, E., Guerra, E., and

Meirelles, P. (2023). Annotation visualizer: A soft-

ware visualization tool for code annotations. Software

Impacts, 16:100491.

Lu, J., Liu, Z. H., Xu, P., and Zhang, C. (2018). Udbms:

road to uniﬁcation for multi-model data management.

In International Conference on Conceptual Model-

ing, pages 285–294, Cham, Switzerland. Springer,

Springer.

Mangal, H. (2024). Cpsl: A domain-speciﬁc language for

modelling the behaviour of cyber-physical systems.

B.S. thesis, University of Twente.

Pereira, F., Franc¸a, D., Paschoal, V., Nardes, M., Rosa,

R. R., and Guerra, E. (2023a). Esﬁnge virtual lab—a

virtual laboratory platform with a metadata-based api

and based on dynamic component. IEEE Access,

11:143167–143181.

Pereira, F., Guerra, E., and Rosa, R. R. (2023b). Patterns

for polyglot persistence layer. In Proceedings of the

29th Conference on Pattern Languages of Programs,

PLoP ’22, USA. The Hillside Group.

Prasad, S. and Avinash, S. (2014). Application of polyglot

persistence to enhance performance of the energy data

management systems. In 2014 International Confer-

ence on Advances in Electronics Computers and Com-

munications, pages 1–6. IEEE.

Rising, L. (1998). The patterns handbook: Techniques,

strategies, and applications, volume 13. Cambridge

University Press.

Schaarschmidt, M., Gessert, F., and Ritter, N. (2015). To-

wards automated polyglot persistence. Datenbanksys-

teme f

ur Business, Technologie und Web (BTW 2015),

241:73–82.

Scherzinger, S. and Sidortschuck, S. (2020). An empirical

study on the design and evolution of nosql database

schemas. In Conceptual Modeling: 39th Interna-

tional Conference, ER 2020, Vienna, Austria, Novem-

ber 3–6, 2020, Proceedings 39, pages 441–455, Vi-

enna, Austria. Springer, SpringLink.

Srivastava, K. and Shekokar, N. (2016). A polyglot persis-

tence approach for e-commerce business model. In

2016 International Conference on Information Sci-

ence (ICIS), pages 7–11. IEEE.

Tudose, C., Bauer, C., and King, G. (2023). Java per-

sistence with spring data and hibernate. Simon and

Schuster, New York, NY, USA.

Van Landuyt, D., Benaouda, J., Reniers, V., Raﬁque, A.,

and Joosen, W. (2023). A comparative performance

evaluation of multi-model nosql databases and poly-

glot persistence. In Proceedings of the 38th ACM/SI-

GAPP Symposium on Applied Computing, pages 286–

293.

Villac¸a, L. H., Azevedo, L. G., and Bai

ao, F. (2018). Query

strategies on polyglot persistence in microservices. In

Proceedings of the 33rd Annual ACM Symposium on

Applied Computing, pages 1725–1732, Pau, France.

ACM.

Wiese, L. (2015). Polyglot database architectures= poly-

glot challenges. In LWA, pages 422–426, G

ottingen,

Germany. University of G

ottingen.

ICEIS 2025 - 27th International Conference on Enterprise Information Systems

120