mentioned problem, and (ii) how to adapt and update
mappings to handle data sources' changes. The originality
of our work lies in the novel automated mapping
maintenance approach that we propose, based on deep
learning techniques. This article is organized as follows:
Section 2 presents related work. Our proposal is detailed
in Section 3. Section 4 presents the implementation
and discusses the evaluation of our proposal. Lastly,
we conclude and outline future work in Section 5.
2 STATE OF THE ART
Our study of recent works focuses on schema
matching and mapping maintenance as solutions to
changes in the metadata of data sources.
Schema matching exploits names, descriptions,
data types, constraints, and schema structure
in order to identify matches between two attributes
of data sources' schemas. (Do, 2007) describes
the architecture, functionality, and evaluation
of the schema matching system COMA++ (Combining
Matchers). COMA++ is a generic and customised
system for semi-automatic schema matching,
which combines different match algorithms in a
flexible way. Even though this work shows good
results, it remains limited by the unavailability of
metadata and schemas for all data sources. Furthermore,
metadata are not always appropriate for the
schema matching process, an issue that has been reported
in many works. Consequently, work on schema
matching has shifted towards instance-based matching.
This approach employs the available instances
as a source to identify the correspondences between
schema attributes. (Munir et al., 2014) focuses
on finding similarities/matchings between databases
based on the instance approach, using the databases'
primary keys. The proposed approach consists of two
main phases: Row Similarity and Attribute Similarity.
This work shows significant results in terms of
matching accuracy. However, it remains limited by
its applicability to databases only.
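To make the instance-based idea concrete, the following minimal Python sketch (our own illustration; the cited work's Row Similarity and Attribute Similarity phases are more elaborate and key-driven) matches attributes purely by the overlap of their instance value sets:

```python
def attribute_similarity(values_a, values_b):
    """Jaccard overlap of the two attributes' instance value sets."""
    set_a, set_b = set(values_a), set(values_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def match_attributes(table_a, table_b, threshold=0.5):
    """Pair up attributes whose instance overlap exceeds the threshold.

    table_a / table_b map attribute names to lists of instance values.
    The threshold value is illustrative, not taken from the cited paper.
    """
    matches = []
    for name_a, vals_a in table_a.items():
        for name_b, vals_b in table_b.items():
            score = attribute_similarity(vals_a, vals_b)
            if score >= threshold:
                matches.append((name_a, name_b, score))
    return matches
```

Here two attributes with different names but shared values (e.g. `country` and `nation` both containing ISO country codes) would be paired even when no schema metadata is available.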
Also working on databases, (Sahay et al., 2019)
and (Mehdi et al., 2015) use a hybrid approach that
exploits both schemas and the provided data to manage
matchings by means of machine learning. They target
one-to-one matchings and introduce a dictionary for
one-to-many matchings. The results seem sufficient for
simple use cases, but performance degrades on
complex mappings, and only databases are supported.
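The hybrid intuition, combining schema-level and instance-level evidence, can be sketched as follows. This is our own toy illustration, not the cited systems' method: the feature weights, attribute representation, and similarity measures are all assumptions, and the cited works feed such features to a learned model rather than a fixed weighted sum.

```python
from difflib import SequenceMatcher

def hybrid_score(attr_a, attr_b, w_name=0.5, w_type=0.2, w_values=0.3):
    """Blend schema-level and instance-level evidence into one score.

    attr_a / attr_b are dicts with 'name', 'type', and 'values' keys.
    The weights are illustrative placeholders for what a trained
    machine-learning model would learn from labelled match examples.
    """
    # Schema evidence: string similarity of names, exact type equality.
    name_sim = SequenceMatcher(None, attr_a["name"].lower(),
                               attr_b["name"].lower()).ratio()
    type_sim = 1.0 if attr_a["type"] == attr_b["type"] else 0.0
    # Instance evidence: Jaccard overlap of the observed values.
    set_a, set_b = set(attr_a["values"]), set(attr_b["values"])
    union = set_a | set_b
    value_sim = len(set_a & set_b) / len(union) if union else 0.0
    return w_name * name_sim + w_type * type_sim + w_values * value_sim
```

An attribute pair scoring highly on all three signals is a strong one-to-one match candidate; one-to-many cases are precisely where such a single score falls short, which is why the cited works add a dictionary.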
Concerning other formats, (Mohamed-Amine et
al., 2017) work on JSON datasets. They focus solely
on schema inference, but their work shows great potential
for schema matching. They deal with the problem
of inferring a schema from massive JSON datasets
by identifying a JSON type language that is simple
yet expressive enough to capture irregularities and
to give complete structural information about the input
data. (Ahmad Abdullah Alqarni, 2017) introduces
two approaches to improve XML schema matching
efficiency: internal filtering and node inclusion.
The former detects dissimilar elements at all schema
levels in the early stages of the matching process.
The latter detects dissimilarity between leaf-node
elements only, using their ancestors' degree of
similarity, and then excludes them from the next
phase of the matching process.
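The spirit of internal filtering can be sketched as an early pruning pass: a cheap similarity check discards clearly dissimilar element pairs before any expensive matching runs. The sketch below is our own illustration under assumed names; the cited work's filtering operates on full XML schema trees, not flat element lists, and uses its own similarity measures.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Cheap lexical similarity between two element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def prefilter_pairs(elements_a, elements_b, cutoff=0.3):
    """Early filtering pass: drop dissimilar pairs before costly matching.

    Only pairs whose inexpensive name similarity reaches the cutoff
    survive to the expensive phase, shrinking the candidate space.
    The cutoff value is an illustrative assumption.
    """
    kept = []
    for a in elements_a:
        for b in elements_b:
            if name_similarity(a, b) >= cutoff:
                kept.append((a, b))
    return kept
```

The efficiency gain comes from the quadratic candidate space shrinking before the expensive matcher ever sees it.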
In another context, several studies have focused on
mapping maintenance to preserve data availability,
that is, manipulating and updating mappings to handle
changes in data sources and ontologies.
In that matter, (Cesar et al., 2015) and
(N. Popitsch, 2018) take into account the definition
of established mappings, the evolution of Knowledge
Organization Systems (KOS), and the possible
changes that can be applied to the mappings. They
define complete frameworks based on formal heuristics
that drive the adaptation of KOS mappings. These
studies show great results in terms of mapping maintenance,
but they require considerable human intervention
and repeated integration processes, which reinforces
the need for an automated mapping maintenance solution.
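The kind of heuristic-driven adaptation these frameworks formalize can be illustrated in toy form: given a mapping table and a list of detected source changes, each change type triggers a corresponding update rule. This sketch is our own simplification; real KOS adaptation frameworks cover far richer change types and mapping semantics.

```python
def adapt_mappings(mappings, changes):
    """Apply detected source-schema changes to an existing mapping table.

    mappings: dict of {source_attribute: target_attribute}.
    changes:  list of ('rename', old, new) or ('delete', old) events.
    Each change type maps to a simple heuristic update rule.
    """
    updated = dict(mappings)
    for change in changes:
        if change[0] == "rename":
            # A renamed source attribute keeps its target, under the new key.
            _, old, new = change
            if old in updated:
                updated[new] = updated.pop(old)
        elif change[0] == "delete":
            # A removed source attribute invalidates its mapping entry.
            _, old = change
            updated.pop(old, None)
    return updated
```

Automating the detection of such change events, rather than having a human supply them, is exactly the gap that motivates our proposal.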
(Khattaka et al., 2015) proposes another approach
towards regenerating mappings after evolution,
focusing on finding alignments between ontologies.
The paper proposes a mapping reconciliation approach
between the updated ontologies that takes less
time to process than existing systems when only the
changed resources are considered, and that also
eliminates the staleness of the existing mappings.
Even though this work is interesting with respect
to mapping maintenance, it focuses only on ontology
descriptions, without taking data changes into
consideration.
2.0.1 Synthesis
The literature review on schema matching and mapping
maintenance shows that existing works focus
either on manipulating ontologies or on handling
changing datasets. Since our focus is on mapping files,
to the best of our knowledge no study has tried to
manipulate these files directly. Our aim is therefore
to focus on how to manipulate and update mapping
files following a change in data sources' metadata.
ICSOFT 2022 - 17th International Conference on Software Technologies