the legacy system. This task involves using a
specific metamodel for each artefact. After that, in
the second stage, a KDM model is built (using
model transformations) from the PSM models
recovered from the legacy system. In this case, the
KDM model plays the role of a PIM model.
Therefore, this model abstracts any technology-
specific aspect of the legacy system. It should be
borne in mind that the restructuring stage does not
end once the KDM models have been obtained, because
it is possible to restructure the KDM model itself.
For example, transformations can be tailored between
KDM layers. Finally, the forward engineering stage
builds new and improved
information systems. This stage involves PSM
models to represent specific aspects related to the
technological nature of each target system. In this
way, the modernization process is completed
according to the horseshoe model.
This modernization process targets legacy systems,
focusing exclusively on source code and databases
as software artefacts. These artefacts
determine three KDM models
that must be obtained in the reverse engineering
stage of this modernization process: (i) KDM
Inventory Model is based on the Source Package of
KDM. It enumerates physical artefacts of the legacy
system and defines the mechanism of traceability
links between all the KDM elements and their
original representation in the legacy source code. (ii)
KDM Code Model supports both Code Package and
Action Package. This model aims to represent
program elements and their associations at the
implementation level. It includes elements supported
by several programming languages, such as
sentences, operators, conditions, associations, and
control and data flows. (iii) KDM Data Model
represents data manipulation in legacy systems.
Data Model is based on Data Package and uses the
foundations provided by Code Model related to the
representation of simple data types. Also, this model
can depict the relational databases used by the
legacy system.
In addition to these models, the schema
elicitation technique involves other models in the
reverse engineering stage of this ADM process (see
the shaded part of Figure 1). The proposed technique
elicits the database schema from the SQL embedded
in the source code. To do so, it first generates an
SQL Sentences Model by means of the
static analysis of legacy source code. The static
analysis activity also produces the Inventory Model
and Code Model. After that, the Database Schema
Model, a model that represents the minimal schema
of the database, is obtained through the model
transformation from the SQL Sentences Model.
Finally, the needed KDM Data Model is obtained
from the Database Schema Model.
At this point, both the source code and the
database are represented according to KDM.
Therefore, the restructuring and forward engineering
stages can be carried out in order to generate the
modernized version of the legacy systems (see
Figure 1).
Figure 1: Schema elicitation technique based on ADM.
In order to obtain the SQL Sentences Model, the
technique analyses the legacy source code for
embedded SQL sentences by means of a parser. This
parser is a syntactical analyser that exhaustively
scans the source code. When the parser finds an SQL
sentence, it translates that sentence into a model
that conforms to a metamodel of the SQL-92 DML
(Data Manipulation Language) developed for this
purpose.
The metamodel models the syntax of the SQL-92
DML (ISO/IEC, 1992). It can represent SQL
operations such as Insert, Select, Update and Delete
together with their search conditions.
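As an illustration of this static analysis step, the following minimal Python sketch scans source text for string literals that look like SQL DML sentences and records each one as a small model object. It is not the parser described here: the `SqlStatement` class, the regular expressions and the sample source are assumptions made for illustration, and a real implementation would rely on a full SQL-92 grammar and would also handle SQL assembled from several literals.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class SqlStatement:
    """Hypothetical, heavily simplified stand-in for a DML metamodel element."""
    operation: str  # INSERT, SELECT, UPDATE or DELETE
    text: str       # the sentence as it appears in the source code


# Double-quoted string literals, allowing backslash escapes.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')

# Rough shape of each DML operation (an assumption, not a full grammar).
DML_PATTERNS = {
    "SELECT": re.compile(r"\s*SELECT\b.+\bFROM\b", re.IGNORECASE),
    "INSERT": re.compile(r"\s*INSERT\s+INTO\b", re.IGNORECASE),
    "UPDATE": re.compile(r"\s*UPDATE\s+\w+\s+SET\b", re.IGNORECASE),
    "DELETE": re.compile(r"\s*DELETE\s+FROM\b", re.IGNORECASE),
}


def find_embedded_sql(source: str) -> List[SqlStatement]:
    """Scan every string literal and keep those shaped like a DML sentence."""
    statements = []
    for literal in STRING_LITERAL.finditer(source):
        text = literal.group(1)
        for operation, pattern in DML_PATTERNS.items():
            if pattern.match(text):
                statements.append(SqlStatement(operation, text.strip()))
                break
    return statements


# Hypothetical fragment of legacy source code.
legacy_source = '''
String q = "SELECT name, salary FROM employee WHERE id = 7";
String msg = "Delete failed";
stmt.executeUpdate("update employee set salary = 0 where id = 7");
'''

model = find_embedded_sql(legacy_source)
for statement in model:
    print(statement.operation, "->", statement.text)
```

The literal `"Delete failed"` is skipped because it does not have the shape of a Delete sentence, which shows why matching on the leading keyword alone would not be enough.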
After obtaining the SQL Sentences Model
through static analysis, the Database Schema Model
must be obtained by means of a model
transformation. Database Schema Models represent
relational database schemas through a metamodel
that follows the SQL-92 standard (ISO/IEC, 1992).
The minimal database schema is deduced by
a set of rules developed specifically for this
purpose. These rules recover only the subset of
database schema elements that are handled by the
SQL sentences embedded in the source code.
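To make the notion of a minimal schema concrete, the following Python sketch collects only the tables and columns that at least one SQL sentence touches. It is an illustrative assumption rather than the rule set of the technique: the input tuples stand in for an already-parsed SQL Sentences Model, and the function name `deduce_minimal_schema` and the sample table and column names are hypothetical.

```python
from collections import defaultdict

# Hypothetical, already-parsed sentences: (operation, table, columns used).
statements = [
    ("SELECT", "employee", ["name", "salary"]),
    ("UPDATE", "employee", ["salary", "dept_id"]),
    ("INSERT", "department", ["id", "title"]),
]


def deduce_minimal_schema(stmts):
    """Collect only the tables and columns that some sentence handles."""
    schema = defaultdict(set)
    for _operation, table, columns in stmts:
        schema[table].update(columns)
    # Sort tables and columns so the result is stable and easy to read.
    return {table: sorted(cols) for table, cols in sorted(schema.items())}


schema = deduce_minimal_schema(statements)
for table, columns in schema.items():
    print(table, columns)
```

A column that exists in the real database but never appears in the embedded SQL would be absent from this result, which is the sense in which the recovered schema is minimal.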
Rule 1. The tables that appear in any SQL sentence
(Insert, Select, Update or Delete) in either source or
target clauses (From, Set, Into, and so on) are
created as tables in the induced database schema.
Rule 2. The columns that are selected, added,
deleted or updated in the SQL sentences are created
in the corresponding tables. These tables have
ICEIS 2012 - 14th International Conference on Enterprise Information Systems