How to Guarantee Analysis Results Coherence after Data Warehouse
Schema Changes Propagation towards Data Marts?
Noura Azaiez and Jalel Akaichi
Department of Computer Science, ISG-University of Tunis, Le Bardo, Tunisia
Keywords: Data Warehouse, Evolution Operations, Data Mart, Propagation Rules, Schema Versioning.
Abstract. Data Warehouse, accompanied with Online analytical processing, is considered as the core of the modern
Decision support systems. The emergence of new analytical requirements and changes in organization busi-
ness processes push the underlying information sources, destined to feed the data warehouse, to modify not
only their data, but also their structure. This, obviously, has a direct impact on Data Warehouse and its asso-
ciated Data Marts. Maintaining Data Warehouse structure becomes, therefore, a must; however, it is not suf-
ficient. In fact, evolutions performed on the Data Warehouse schema have to be propagated on the related
Data Marts in order to minimize costs, time-consuming and to guarantee the coherence of provided analysis
results; this presents our first vision issue for which, we aim to provide an adequate solution. Another issue,
which is as important as the precedent one, focuses on modeling a continuous temporal evolution phenome-
non and therefore reducing inconsistent Online analytical processing queries results. Indeed, data returned
by queries can be the result of an evolution phenomenon continued in several time intervals. Therefore, we
nominate the versioning approach as a solution to keep traces of Data Warehouse / Data Mart schemas’
modifications. Solving these two issues presents the key of organization Decision support systems durability
and its material prosperity.
1 INTRODUCTION
As its data are often scattered and unstructured for
analysis, the operational information systems seem
inadequate for decision making. Toward this inade-
quacy, the
Data Warehouse (DW) technology has
emerged to collect and restructure data with the aim
to be the process of a good decision making. Due to
the continual evolving of decision makers’ needs
and the emergence of new business processes over
time, organization operational system will be affect-
ed and therefore must include new data. To analyze
these latter, their integration into the Decision Sup-
port Systems (DSS) becomes necessary. This, obvi-
ously, has a direct impact on DW. So, it must be
renewable and adaptable to all changes that may
occur; however, this is not sufficient to ensure the
coherence of query results. Indeed, the strong de-
pendence between the DSS components reveals the
necessity to propagate the changes performed on
DW towards its Data Marts (DMs). To achieve this
propagation, we define a set of “if-then” type rules
in order to identify the evolution operations that can
affect DMs following DW structure evolutions. In
the literature, works dealing with the DW evolution
problem can be classified into three different ap-
proaches namely schema evolution, schema version-
ing and View Maintenance. Our goal is to express
the improvement of DW schema over time. So, we
proposed a solution based on the versioning ap-
proach as it keeps traces of schemas changes
through time.
In this paper, we discussed two important issues:
the first one focuses on how to propagate DW sche-
ma evolution towards DM; the second issue express-
es how the versioning approach can be the best solu-
tion to guarantee consistency and coherence of
Online Analytical Processing (OLAP) queries re-
sults.
This position paper is organized as follows. In
Section 2, we provide an overview of related works
to the DW evolution. Section 3 presents motivations
and our position. Section 4 describes our proposed
evolution approach. In section 5, we express our
proposed approach efficiency by applying a set of
propagation rules illustrated by a medical case study.
Section 6 compares our proposed approach to anoth-
er existing one. We conclude the paper in section 7.
428
Azaiez N. and Akaichi J..
How to Guarantee Analysis Results Coherence after Data Warehouse Schema Changes Propagation towards Data Marts?.
DOI: 10.5220/0005158304280435
In Proceedings of the International Conference on Knowledge Engineering and Ontology Development (KEOD-2014), pages 428-435
ISBN: 978-989-758-049-9
Copyright
c
2014 SCITEPRESS (Science and Technology Publications, Lda.)
2 STATE OF THE ART
Solving DW schema evolution problems is a compu-
tational challenge in the midst of the continuous
growth of technologies. Indeed, managing correctly
all type of changes, which affect an organization
DSS, can reflect the organization real world and
therefore, guarantee its durability.
The literature is interested in the evolution prob-
lem and proposes solutions leading to better deci-
sions making. These solutions are based on three
approaches: Schema evolution, schema versioning
and maintenance of materialized views.
2.1 Schema Evolution Approach
This approach is based on the assumption that the
DW schema has only one version, the current one.
Changes, which can affect DW schema, are translat-
ed into evolution operations updating the schema
structure and the associated instances.
Following the study of a sample of works related
to the DW schema evolution problem, we present a
comparative study based on a set of relevant criteria.
Table 1: Comparative study between existing works based
on schema evolution approach.
(Hurtado
et al.,
1999)
(Papastef-
anatos et
al., 2009)
(Taktak
et al.,
2012)
(Azaiez
et al.,
2013)
DW schema evolution
Dimensions
Hierarchies
Fact tables
Instances evolution
Materialized views
evolution
ETL evolution
Evolution impact
DW/DMs
Conformity to Meta-
model
Prototype
HECATA
-EUS
DWEv
Most of presented works treat different evolution
operations related to various DW components such
as dimensions and hierarchies. Some authors studied
the effect of evolution operations on instances and
materialized views (Hurtado et al., 1999), or even on
the process of ETL (Papastefanatos et al., 2009).
Others confirmed that the evolution problem can’t
be completely resolved only with the changes’ full
implementation on the meta-model level. That’s
why; they deepened their research to investigate
conformity of DW schema to its meta-model (Tak-
tak et al., 2012).
In general, the changes applied on DW schema
and on its DMs are manual; Taktak et al. (2012) and
Azaiez et al. (2013) resolved this gap. In fact, they
proposed approaches capable to identify the impact
of the DW evolutions on associated DMs.
However, the common drawback, that gathers all
these works, is the impossibility to model a continu-
ous temporal evolution phenomenon since the prin-
ciple of the schema evolution approach avoid to
keep the previous evolutions history.
2.2 Schema Versioning Approach
Unlike the principle of updating schema (schema
evolution), the temporal modeling, on which the
second approach is based, designed to keep traces of
different DW changes in several versions; this is
what is called schema versioning approach. Au-
thors’ works follow two different ways: either the
dimension members historization or the full DW
schema historization.
Table 2 summarizes some works classified ac-
cording to some identified criteria in the context of
schema versioning.
Table 2: Comparative study between existing works based
on schema versioning approach (Zouari et al., 2008).
DW complete schema
evolution operations
Dimension insertion/deletion
Level Insertion/deletion
Hierarchy insertion/deletion
Measure insertion/deletion
Fact insertion/deletion
Dimension members
evolution operations
Dimension members insertion/deletion
Update of low attribute of dimension
members
Update of dimension members key
Dimension members subdivision/Fusion
(Body et al.,
2002 ; 2003)
(Eder et al.,
2001)
After examining works presented recently, it
seems that some authors were interested only in the
evolution of dimension members, others were inter-
ested in operations affecting the full DW diagram.
Obviously, those latter were also interested in keep-
ing the history of dimension members since they
treat the full DW diagram evolution problem.
We note that authors cited above neglected stud-
ying the evolution of the most dynamic part in a
multidimensional schema; it is the Fact compound
updates. Besides, alterations, affecting DW schema
and its related DMs, are manual. Therefore, pro-
posed approaches are limited to study what it must
be evolved and neglect how to evolve it; this gap
HowtoGuaranteeAnalysisResultsCoherenceafterDataWarehouseSchemaChangesPropagationtowardsDataMarts?
429
deserves to be studied to guarantee a reliable organi-
zation DSS.
2.3 Materialized Views Maintenance
Approach
The third evolution approach, called maintenance of
materialized views, considers a DW as a set of mate-
rialized views constructed from data sources. This
approach focuses on maintaining materialized views
in response to data changes or to data sources
changes and even to oversee the DW quality under
schema evolution. Research works, related to view
maintenance, can be classified in two categories:
View adaptation: this approach consists in
adapting views to changes by including metadata
containing structural views updates.
View synchronization: this approach consists
in determining legal rewritings for affected views.
Table 3: Comparative study of works related to material-
ized views maintenance.
Gupta et
al., 1995
Bellahsene
2002
Akaichi et
al., 2008
Quix
2005
View adaptation
View
synchronisation
DWQ


The DW administrator can bring modifications
directly to views independently of data sources; that
is called view adaptation. Furthermore, data sources
can change their schema; this leads to lose of the
coherence of materialized views. In this case, pre-
serving the DW structural consistency becomes a
must; this is called structural view maintenance.
Through the classification presented in the table
above, we note that the View adaptation approach
and the View synchronization approach are the focus
points of the majority of the presented works. De-
spite the great changes’ impacts on DW quality
(DWQ), this gap was only treated in (Quix, 2005).
3 MOTIVATIONS AND
POSITION
The DW technology was developed to integrate
heterogeneous information sources for analysis pur-
poses. Therefore, a
DW is always renewable follow-
ing changes that may affect its structure. These
changes can be the translation of organizational
business processes progressing over time, the evolv-
ing needs of decision makers that lead to DW struc-
ture enrichment with additional analyses axes, or
even of removing of decision makers’ needs vague-
ness occurred during the DW design stage.
According to related works discussed in section
2, the classification of approaches depends on the
DW schema definition. In fact, the DW can be de-
fined as a multidimensional schema (i.e. star or
snowflake schema) or as a set of materialized views.
For the multidimensional modeling, DW evolution
approaches cover both of the schema evolution ap-
proach and schema versioning approach. However,
for the materialized views modeling, the DW evolu-
tion approach includes the view adaptation and syn-
chronization.
In general, Data Warehousing is extremely corre-
lated to multidimensionality. Practically, the concept
“Data warehouse schema” orients designers to think
about “multidimensional modelling” more than
“materialized modelling”. Therefore, for maintain-
ing DW after schema change it is required to choose
one of the two schema approaches: the schema evo-
lution approach and the schema versioning ap-
proach. We compared the two approaches and we
found that the schema versioning approach is more
adequate than schema evolution approach for DW
schema maintenance. Indeed, the schema evolution
approach consists on updating the old schema and
keeping only the last schema version; this leads to
lose data over time and consequently the impossibil-
ity to model a continuous temporal evolution phe-
nomenon. On the contrary, thanks to the functionali-
ties offered by the schema versioning approach, the
evolutions history of all schema versions are kept;
this may resolve the problem of queries which re-
sponses are returned over several time intervals.
Another advantage of the versioning approach is
manifested in predicting the impacts of future evolu-
tions on the organisation development. This predic-
tion can’t be correctly derived only if it is based on
the light of the previous decisions making.
Past Future
T1 T2
AV 1. 2AV 1. 1
AV 2. n
AV 2. 1
RV1 RV2
Now
Legend:
RVi
: Real version
AVi .j
: Alterna tive vers ion
: Version derivation relationship
: Version validity time interval
Figure 1: Versioning approach architecture (Oueslati et al.,
2011).
Figure 1 presents the versioning approach archi
KEOD2014-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment
430
tecture. This approach aims to give the birth of a set
of DW versions in time. Some are called real ver-
sions (RV), whose role is to reflect changes occurred
in the real world, and others are called alternative
versions (AV) which present virtual business scenar-
ios.
RVi is a DW real version with i {1…, n}, AVi.j
is a child of RVi; it’s an alternative version and
RVi
AVi.j means that AVi.j is a subset of RVi.
Whatever the kind of the version, this latter has
its own time validity. The valid time interval of RVi
is designed by VTi [tb(RVi), te(RVi)] as well as the
valid time interval of AVi.j is designed by VTi.j
[tb(AVi.j), te(AVi.j)].
This paper proposes an approach that ensures the
coherence and consistency of analysis results and
consequently the organization durability and materi-
al prosperity. This requires relying on an reliable
DSS that ensures the automation of evolution pro-
cess tasks, and historization of previous evolutions.
4 APPROACH OVERVIEW
Organization business process evolution leads to the
emergence of new data that must be analyzed for
decision making. Consequently, their integration
into the DSS becomes necessary in order to be ana-
lyzed. So, we propose to translate alterations into a
set of comprehensive evolution operations to be
applied on DW and therefore automatically propa-
gated towards the associated DMs. Besides, our
approach offers the possibility to keep traces of
previous occurred evolutions in several versions.
Initial DW schema
Initial DM schema
DW evolution
operations
Modified DW schema
Propagation
rules
Modified DM schema
DM evolution
operations
Mapping
Mapping
Input
Input
Output
Output
Initial
version
Derived
version
Evolution process
Figure 2: Proposed evolution approach architecture.
The overall proposed approach architecture is
described in figure 2.
The proposed approach evolution process is
composed of three steps:
Identifying operations that can affect an initial
DW schema to give the birth of several new DW
versions (modified DW schemas); they are the
DW derived versions.
Defining a set of “if_then” type rules in order
to identify updates affecting associated DM
schemas. These rules take as input the type of the
evolution operations (addition, deletion) applied
on the initial DW model, the affected elements
(table, column) and the various mappings
DW/DMs. As output, rules give back operations
which must be applied on DM schemas. This step
presents the core of the evolution process.
Applying the generated DM evolution opera-
tions on an initial DM schema gives the birth of
several new DM versions (modified DM sche-
mas); they are the DM derived versions.
Our approach advantage is that the horizontal
evolution, covering multiple born DW/DMs ver-
sions, offers the possibility to model a continuous
temporal evolution phenomenon since it keeps traces
of different changes affecting the organization DSS
overtime.
5 PROPAGATION RULES:
MEDICAL CASE STUDY
To illustrate the different cases of changes occurred
on DW schema and their impacts on related DMs,
we rely on a DW relational model. Figure 3 shows a
medical DW example constructed from tables which
are interconnected with constraints. Figure 4 is a
DM star schema called “Analyzing patient consulta-
tion” that we built from the medical DW of Figure 3.
Patient
Patcode
Patfirstname
Patlastname
Patgender
Patadress
Pattel
Patmaritalstatus
Patdatbirth
Consultation
Conscode
Patcode#
Doctorcode #
Consdate
Conscosts
Doctor
Doctorcode
Doctorname
Doctorgender
Doctordegree
Doctortel
Hiredate
Servicecode#
Service
Servicecode
Servicedesign
Figure 3: A medical DW example.
F_Consultation
Conscode
Consdate
Patcode#
Doctorcode #
Conscosts
D_Patient
Pat_cod
Pat_dat_birth
Pat_first_name
D_Date
Lib_Month
D_Doctor
Pat_last_name
Doctor_name
Lib_day
Pat_gender
Pat_marital_status
Pat_address
Pat_tel
Service_Design
Figure 4: “Analyzing patient consultation” DM built from
the DW of Figure 3.
To define rules that ensure the changes propaga-
tion, we use the following notations:
T
D
: A table T of DW which feeds a dimension D,
HowtoGuaranteeAnalysisResultsCoherenceafterDataWarehouseSchemaChangesPropagationtowardsDataMarts?
431
T
F
: A table T of DW which feeds a fact F,
T
H
: A table T of DW which feeds a hierarchy H,
T
id
: The identifier of a table T.
5.1 Addition Propagation Rules
In this section, we are inspired from our work in
(Azaiez et al., 2013) which is based on schema evo-
lution approach principles. However, the current
work takes another way that leads to more effective
analysis results. Indeed, we ameliorate the few rules
proposed in the previous work, define other propa-
gation rules that include evolution operations which
seem important and deserve to be studied, and we
illustrate all of them by a medical case study accord-
ing to the schema versioning approach principles.
In this section, we define the addition propaga-
tion rules for two cases: addition of a new table and
addition of a new column to the DW.
5.1.1 Table Addition
The addition of a new table T to DW can feed a new
fact F, a new dimension D or a new hierarchy H.
We define a rule for each case.
Rule R
AT
1. T addition can create a new fact F
If T references several tables loading different
dimensions (T
D1
, T
D2
...T
Dn
) of DM, if T is not refer-
enced by any table of the DW, if T contains one or
several additive column(s) and if the primary key of
T contains foreign key(s), then T can create a new
fact F. Consequently, the dimensions of F are T
D1
,
T
D2
...T
Dn
. For example, it is proposed to add the table
Hospitalisation (Hospital-code, Pat-code#, Arrival-
date, Release-date) to DW. Hospitalisation contains
one additive column which focuses on the nights
number and generated from the formula (Release
date - Arrival date). The table Hospitalisation refers
to the table Patient that feeds the D_Patient dimen-
sion and it is not referenced by any table of the DW.
According to the rule R
AT
1, the table Hospitalisation
can create a new fact conventionally called
F_Hospitalisation and consequently the emergence
of a new DM star schema related to the DW;
D_Patient is a dimension F_Hospitalisation.
Rule R
AT
2. T addition can create a new dimension
D
If T is referenced by T
F
that
feeds a fact F, if T
id
is atomic and if T contains columns that can be di-
mensional attributes (strong or weak), then T feeds a
new dimension D for F. Suppose that we add the
table Room (Room-code, Room-category, one-night-
price) to the DW by connecting it to the table Hospi-
talisation that feeds the fact F_Hospitalisation. As
the Room table contains an atomic identifier (Room-
code), a column that may become a parameter
(Room-category) and a column that can become
weak attribute (one-night-price), it satisfies the rule
R
AT
2. Therefore, it transforms into a new dimension
D_Room for the fact F_Hospitalisation.
Rule R
AT
3. T addition can create a new hierarchy
H
If T is referenced by T
D
which feeds a dimension
D, if T doesn’t refer any table, if T
id
is atomic and if
T doesn’t contain additive column(s), then T
completes the dimension D with a hierarchy H by
connecting D
id
to the attribute T
id
. Potential weak
attributes of T
id
parameter are the textual attributes
of T. For example, the addition of the table Disease
(Disease-code, Disease-design) which is referenced
by Patient (feeds the dimension D_Patient),
completes the dimension D_Patient with a new
hierarchy H_Disease (Pat-code, Disease-code). As
Disease-design is a textual attribute, it is considered
as a weak attribute of Disease-code.
We propose to express the principle of schema
versioning approach in the case of a new table T is
added to the DW model. We suppose that the basic
DW version is the model presented in figure 3; it’s
the RV1 of the DW model. We propose to add new
DW versions following the applying of R
AT
1, R
AT
2
and R
AT
3 rules. In general, the designer must choose
to create a new alternative version (AV) either fol-
lowing another AV or following a Real version (RV);
however, in the case of R
AT
1, relying on the first
solution is a must. Indeed, the R
AT
1 output is to cre-
ate a new fact table F_Hospitalisation following the
addition of the table Hospitalisation to the DW
schema; this requires the creation of a new AV1.1.
Then, we propose to apply the rule R
AT
2. The R
AT
2
output is to create a new dimension D_Room fol-
lowing the addition of the table Room to the DW
schema, and this new dimension can’t be approved
only if the table Room is related to a T
F
in DW mod-
el. That’s why, in order to reveal the evolution oc-
curred following the application of R
AT
2, we are
obliged to create another AV1.2 sequentially follow-
ing the AV1.1 which shows the evolution happened
after applying R
AT
1. The rule R
AT
3 can be applied
with different manners. Indeed, in this case, the
designer isn’t obliged to create a new AV sequential-
ly following the previous one; the new AV can fol-
low the RV since applying R
AT
3 gives the birth of a
new hierarchy whose related dimension have already
existed in the RV1. In order to express evolutions
KEOD2014-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment
432
occurred due to applying the three rules in the same
schema, we choose to give the birth a new AV1.3,
that contains the R
AT
3 applying results, sequentially
following the previous one. As consequence, figure
5 presents a part of AV1.3 containing AV1.2 enriched
with a new table Disease. Then, a new DM version
related to the AV1.3 is created (Figure 6). It is a star
schema which consists of the fact table
F_Hospitalisation referencing a set of dimensions
including the new dimension D_Room and the new
hierarchy H_Disease.
Patient
Patcode
Patfirstname
Patlastname
Patgender
Patadress
Pattel
Patmaritalstatus
Patdatbirth
Diseasecode#
Consultation
Conscode
Patcode#
Doctorcode #
Consdate
Conscosts
Do
c
Do
c
Do
c
Do
c
Do
c
Hir
e
Ser
v
Hospitalisation
Hospitalcode
Patcode#
Roomcode#
Arrivaldate
Releasedate
Room
Roomcode
Roomcategory
Onenightprice
Disease
Disease code
Diseasedesign
T feeds a new fact
F_Hospitalisation (R
AT
1)
T feeds a new dimension
D_Room(R
AT
2)
T feeds a new hierarchy
H_Disease (R
AT
3)
Figure 5: AV1.3: Addition of Hospitalisation, Room and
Disease tables to the DW.
F_Hospitalisation
Hospitalcode
Patcode#
Roomcode#
Arrivaldate
Releasedate
D_Patient
Pat_cod
Pat_dat_birth
Pat_first_n am e
D_Date
Lib_Mon th
D_Room
Pat_last_name
Lib_day
Pat_gender
Pat_marital_status
Pat_address
Pat_tel
New dimension D_Room
(R
A
T
2)
New fact F_Hospitalisation
(R
AT
1)
New hierarchy
H_Disease(R
AT
3)
One-night-price
Figure 6: New Star schema version: “Analyzing patient
hospitalization” built from AV1.3.
5.1.2 Column Addition
The addition of a column C to an existing DW can
enrich it with a new measure M, an existing Dimen-
sion with a new Attribute DA or a fact with a new
dimension D.
We define a rule for each case.
Rule R
AC
1. C addition can create a new measure
M
If the column C is additive and is added to a ta-
ble T
F
, then C presents a measure for F in the DM.
For example, it is proposed to add the column Total-
nights-costs to the table Hospitalisation that feeds
the fact F_Hospitalisation. As the column Total-
nights-costs satisfies the rule R
AC
1, it feeds a new
measure called Total-nights-costs for the fact
F_Hospitalisation.
Rule R
AC
2. C addition can create a new dimension
attribute DA
If C is added to a table T
D
and if C is non-
additive, then C is considered as an attribute for the
dimension D in the DM. The choice of the attribute
role (weak or strong) is decided by the designer. For
example, we suppose to add the non additive column
Room-loc to the table Room that feeds the dimen-
sion D_Room. The column Room-loc satisfies the
rule R
AC
2, so it feeds an attribute (strong) for the
dimension D_Room.
Rule R
AC
3. C addition can create a new dimension
D
If we add C of date type to T
F
, and if T
F
doesn’t
contain columns of date type, then C can feed a
temporal dimension in the related DM of F. For
example, we propose to add the column Date to the
table Consultation (Figure 3); however, it contains a
column of date type. Therefore, the addition of the
Date column does not have any effect neither on
DW nor on DM as it doesn’t satisfies the rule R
AC
3
conditions. Moreover, we propose to add the table
Laboratory_Exam (Exam-code, Pat-code#, Exam-
type, Exam-costs) to the DW. The table Exam satis-
fies the conditions of R
AT
1. This leads to the creation
of a new fact so called F_Laboratory_Exam and,
consequently, the emergence of a new DM called
Analyzing patient laboratory exam” related to the
DW model; D_Patient is a dimension connected to
F_Laboratory_Exam. This latter doesn’t contain
columns of date type. According to the rule R
AC
3, if
we add a column Exam-date to the table Laborato-
ry_Exam, then the column Exam-date enriches the
DM containing the fact F_Laboratory_Exam by
D_Date dimension.
5.2 Deletion Propagation Rules
In this section, we define the deletion propagation
rules for two cases: deletion of an existing table and
the deletion of an existing column from the DW.
5.2.1 Ta ble Delet ion
The deletion of an existing table T from DW can
lead to the elimination of an existing fact table F, an
existing dimension table D or even a hierarchy H.
We define a rule for each case.
HowtoGuaranteeAnalysisResultsCoherenceafterDataWarehouseSchemaChangesPropagationtowardsDataMarts?
433
Rule R
DT
1. T deletion can eliminate an existing
fact F
If the table T
F
is eliminated from DW, then the
fact table F will be automatically eliminated as well
as the DM containing F. This deletion doesn’t lead
to the deletion of all dimensions; indeed shared
dimensions will be maintained. In our example, if
we delete the Laboratory_Exam table, then
F_Laboratory_Exam will be eliminated as well as
the DM
Analyzing patient laboratory exam”.
D_Patient dimension which is a common dimension
for F_Laboratory_Exam and F_Hospitalisation,
will be maintained; any other specific dimensions,
such as D_Date, will be eliminated.
Rule R
DT
2. T deletion can eliminate an existing
dimension D
If the table T
D
is eliminated from DW, then the
dimension table D will be eliminated from all DMs
containing it. In our example, as the Patient table
feeds only one dimension, its elimination from DW
schema leads to the deletion of D_Patient from all
DMs.
Rule R
DT
3. T deletion can eliminate an existing
hierarchy H
If the table T
H
is eliminated from DW, then the
hierarchy H will be eliminated from the dimension
D which contains it. In our example, if we eliminate
the table Disease which enriches the dimension
D_Patient with the hierarchy H_Disease, then this
latter will be deleted from every DM contains the
dimension D_Patient.
5.2.2 Column Deletion
The deletion of an existing column C from a DW
table can lead to the elimination of an existing
pa-
rameter P, an existing hierarchy H,
an existing
measure M or even an existing dimension D.
We define a rule for each case.
Rule R
DC
1. C deletion can eliminate an existing
parameter P
If we delete a non additive column C which is
not a primary key from a table T
D
of DW schema,
then its corresponding element in DM will be delet-
ed from the dimension D which is fed by T
D
. Thus,
in our current example, the column Pat-marital-
status of the table Patient is not a primary key and it
is not an additive attribute. So, its deletion leads to
the elimination of the corresponding parameter Pat-
marital-status from the dimension D_Patient.
If the column C of a table T
D
supports a referen-
tial constraint toward another table T'
D
of the DW,
then C is a hierarchical level for D. Two cases can
arise depending on whether or not T'
D
refers to an-
other table. In the following rule, we
restrict our
study to the case that T'
D
doesn’t refer to another
table (end of hierarchy). In this context, we define
the R
DC
2 rule.
Rule R
DC
2. C deletion can eliminate an existing
hierarchy H
The removal of
C which satisfies conditions cit-
ed above, lead to the deletion of the hierarchy
TD
Id
TD’
Id
from DM(s) that contains it. As an
illustration, if we eliminate the Service-code column
from the table Doctor (Figure 3) which feeds the
D_Doctor dimension (Figure 4), then the corre-
sponding parameter of Service-code column which
presents a hierarchy level in D_Doctor as well as its
weak parameter Service-design, will be deleted.
Rule R
DC
3. C deletion can eliminate an existing
measure M
If a numeric additive column C is removed from
a table T
F
of the DW, then the measure M which
corresponds to C will be eliminated from F. In our
example, if the column Total-night-costs is deleted
from the table Hospitalisation, then the measure
called Total-night-costs will be deleted from
F_Hospitalisation.
Rule R
DC
4. C deletion can eliminate an existing
dimension D
If the column C supports a referential constraint
toward a table T
D
and if it is deleted from T
F
, then
the D dimension will be deleted from the DM which
contains F. In our example, the column Patient-code
presents a foreign key in the table Hospitalisation
which feeds the fact F_Hospitalisation in DM. So,
its removal leads to the elimination of the dimension
D_Patient from the DM called Analyzing patient
hospitalization”.
6 COMPARATIVE STUDY
Our approach is proposed to solve some gaps of
previous works. The idea is to exploit versioning
approach functionalities to keep traces of occurred
changes propagated from DW towards DMs; this
leads to coherent analysis results. Concerning evolu-
tion operations, we focus on applying propagation
rules for two cases: tables and column addition and
deletion to DW. On the contrary, in (Azaiez et al.,
KEOD2014-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment
434
2013), we investigated the problem of DW evolution
only in the case of tables and columns addition.
Besides, we chose the schema evolution approach as
the base of the work.
The comparative study express that the current
proposed approach offers coherent analysis results
unlike results given in (Azaiez et al., 2013). In fact,
in this latter, queries are unable to return data which
are the results of an evolution phenomenon contin-
ued in several time intervals, since the schema evo-
lution approach is based on the hypothesis that the
DW schema has only one version; it’s the current
one.
The following table compares of our proposed
approach versus the previous one:
Table 4: Comparative study.
(Azaiez et
al., 2013)
Our
approach
Evolution
approach
Schema Evolution
Schema Versioning
Evolution
operations
Addition
tables/columns
Deletion
tables/columns
7 CONCLUSION AND
PERSPECTIVE
In this paper, we presented an overview on the DW
evolution problems. Indeed, we exposed some solu-
tions proposed by different authors in recent years.
To overcome the problem related of the DW schema
changes and their impacts on DMs, we proposed an
approach which deals with the propagation problem
of DW changes on its DMs; this approach is based
on "if-then" type rules. However, this is not enough
to ensure the analysis results coherence and con-
sistency. Therefore, we relied on the schema ver-
sioning approach to keep trace of evolutions affect-
ing DW model and their impacts on related DMs.
This paper is limited at studying the evolution
modeling of classic DWs that includes data which
concerned only fixed objects, and neglected moving
objects activities that generate a new data type so
called “trajectory data”; those latter are stored in a
mobile data central repository that called Trajectory
Data Warehouse (TDW). As perspective, we pro-
pose to deal with the TDW evolution problems tak-
ing into account its new data type and structure
changes.
REFERENCES
Akaichi, J., Oueslati, W., 2008. MAVIE: A Mobile Agents
View synchronization system. In first international
conference on the applications of digital information
and web technology (pp. 145-150).Ostravem.
Azaiez, N., Taktak, S., Feki, J., 2013. DWEV : Un proto-
type pour l'évolution partielle du schéma multidimen-
sionnel. In 7éme édition de la Conférence Maghrébine
sur les Avancées des systèmes décisionnels (ASD),
Marrakech, Maroc.
Bellahsene, Z., 2002. Schema Evolution in Data Ware-
houses. Journal of Knowledge and Information Sys-
tems, 4 (3) (pp. 283-304).
Body, M., Miquel M., Bédard, Y., Tchounikine, A., 2003.
Handling Evolutions in Multidimensional Structures.
In IEEE 19
th
International Conference on Data Engi-
neering (ICDE) (pp. 581-591). Bangalore, India.
Body, M., Miquel, M., Bédard, Y., Tchounikine, A., 2002.
A multidimensional and multiversion structure for
OLAP applications. In Proceedings of the 5
th
ACM In-
ternational Workshop on Data Warehousing and
OLAP (pp. 1-6). McLean, Virginia, USA.
Eder, J., Koncilia, C., 2001. Changes of Dimension Data
in Temporal Data Warehouses. In Proceedings of the
DaWaK’01 Conference, (pp. 284-293). Munich, Ger-
many.
Gupta, A., Mumick, I., Ross, K., 1995. Adapting Material-
ized Views after redefinitions. SIGMOD 95, (pp. 211-
222).
Hurtado, C. A., Mendelzon, A. O., Vaisman, A. A., 1999.
Maintaining Data Cubes under Dimension Updates. In
XVth International Conference on Data Engineering
(ICDE 1999), IEEE Computer Society, (pp.346–
355).Sydney.
Papastefanatos, G., Vassiliadis, PP., Simitsis, A., Sellis,
T., Vassiliou, Y., 2009. Rulebased Management of
Schema Changes at ETL Sources. In The International
Workshop on Managing Evolution of Data Ware-
houses (MEDWa), Riga, Latvia.
Quix, C., 2004. Repository Support for Data Warehouse
Evolution. In Proceedings of the International Work-
shop DMDW, Heidelberg, Germany.
Taktak, S., Feki, J., 2012. Toward Propagating the Evolu-
tion of Data Warehouse on Data Marts. In: MEDI
2012. Lecture Notes in Computer Science, Vol. 7602,
Springer Verlag, Berlin Heidelberg (pp. 178–185).
Poitiers, France.
Zouari, I., Ghozzi, F., Bouaziz, R., 2008. Impact de
l’évolution de nomenclature sur le versionnement des
entrepôts de données. Ingénierie des Systèmes
d'Information, volume 13, (pp. 85-114).
Oueslati, W., Akaichi, J., 2011. A Multiversion Trajectory
Data Warehouse to Handle Structure Changes, Inter-
national Journal of Database Theory and Application,
Vol. 4, No. 2. (pp. 35-50).
HowtoGuaranteeAnalysisResultsCoherenceafterDataWarehouseSchemaChangesPropagationtowardsDataMarts?
435