How to Guarantee Analysis Results Coherence after Data Warehouse

Schema Changes Propagation towards Data Marts?

Noura Azaiez and Jalel Akaichi

Department of Computer Science, ISG-University of Tunis, Le Bardo, Tunisia

Keywords: Data Warehouse, Evolution Operations, Data Mart, Propagation Rules, Schema Versioning.

Abstract. Data Warehouse, accompanied with Online analytical processing, is considered as the core of the modern

Decision support systems. The emergence of new analytical requirements and changes in organization busi-

ness processes push the underlying information sources, destined to feed the data warehouse, to modify not

only their data, but also their structure. This, obviously, has a direct impact on Data Warehouse and its asso-

ciated Data Marts. Maintaining Data Warehouse structure becomes, therefore, a must; however, it is not suf-

ficient. In fact, evolutions performed on the Data Warehouse schema have to be propagated on the related

Data Marts in order to minimize costs, time-consuming and to guarantee the coherence of provided analysis

results; this presents our first vision issue for which, we aim to provide an adequate solution. Another issue,

which is as important as the precedent one, focuses on modeling a continuous temporal evolution phenome-

non and therefore reducing inconsistent Online analytical processing queries results. Indeed, data returned

by queries can be the result of an evolution phenomenon continued in several time intervals. Therefore, we

nominate the versioning approach as a solution to keep traces of Data Warehouse / Data Mart schemas’

modifications. Solving these two issues presents the key of organization Decision support systems durability

and its material prosperity.

1 INTRODUCTION

As its data are often scattered and unstructured for

analysis, the operational information systems seem

inadequate for decision making. Toward this inade-

quacy, the

Data Warehouse (DW) technology has

emerged to collect and restructure data with the aim

to be the process of a good decision making. Due to

the continual evolving of decision makers’ needs

and the emergence of new business processes over

time, organization operational system will be affect-

ed and therefore must include new data. To analyze

these latter, their integration into the Decision Sup-

port Systems (DSS) becomes necessary. This, obvi-

ously, has a direct impact on DW. So, it must be

renewable and adaptable to all changes that may

occur; however, this is not sufficient to ensure the

coherence of query results. Indeed, the strong de-

pendence between the DSS components reveals the

necessity to propagate the changes performed on

DW towards its Data Marts (DMs). To achieve this

propagation, we define a set of “if-then” type rules

in order to identify the evolution operations that can

affect DMs following DW structure evolutions. In

the literature, works dealing with the DW evolution

problem can be classified into three different ap-

proaches namely schema evolution, schema version-

ing and View Maintenance. Our goal is to express

the improvement of DW schema over time. So, we

proposed a solution based on the versioning ap-

proach as it keeps traces of schemas changes

through time.

In this paper, we discussed two important issues:

the first one focuses on how to propagate DW sche-

ma evolution towards DM; the second issue express-

es how the versioning approach can be the best solu-

tion to guarantee consistency and coherence of

Online Analytical Processing (OLAP) queries re-

sults.

This position paper is organized as follows. In

Section 2, we provide an overview of related works

to the DW evolution. Section 3 presents motivations

and our position. Section 4 describes our proposed

evolution approach. In section 5, we express our

proposed approach efficiency by applying a set of

propagation rules illustrated by a medical case study.

Section 6 compares our proposed approach to anoth-

er existing one. We conclude the paper in section 7.

428

Azaiez N. and Akaichi J..

How to Guarantee Analysis Results Coherence after Data Warehouse Schema Changes Propagation towards Data Marts?.

DOI: 10.5220/0005158304280435

In Proceedings of the International Conference on Knowledge Engineering and Ontology Development (KEOD-2014), pages 428-435

ISBN: 978-989-758-049-9

 2014 SCITEPRESS (Science and Technology Publications, Lda.)

2 STATE OF THE ART

Solving DW schema evolution problems is a compu-

tational challenge in the midst of the continuous

growth of technologies. Indeed, managing correctly

all type of changes, which affect an organization

DSS, can reflect the organization real world and

therefore, guarantee its durability.

The literature is interested in the evolution prob-

lem and proposes solutions leading to better deci-

sions making. These solutions are based on three

approaches: Schema evolution, schema versioning

and maintenance of materialized views.

2.1 Schema Evolution Approach

This approach is based on the assumption that the

DW schema has only one version, the current one.

Changes, which can affect DW schema, are translat-

ed into evolution operations updating the schema

structure and the associated instances.

Following the study of a sample of works related

to the DW schema evolution problem, we present a

comparative study based on a set of relevant criteria.

Table 1: Comparative study between existing works based

on schema evolution approach.

(Hurtado

et al.,

1999)

(Papastef-

anatos et

al., 2009)

(Taktak

et al.,

2012)

(Azaiez

et al.,

2013)

DW schema evolution

Dimensions

Hierarchies

Fact tables

Instances evolution

Materialized views

evolution

ETL evolution

Evolution impact

DW/DMs

Conformity to Meta-

model

Prototype

HECATA

-EUS

DWEv



Most of presented works treat different evolution

operations related to various DW components such

as dimensions and hierarchies. Some authors studied

the effect of evolution operations on instances and

materialized views (Hurtado et al., 1999), or even on

the process of ETL (Papastefanatos et al., 2009).

Others confirmed that the evolution problem can’t

be completely resolved only with the changes’ full

implementation on the meta-model level. That’s

why; they deepened their research to investigate

conformity of DW schema to its meta-model (Tak-

tak et al., 2012).

In general, the changes applied on DW schema

and on its DMs are manual; Taktak et al. (2012) and

Azaiez et al. (2013) resolved this gap. In fact, they

proposed approaches capable to identify the impact

of the DW evolutions on associated DMs.

However, the common drawback, that gathers all

these works, is the impossibility to model a continu-

ous temporal evolution phenomenon since the prin-

ciple of the schema evolution approach avoid to

keep the previous evolutions history.

2.2 Schema Versioning Approach

Unlike the principle of updating schema (schema

evolution), the temporal modeling, on which the

second approach is based, designed to keep traces of

different DW changes in several versions; this is

what is called schema versioning approach. Au-

thors’ works follow two different ways: either the

dimension members historization or the full DW

schema historization.

Table 2 summarizes some works classified ac-

cording to some identified criteria in the context of

schema versioning.

Table 2: Comparative study between existing works based

on schema versioning approach (Zouari et al., 2008).

DW complete schema

evolution operations

Dimension insertion/deletion

Level Insertion/deletion

Hierarchy insertion/deletion

Measure insertion/deletion

Fact insertion/deletion

Dimension members

evolution operations

Dimension members insertion/deletion

Update of low attribute of dimension

members

Update of dimension members key

Dimension members subdivision/Fusion

(Body et al.,

2002 ; 2003)



(Eder et al.,

2001)



After examining works presented recently, it

seems that some authors were interested only in the

evolution of dimension members, others were inter-

ested in operations affecting the full DW diagram.

Obviously, those latter were also interested in keep-

ing the history of dimension members since they

treat the full DW diagram evolution problem.

We note that authors cited above neglected stud-

ying the evolution of the most dynamic part in a

multidimensional schema; it is the Fact compound

updates. Besides, alterations, affecting DW schema

and its related DMs, are manual. Therefore, pro-

posed approaches are limited to study what it must

be evolved and neglect how to evolve it; this gap

HowtoGuaranteeAnalysisResultsCoherenceafterDataWarehouseSchemaChangesPropagationtowardsDataMarts?

429

deserves to be studied to guarantee a reliable organi-

zation DSS.

2.3 Materialized Views Maintenance

Approach

The third evolution approach, called maintenance of

materialized views, considers a DW as a set of mate-

rialized views constructed from data sources. This

approach focuses on maintaining materialized views

in response to data changes or to data sources

changes and even to oversee the DW quality under

schema evolution. Research works, related to view

maintenance, can be classified in two categories:

 View adaptation: this approach consists in

adapting views to changes by including metadata

containing structural views updates.

 View synchronization: this approach consists

in determining legal rewritings for affected views.

Table 3: Comparative study of works related to material-

ized views maintenance.

Gupta et

al., 1995

Bellahsene

2002

Akaichi et

al., 2008

Quix

2005

View adaptation

View

synchronisation

DWQ





The DW administrator can bring modifications

directly to views independently of data sources; that

is called view adaptation. Furthermore, data sources

can change their schema; this leads to lose of the

coherence of materialized views. In this case, pre-

serving the DW structural consistency becomes a

must; this is called structural view maintenance.

Through the classification presented in the table

above, we note that the View adaptation approach

and the View synchronization approach are the focus

points of the majority of the presented works. De-

spite the great changes’ impacts on DW quality

(DWQ), this gap was only treated in (Quix, 2005).

3 MOTIVATIONS AND

POSITION

The DW technology was developed to integrate

heterogeneous information sources for analysis pur-

poses. Therefore, a

DW is always renewable follow-

ing changes that may affect its structure. These

changes can be the translation of organizational

business processes progressing over time, the evolv-

ing needs of decision makers that lead to DW struc-

ture enrichment with additional analyses axes, or

even of removing of decision makers’ needs vague-

ness occurred during the DW design stage.

According to related works discussed in section

2, the classification of approaches depends on the

DW schema definition. In fact, the DW can be de-

fined as a multidimensional schema (i.e. star or

snowflake schema) or as a set of materialized views.

For the multidimensional modeling, DW evolution

approaches cover both of the schema evolution ap-

proach and schema versioning approach. However,

for the materialized views modeling, the DW evolu-

tion approach includes the view adaptation and syn-

chronization.

In general, Data Warehousing is extremely corre-

lated to multidimensionality. Practically, the concept

“Data warehouse schema” orients designers to think

about “multidimensional modelling” more than

“materialized modelling”. Therefore, for maintain-

ing DW after schema change it is required to choose

one of the two schema approaches: the schema evo-

lution approach and the schema versioning ap-

proach. We compared the two approaches and we

found that the schema versioning approach is more

adequate than schema evolution approach for DW

schema maintenance. Indeed, the schema evolution

approach consists on updating the old schema and

keeping only the last schema version; this leads to

lose data over time and consequently the impossibil-

ity to model a continuous temporal evolution phe-

nomenon. On the contrary, thanks to the functionali-

ties offered by the schema versioning approach, the

evolutions history of all schema versions are kept;

this may resolve the problem of queries which re-

sponses are returned over several time intervals.

Another advantage of the versioning approach is

manifested in predicting the impacts of future evolu-

tions on the organisation development. This predic-

tion can’t be correctly derived only if it is based on

the light of the previous decisions making.

Past Future

T1 T2

AV 1. 2AV 1. 1

AV 2. n

AV 2. 1

RV1 RV2

Now

Legend:

RVi

: Real version

AVi .j

: Alterna tive vers ion

: Version derivation relationship

: Version validity time interval

Figure 1: Versioning approach architecture (Oueslati et al.,

2011).

Figure 1 presents the versioning approach archi

KEOD2014-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment

430

tecture. This approach aims to give the birth of a set

of DW versions in time. Some are called real ver-

sions (RV), whose role is to reflect changes occurred

in the real world, and others are called alternative

versions (AV) which present virtual business scenar-

ios.

RVi is a DW real version with i ∈ {1…, n}, AVi.j

is a child of RVi; it’s an alternative version and

RVi



AVi.j means that AVi.j is a subset of RVi.

Whatever the kind of the version, this latter has

its own time validity. The valid time interval of RVi

is designed by VTi [tb(RVi), te(RVi)] as well as the

valid time interval of AVi.j is designed by VTi.j

[tb(AVi.j), te(AVi.j)].

This paper proposes an approach that ensures the

coherence and consistency of analysis results and

consequently the organization durability and materi-

al prosperity. This requires relying on an reliable

DSS that ensures the automation of evolution pro-

cess tasks, and historization of previous evolutions.

4 APPROACH OVERVIEW

Organization business process evolution leads to the

emergence of new data that must be analyzed for

decision making. Consequently, their integration

into the DSS becomes necessary in order to be ana-

lyzed. So, we propose to translate alterations into a

set of comprehensive evolution operations to be

applied on DW and therefore automatically propa-

gated towards the associated DMs. Besides, our

approach offers the possibility to keep traces of

previous occurred evolutions in several versions.

Initial DW schema

Initial DM schema

DW evolution

operations

Modified DW schema

Propagation

rules

Modified DM schema

DM evolution

operations

Mapping

Input

Output

Initial

version

Derived

version

Evolution process

Figure 2: Proposed evolution approach architecture.

The overall proposed approach architecture is

described in figure 2.

The proposed approach evolution process is

composed of three steps:

 Identifying operations that can affect an initial

DW schema to give the birth of several new DW

versions (modified DW schemas); they are the

DW derived versions.

 Defining a set of “if_then” type rules in order

to identify updates affecting associated DM

schemas. These rules take as input the type of the

evolution operations (addition, deletion) applied

on the initial DW model, the affected elements

(table, column) and the various mappings

DW/DMs. As output, rules give back operations

which must be applied on DM schemas. This step

presents the core of the evolution process.

 Applying the generated DM evolution opera-

tions on an initial DM schema gives the birth of

several new DM versions (modified DM sche-

mas); they are the DM derived versions.

Our approach advantage is that the horizontal

evolution, covering multiple born DW/DMs ver-

sions, offers the possibility to model a continuous

temporal evolution phenomenon since it keeps traces

of different changes affecting the organization DSS

overtime.

5 PROPAGATION RULES:

MEDICAL CASE STUDY

To illustrate the different cases of changes occurred

on DW schema and their impacts on related DMs,

we rely on a DW relational model. Figure 3 shows a

medical DW example constructed from tables which

are interconnected with constraints. Figure 4 is a

DM star schema called “Analyzing patient consulta-

tion” that we built from the medical DW of Figure 3.

Patient

Pat‐code

Pat‐first‐name

Pat‐last‐name

Pat‐gender

Pat‐adress

Pat‐tel

Pat‐marital‐status

Pat‐dat‐birth

Consultation

Cons‐code

Pat‐code#

Doctor‐code #

Cons‐date

Cons‐costs

Doctor

Doctor‐code

Doctor‐name

Doctor‐gender

Doctor‐degree

Doctor‐tel

Hire‐date

Service‐code#

Service

Service‐code

Service‐design

Figure 3: A medical DW example.

F_Consultation

Cons‐code

Cons‐date

Pat‐code#

Doctor‐code #

Cons‐costs

D_Patient

Pat_cod

Pat_dat_birth

Pat_first_name

D_Date

Lib_Month

D_Doctor

Pat_last_name

Doctor_name

Lib_day

Pat_gender

Pat_marital_status

Pat_address

Pat_tel

Service_Design

Figure 4: “Analyzing patient consultation” DM built from

the DW of Figure 3.

To define rules that ensure the changes propaga-

tion, we use the following notations:

 T

: A table T of DW which feeds a dimension D,

HowtoGuaranteeAnalysisResultsCoherenceafterDataWarehouseSchemaChangesPropagationtowardsDataMarts?

431

 T

: A table T of DW which feeds a fact F,

 T

: A table T of DW which feeds a hierarchy H,

 T

: The identifier of a table T.

5.1 Addition Propagation Rules

In this section, we are inspired from our work in

(Azaiez et al., 2013) which is based on schema evo-

lution approach principles. However, the current

work takes another way that leads to more effective

analysis results. Indeed, we ameliorate the few rules

proposed in the previous work, define other propa-

gation rules that include evolution operations which

seem important and deserve to be studied, and we

illustrate all of them by a medical case study accord-

ing to the schema versioning approach principles.

In this section, we define the addition propaga-

tion rules for two cases: addition of a new table and

addition of a new column to the DW.

5.1.1 Table Addition

The addition of a new table T to DW can feed a new

fact F, a new dimension D or a new hierarchy H.

We define a rule for each case.

 Rule R

1. T addition can create a new fact F

If T references several tables loading different

dimensions (T

, T

...T

) of DM, if T is not refer-

enced by any table of the DW, if T contains one or

several additive column(s) and if the primary key of

T contains foreign key(s), then T can create a new

fact F. Consequently, the dimensions of F are T

...T

. For example, it is proposed to add the table

Hospitalisation (Hospital-code, Pat-code#, Arrival-

date, Release-date) to DW. Hospitalisation contains

one additive column which focuses on the nights

number and generated from the formula (Release

date - Arrival date). The table Hospitalisation refers

to the table Patient that feeds the D_Patient dimen-

sion and it is not referenced by any table of the DW.

According to the rule R

1, the table Hospitalisation

can create a new fact conventionally called

F_Hospitalisation and consequently the emergence

of a new DM star schema related to the DW;

D_Patient is a dimension F_Hospitalisation.

 Rule R

2. T addition can create a new dimension

If T is referenced by T

that

feeds a fact F, if T

is atomic and if T contains columns that can be di-

mensional attributes (strong or weak), then T feeds a

new dimension D for F. Suppose that we add the

table Room (Room-code, Room-category, one-night-

price) to the DW by connecting it to the table Hospi-

talisation that feeds the fact F_Hospitalisation. As

the Room table contains an atomic identifier (Room-

code), a column that may become a parameter

(Room-category) and a column that can become

weak attribute (one-night-price), it satisfies the rule

2. Therefore, it transforms into a new dimension

D_Room for the fact F_Hospitalisation.

 Rule R

3. T addition can create a new hierarchy

If T is referenced by T

which feeds a dimension

D, if T doesn’t refer any table, if T

is atomic and if

T doesn’t contain additive column(s), then T

completes the dimension D with a hierarchy H by

connecting D

to the attribute T

. Potential weak

attributes of T

parameter are the textual attributes

of T. For example, the addition of the table Disease

(Disease-code, Disease-design) which is referenced

by Patient (feeds the dimension D_Patient),

completes the dimension D_Patient with a new

hierarchy H_Disease (Pat-code, Disease-code). As

Disease-design is a textual attribute, it is considered

as a weak attribute of Disease-code.

We propose to express the principle of schema

versioning approach in the case of a new table T is

added to the DW model. We suppose that the basic

DW version is the model presented in figure 3; it’s

the RV1 of the DW model. We propose to add new

DW versions following the applying of R

1, R

and R

3 rules. In general, the designer must choose

to create a new alternative version (AV) either fol-

lowing another AV or following a Real version (RV);

however, in the case of R

1, relying on the first

solution is a must. Indeed, the R

1 output is to cre-

ate a new fact table F_Hospitalisation following the

addition of the table Hospitalisation to the DW

schema; this requires the creation of a new AV1.1.

Then, we propose to apply the rule R

2. The R

output is to create a new dimension D_Room fol-

lowing the addition of the table Room to the DW

schema, and this new dimension can’t be approved

only if the table Room is related to a T

in DW mod-

el. That’s why, in order to reveal the evolution oc-

curred following the application of R

2, we are

obliged to create another AV1.2 sequentially follow-

ing the AV1.1 which shows the evolution happened

after applying R

1. The rule R

3 can be applied

with different manners. Indeed, in this case, the

designer isn’t obliged to create a new AV sequential-

ly following the previous one; the new AV can fol-

low the RV since applying R

3 gives the birth of a

new hierarchy whose related dimension have already

existed in the RV1. In order to express evolutions

KEOD2014-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment

432

occurred due to applying the three rules in the same

schema, we choose to give the birth a new AV1.3,

that contains the R

3 applying results, sequentially

following the previous one. As consequence, figure

5 presents a part of AV1.3 containing AV1.2 enriched

with a new table Disease. Then, a new DM version

related to the AV1.3 is created (Figure 6). It is a star

schema which consists of the fact table

F_Hospitalisation referencing a set of dimensions

including the new dimension D_Room and the new

hierarchy H_Disease.

Patient

Pat‐code

Pat‐first‐name

Pat‐last‐name

Pat‐gender

Pat‐adress

Pat‐tel

Pat‐marital‐status

Pat‐dat‐birth

Disease‐code#

Consultation

Cons‐code

Pat‐code#

Doctor‐code #

Cons‐date

Cons‐costs

Hir

Ser

Hospitalisation

Hospital‐code

Pat‐code#

Room‐code#

Arrival‐date

Release‐date

Room

Room‐code

Room‐category

One‐night‐price

Disease

Disease ‐code

Disease‐design

T feeds a new fact

F_Hospitalisation (R

T feeds a new dimension

D_Room(R

T feeds a new hierarchy

H_Disease (R

Figure 5: AV1.3: Addition of Hospitalisation, Room and

Disease tables to the DW.

F_Hospitalisation

Hospital‐code

Pat‐code#

Room‐code#

Arrival‐date

Release‐date

D_Patient

Pat_cod

Pat_dat_birth

Pat_first_n am e

D_Date

Lib_Mon th

D_Room

Pat_last_name

Lib_day

Pat_gender

Pat_marital_status

Pat_address

Pat_tel

New dimension D_Room

New fact F_Hospitalisation

New hierarchy

H_Disease(R

One-night-price

Figure 6: New Star schema version: “Analyzing patient

hospitalization” built from AV1.3.

5.1.2 Column Addition

The addition of a column C to an existing DW can

enrich it with a new measure M, an existing Dimen-

sion with a new Attribute DA or a fact with a new

dimension D.

We define a rule for each case.

 Rule R

1. C addition can create a new measure

If the column C is additive and is added to a ta-

ble T

, then C presents a measure for F in the DM.

For example, it is proposed to add the column Total-

nights-costs to the table Hospitalisation that feeds

the fact F_Hospitalisation. As the column Total-

nights-costs satisfies the rule R

1, it feeds a new

measure called Total-nights-costs for the fact

F_Hospitalisation.

 Rule R

2. C addition can create a new dimension

attribute DA

If C is added to a table T

and if C is non-

additive, then C is considered as an attribute for the

dimension D in the DM. The choice of the attribute

role (weak or strong) is decided by the designer. For

example, we suppose to add the non additive column

Room-loc to the table Room that feeds the dimen-

sion D_Room. The column Room-loc satisfies the

rule R

2, so it feeds an attribute (strong) for the

dimension D_Room.

 Rule R

3. C addition can create a new dimension

If we add C of date type to T

, and if T

doesn’t

contain columns of date type, then C can feed a

temporal dimension in the related DM of F. For

example, we propose to add the column Date to the

table Consultation (Figure 3); however, it contains a

column of date type. Therefore, the addition of the

Date column does not have any effect neither on

DW nor on DM as it doesn’t satisfies the rule R

conditions. Moreover, we propose to add the table

Laboratory_Exam (Exam-code, Pat-code#, Exam-

type, Exam-costs) to the DW. The table Exam satis-

fies the conditions of R

1. This leads to the creation

of a new fact so called F_Laboratory_Exam and,

consequently, the emergence of a new DM called

“Analyzing patient laboratory exam” related to the

DW model; D_Patient is a dimension connected to

F_Laboratory_Exam. This latter doesn’t contain

columns of date type. According to the rule R

3, if

we add a column Exam-date to the table Laborato-

ry_Exam, then the column Exam-date enriches the

DM containing the fact F_Laboratory_Exam by

D_Date dimension.

5.2 Deletion Propagation Rules

In this section, we define the deletion propagation

rules for two cases: deletion of an existing table and

the deletion of an existing column from the DW.

5.2.1 Ta ble Delet ion

The deletion of an existing table T from DW can

lead to the elimination of an existing fact table F, an

existing dimension table D or even a hierarchy H.

We define a rule for each case.

HowtoGuaranteeAnalysisResultsCoherenceafterDataWarehouseSchemaChangesPropagationtowardsDataMarts?

433

 Rule R

1. T deletion can eliminate an existing

fact F

If the table T

is eliminated from DW, then the

fact table F will be automatically eliminated as well

as the DM containing F. This deletion doesn’t lead

to the deletion of all dimensions; indeed shared

dimensions will be maintained. In our example, if

we delete the Laboratory_Exam table, then

F_Laboratory_Exam will be eliminated as well as

the DM

“Analyzing patient laboratory exam”.

D_Patient dimension which is a common dimension

for F_Laboratory_Exam and F_Hospitalisation,

will be maintained; any other specific dimensions,

such as D_Date, will be eliminated.

 Rule R

2. T deletion can eliminate an existing

dimension D

If the table T

is eliminated from DW, then the

dimension table D will be eliminated from all DMs

containing it. In our example, as the Patient table

feeds only one dimension, its elimination from DW

schema leads to the deletion of D_Patient from all

DMs.

 Rule R

3. T deletion can eliminate an existing

hierarchy H

If the table T

is eliminated from DW, then the

hierarchy H will be eliminated from the dimension

D which contains it. In our example, if we eliminate

the table Disease which enriches the dimension

D_Patient with the hierarchy H_Disease, then this

latter will be deleted from every DM contains the

dimension D_Patient.

5.2.2 Column Deletion

The deletion of an existing column C from a DW

table can lead to the elimination of an existing

pa-

rameter P, an existing hierarchy H,

an existing

measure M or even an existing dimension D.

We define a rule for each case.

 Rule R

1. C deletion can eliminate an existing

parameter P

If we delete a non additive column C which is

not a primary key from a table T

of DW schema,

then its corresponding element in DM will be delet-

ed from the dimension D which is fed by T

. Thus,

in our current example, the column Pat-marital-

status of the table Patient is not a primary key and it

is not an additive attribute. So, its deletion leads to

the elimination of the corresponding parameter Pat-

marital-status from the dimension D_Patient.

If the column C of a table T

supports a referen-

tial constraint toward another table T'

of the DW,

then C is a hierarchical level for D. Two cases can

arise depending on whether or not T'

refers to an-

other table. In the following rule, we

restrict our

study to the case that T'

doesn’t refer to another

table (end of hierarchy). In this context, we define

the R

2 rule.

 Rule R

2. C deletion can eliminate an existing

hierarchy H

The removal of

C which satisfies conditions cit-

ed above, lead to the deletion of the hierarchy

TD’

from DM(s) that contains it. As an

illustration, if we eliminate the Service-code column

from the table Doctor (Figure 3) which feeds the

D_Doctor dimension (Figure 4), then the corre-

sponding parameter of Service-code column which

presents a hierarchy level in D_Doctor as well as its

weak parameter Service-design, will be deleted.

 Rule R

3. C deletion can eliminate an existing

measure M

If a numeric additive column C is removed from

a table T

of the DW, then the measure M which

corresponds to C will be eliminated from F. In our

example, if the column Total-night-costs is deleted

from the table Hospitalisation, then the measure

called Total-night-costs will be deleted from

F_Hospitalisation.

 Rule R

4. C deletion can eliminate an existing

dimension D

If the column C supports a referential constraint

toward a table T

and if it is deleted from T

, then

the D dimension will be deleted from the DM which

contains F. In our example, the column Patient-code

presents a foreign key in the table Hospitalisation

which feeds the fact F_Hospitalisation in DM. So,

its removal leads to the elimination of the dimension

D_Patient from the DM called “Analyzing patient

hospitalization”.

6 COMPARATIVE STUDY

Our approach is proposed to solve some gaps of

previous works. The idea is to exploit versioning

approach functionalities to keep traces of occurred

changes propagated from DW towards DMs; this

leads to coherent analysis results. Concerning evolu-

tion operations, we focus on applying propagation

rules for two cases: tables and column addition and

deletion to DW. On the contrary, in (Azaiez et al.,

KEOD2014-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment

434

2013), we investigated the problem of DW evolution

only in the case of tables and columns addition.

Besides, we chose the schema evolution approach as

the base of the work.

The comparative study express that the current

proposed approach offers coherent analysis results

unlike results given in (Azaiez et al., 2013). In fact,

in this latter, queries are unable to return data which

are the results of an evolution phenomenon contin-

ued in several time intervals, since the schema evo-

lution approach is based on the hypothesis that the

DW schema has only one version; it’s the current

one.

The following table compares of our proposed

approach versus the previous one:

Table 4: Comparative study.

(Azaiez et

al., 2013)

Our

approach

Evolution

approach

Schema Evolution

Schema Versioning

Evolution

operations

Addition

tables/columns

Deletion

tables/columns



7 CONCLUSION AND

PERSPECTIVE

In this paper, we presented an overview on the DW

evolution problems. Indeed, we exposed some solu-

tions proposed by different authors in recent years.

To overcome the problem related of the DW schema

changes and their impacts on DMs, we proposed an

approach which deals with the propagation problem

of DW changes on its DMs; this approach is based

on "if-then" type rules. However, this is not enough

to ensure the analysis results coherence and con-

sistency. Therefore, we relied on the schema ver-

sioning approach to keep trace of evolutions affect-

ing DW model and their impacts on related DMs.

This paper is limited at studying the evolution

modeling of classic DWs that includes data which

concerned only fixed objects, and neglected moving

objects activities that generate a new data type so

called “trajectory data”; those latter are stored in a

mobile data central repository that called Trajectory

Data Warehouse (TDW). As perspective, we pro-

pose to deal with the TDW evolution problems tak-

ing into account its new data type and structure

changes.

REFERENCES

Akaichi, J., Oueslati, W., 2008. MAVIE: A Mobile Agents

View synchronization system. In first international

conference on the applications of digital information

and web technology (pp. 145-150).Ostravem.

Azaiez, N., Taktak, S., Feki, J., 2013. DWEV : Un proto-

type pour l'évolution partielle du schéma multidimen-

sionnel. In 7éme édition de la Conférence Maghrébine

sur les Avancées des systèmes décisionnels (ASD),

Marrakech, Maroc.

Bellahsene, Z., 2002. Schema Evolution in Data Ware-

houses. Journal of Knowledge and Information Sys-

tems, 4 (3) (pp. 283-304).

Body, M., Miquel M., Bédard, Y., Tchounikine, A., 2003.

Handling Evolutions in Multidimensional Structures.

In IEEE 19

International Conference on Data Engi-

neering (ICDE) (pp. 581-591). Bangalore, India.

Body, M., Miquel, M., Bédard, Y., Tchounikine, A., 2002.

A multidimensional and multiversion structure for

OLAP applications. In Proceedings of the 5

ACM In-

ternational Workshop on Data Warehousing and

OLAP (pp. 1-6). McLean, Virginia, USA.

Eder, J., Koncilia, C., 2001. Changes of Dimension Data

in Temporal Data Warehouses. In Proceedings of the

DaWaK’01 Conference, (pp. 284-293). Munich, Ger-

many.

Gupta, A., Mumick, I., Ross, K., 1995. Adapting Material-

ized Views after redefinitions. SIGMOD 95, (pp. 211-

222).

Hurtado, C. A., Mendelzon, A. O., Vaisman, A. A., 1999.

Maintaining Data Cubes under Dimension Updates. In

XVth International Conference on Data Engineering

(ICDE 1999), IEEE Computer Society, (pp.346–

355).Sydney.

Papastefanatos, G., Vassiliadis, PP., Simitsis, A., Sellis,

T., Vassiliou, Y., 2009. Rulebased Management of

Schema Changes at ETL Sources. In The International

Workshop on Managing Evolution of Data Ware-

houses (MEDWa), Riga, Latvia.

Quix, C., 2004. Repository Support for Data Warehouse

Evolution. In Proceedings of the International Work-

shop DMDW, Heidelberg, Germany.

Taktak, S., Feki, J., 2012. Toward Propagating the Evolu-

tion of Data Warehouse on Data Marts. In: MEDI

2012. Lecture Notes in Computer Science, Vol. 7602,

Springer Verlag, Berlin Heidelberg (pp. 178–185).

Poitiers, France.

Zouari, I., Ghozzi, F., Bouaziz, R., 2008. Impact de

l’évolution de nomenclature sur le versionnement des

entrepôts de données. Ingénierie des Systèmes

d'Information, volume 13, (pp. 85-114).

Oueslati, W., Akaichi, J., 2011. A Multiversion Trajectory

Data Warehouse to Handle Structure Changes, Inter-

national Journal of Database Theory and Application,

Vol. 4, No. 2. (pp. 35-50).

HowtoGuaranteeAnalysisResultsCoherenceafterDataWarehouseSchemaChangesPropagationtowardsDataMarts?

435