
queries over them. We then consider eight evolu-
tion operations, covering insertion and deletion of di-
mensions, levels, dimension attributes, and measure
attributes. For each operation, the formal seman-
tics of the changes for star and snowflake schemas
are given, and instance adaption and impact on ex-
isting browse and aggregate queries are described.
As previous work has already dealt with evolution in
the data instances, we do not cover this aspect. Fi-
nally, we compare the evolution properties of star and
snowflake schemas, looking both at which changes
cause existing browse or aggregate queries to fail, as
well as the ease of implementation and instance adap-
tion. In all cases, the star schema is either superior
to the snowflake schema or has the same problems.
We thus conclude that the star schema is consider-
ably more robust towards schema changes than the
snowflake schema.
A lot of previous work on general schema evolu-
tion has been performed (Roddick, 1992). However,
this work does not take the special characteristics of
multidimensional schemas and queries, e.g., dimen-
sions with hierarchies, into account. Several mul-
tidimensional models based on, e.g., UML (Lujan-
Mora et al., 2002) and E/R (Sapia et al., 1998) have
been proposed for conceptual and logical DW design
Some models (Pedersen et al., 2001a; Body et al.,
2002; Eder et al., 2001; Letz et al., 2002; Morzy
et al., 2003) also consider temporal aspects of the
data instances such as multiple version of dimensions
and facts, but not schema evolution in the DW. How-
ever, none of these models consider schema evolu-
tion. The paper (Levene et al., 2003) investigates
other properties of star and snowflake schemas such
as redundancy, but does not consider schema evo-
lution. Several papers (Bouzheghoub et al., 2000;
Vassiliadis et al, 2000) consider DW schema evolu-
tion at the coarse-grained level of materialized views,
either focusing on DW design (Bouzheghoub et al.,
2000) or DW quality (Vassiliadis et al, 2000), but do
not consider the special characteristics of multidimen-
sional schemas and queries. Another paper (Mendel-
zohn et al., 2000) considers querying across schema
evolution, but in a “pure” multidimensional model,
and thus do not investigate evolution in star and
snowflake schemas. The same is true for the FI-
ESTA framework (Blaschka et al., 1999; Blaschka,
2000). Another paper (Hurtado et al., 1999) considers
both schema evolution in a “pure” multidimensional
model, and the mapping to relational schemas, but
considers mainly instance rather than schema updates
(which are only treated briefly), and considers only
dimension updates. In comparison, the present paper
provides a detailed treatment of schema evolution, in-
cluding a formal semantics of the changes, provides
details on the impact on existing DW queries, and
considers both dimensions and fact/measures. Kim-
ball (Kimball, 1996; Kimball et al., 1998) infor-
mally mentions that schema changes should be added
“gracefully” to allow existing queries to work, but
neither investigates this systematically and formally,
nor compares the evolution properties of star and
snowflake schemas.
We believe this paper to be the first to systemati-
cally investigate the evolution properties of star and
snowflake schemas, including giving a formal seman-
tics for evolution operations on these schemas, and
investigating the impact on existing DW queries.
The rest of the paper is structured as follows. Sec-
tion 2 describes the multidimensional schemas and
queries considered in the paper. Section 3 examines
effects of the eight evolution operations, while Sec-
tion 4 summarizes the evolution properties for star
and snowflake schemas. Finally, Section 5 concludes
and points to future work.
2 MULTIDIMENSIONAL
SCHEMAS
We now formally define star and snowflake schemas.
In the following, we refer to the intuitive “MD-
schema” that corresponds to our conceptual under-
standing of the cube that the star or snowflake schema
implements, i.e., a schema with notions like cube,
measures, dimensions, levels, and attributes.
Definition 1 (Star schema) A star schema is a 6-
tuple: <D, fact, A, attrib, PKD, PKF> where D is
a set of dimension tables, fact is a fact table, A is a
set of attributes each with a name and a type, attrib:
A → D ∪ {fact} is a function mapping an attribute
to either a fact table or a dimension table, PKD ⊂ A
is the set of dimension table primary keys where the
primary key of the dimension table d
i
∈ D is denoted
PKd
i
: PKd
i
∈ PKD, and PKF ⊂ A is the (compos-
ite) primary key of the fact table. Each K ∈ PKF is a
foreign key to exactly one dimension table d
i
∈ D. ¤
The star schema uses denormalized dimensions
which means that the dimension levels in the MD
schema do not appear in the star schema. Instead a
dimension table d
i
in the star schema definition con-
sists of one of the base dimension levels defined in the
MD schema and all the dimension levels above it in
the MD schema. The attributes a
j
, j=1..n of d
i
cor-
responds to the attributes of the base dimension level
and all its preceding dimension levels just described.
Definition 2 (Snowflake schema) A snowflake
schema is an 8-tuple: <L, B, fact, A, attrib, PKL,
FKL, PKF>, where L is a set of dimension level
tables, B ⊂ L is a set of base dimension level tables,
fact is a fact table, A is a set of attributes, attrib: A
→ L ∪ {fact} is a function mapping an attribute to
ICEIS 2004 - DATABASES AND INFORMATION SYSTEMS INTEGRATION
426