independent variables x
1
, x
2
, …, x
n
. In this paper we
apply comparative statics in the OLAP context
where we have a system of linear or nonlinear
equations with dependent variables on an aggregated
level of the cube, called non-base variables and
independent variables on the base level, called base
variables.
This research is part of our continued work on
extensions for the OLAP framework for business
diagnosis. Current OLAP databases have limited
capabilities for sensitivity and diagnostic analysis.
The goal of our research is to largely automate these
manual diagnostic discovery processes (Caron and
Daniels, 2007). In (Sarawagi et al., 1998) a similar
research approach is taken.
The remainder of this paper is organized as
follows. Section 2 introduces our notation for multi-
dimensional equations, followed by formal
description of consistency and solvability of systems
of OLAP equations in Section 3. In addition, we
show that systems of OLAP equations are consistent
and have a unique solution. In Section 4 the OLAP
framework is extended with sensitivity analysis
based on the consistency property. Subsequently, we
briefly describe a software implementation of our
model for OLAP sensitivity analysis. Finally,
conclusions are discussed in Section 6.
2 NOTATION AND EQUATIONS
The multi-dimensional OLAP database is a
framework used to provide business decision-makers
with the ability to perform dynamic data analysis.
With OLAP tools, users gain access to the data
warehouse. Decision-makers tend to have questions
that are often multi-dimensional in nature and
demand fast access to large amounts of aggregated
data. A typical business question might be: ``What
was the profit of product A this year, in region X,
per sales office, compared with the previous version
of the product, compared to the targeted profit?'' For
decision-making purposes it might be necessary that
the answer to this question is explored further, for
example on the quarter, month and week level. This
functionality is provided by OLAP.
Two important data schemata for the design of a
multi-dimensional database are the star schema and
the snowflake schema. OLAP typically uses a star
schema, where data is stored in fact tables and
dimension tables. In a star schema, one central fact
table is linked via foreign keys with several
dimension tables. Each dimension has its own single
table with a smaller set of data. The other important
multi-dimensional design approach, the snowflake
schema, is a non-redundant database design that
characterises itself by the normalized data approach
where data is further split into additional dimension
tables (Han and Kamber, 2005).
In both schemata data is organized using the
dimensional modelling approach, which classifies
data into measures (i.e., facts) and dimensions.
Measures are numeric and dimensions are
categorical data types. Measures like are the basic
units of interest for analysis. Dimensions correspond
to different perspectives for viewing measures.
Dimensions are usually organized as dimension
hierarchies, which offer the possibility to view
measures at different dimension levels (e.g. month
p quarter p year is a hierarchy for the Time
dimension). Aggregating measures up to a certain
dimension level, with functions like sum, count, and
average, creates a multidimensional view of the data,
also known as the data cube. A number of data cube
operations exist to explore the multidimensional data
cube.
Here we use a generic notation for multi-
dimensional data schemata that is particularly
suitable for combining the concepts of measures,
dimensions, and dimension hierarchies as described
in (Caron and Daniels, 2007). Therefore, we define a
measure y as a function on multiple domains:
12
12
12
:
nn
ii i i
ii
n
yDD D××× →
K
K R
(1)
Each domain
i
D has a number of hierarchies ordered
by
max
01
i
kk k
DD DppKp
, where
0
k
D is the lowest
level and
max
i
k
D is the highest level in
max
i
k
D . A
dimension’s top level has a single level instance
}
max
All
i
k
D = . For example, for the time dimension
we could have the following hierarchy
01
TTp
2
Tp , where
}
2
T All-T= ,
{}
1
T 2000,2001= , and
}
2
Q1,Q2,Q3,Q4T = . A cell in the cube is denoted
by
12
(, , , )
n
dd dK , where the 's
k
d are elements of
the domain hierarchy at some level, so for example
(2000, Amsterdam, Beer) might be a cell in a sales
cube. Each cell contains data, which are the values
of the measures y like, for example,
211
sales (2000,
Amsterdam, Beer). The measure’s upper indices
indicate the level on the associated dimension
hierarchies. If no confusion can arise we will leave
out the upper indices indicating level hierarchies and
write sales (2000, Amsterdam, Beer). Furthermore,
the combination of a cell and a measure is called a
data point. The measure values at the lowest level
cells are entries of the base cube. If a measure value
ICEIS 2009 - International Conference on Enterprise Information Systems
326