interested in the questions: How is the profit on the
aggregated year level affected when the profit for
product P1 is changed in the first quarter in The
Netherlands? Or how is the profit in the year 2007
for a certain product affected when its unit price is
changed (c.p.) in the sales model? Such questions
might be ‘dangerous’, when the change is not caused
by a variable in the base cube, but by a variable on
some intermediate aggregation level in the cube. The
latter situation makes the OLAP database inconsis-
tent. Our novel OLAP operator corrects for such
inconsistencies such that the analysts can still carry
out sensitivity analysis in the OLAP database. Our
research shows that consistency and solvability of
OLAP databases are important criteria for sensitivity
analysis in OLAP databases.
1.1 OLAP Introduction
OLAP databases are a popular business intelligence
technique in the field of enterprise information
systems for business analysis and decision support.
OLAP not only integrates the management
information systems (MIS), decision support
systems (DSS), and executive information systems
(EIS) functionality of the earlier generations of
information systems, but goes further and introduces
spreadsheet-like multi-dimensional data views and
graphical presentation capabilities (Koutsoukis et
al., 1999). OLAP systems have a variety of
enterprise functions. Finance departments use OLAP
for applications such as budgeting, activity-based
costing, financial performance analysis, and
financial modelling. Sales analysis and forecasting
are two of the OLAP applications found in sales
departments.
The core component of an OLAP system is
the data warehouse, which is a decision-support
database that is periodically updated by extracting,
transforming, and loading data from several On-Line
Transaction Processing (OLTP) databases. The
highly normalized form of the relational model for
OLTP databases is inappropriate in an OLAP
environment for performance reasons. Therefore,
OLAP implementations typically employ a star
schema, which stores data de-normalized in fact
tables and dimension tables. The fact table contains
mappings to each dimension table, along with the
actual measured data. In a star scheme data is
organized using the dimensional modelling
approach, which classifies data into measures and
dimensions. Measures like, for example, sales,
profit, and costs figures, are the basic units of
interest for analysis. Dimensions correspond to
different perspectives for viewing measures.
Examples dimensions are a product or a time
dimension. Dimensions are usually organized as
dimension hierarchies, which offer the possibility to
view measures at different dimension levels (e.g.
month
≺ quarter ≺ year is a hierarchy for the Time
dimension). Aggregating measures up to a certain
dimension level, with functions like sum, count, and
average, creates a multidimensional view of the data,
also known as the data cube. A number of data cube
operations exist to explore the multidimensional data
cube, allowing interactive querying and analysis of
the data.
The remainder of this paper is organized as
follows. Section 2 introduces our notation for multi-
dimensional models, followed by a description of
models appropriate for OLAP problem identification
in Section 3. In Section 4 the explanation formalism
is extended for multi-dimensional data in order to
automatically generate explanations. In section 5 we
show that systems of OLAP equations are consistent
and have a unique solution. Subsequently, we apply
this result for sensitivity analysis in the OLAP
context. Finally, conclusions are discussed in
Section 6.
2 NOTATION AND EQUATIONS
Here we use a generic notation for multi-
dimensional data schemata that is particularly
suitable for combining the concepts of measures,
dimensions, and dimension hierarchies as described
in (Caron and Daniels, 2007). Therefore, we define a
measure y as a function on multiple domains:
12
12
12
:
nn
ii i i
ii
n
yDD D××× →
…
… R
(1)
Each domain
i
D has a number of hierarchies ordered
by
max
01
i
kk k
DD D≺≺…≺ , where
0
k
D
is the lowest
level and
max
i
k
D
is the highest level in
max
i
k
D
. A
dimension’s top level has a single level instance
max
All
i
k
D = . For example, for the time dimension
we could have the following hierarchy
01
TT≺
2
T≺ , where
2
T All-T= ,
1
T 2000,2001= , and
0
Q1,Q2,Q3,Q4T = . A cell in the cube is denoted
by
12
(, , , )
n
dd d… , where the 's
k
d are elements of
the domain hierarchy at some level, so for example
(2000, Amsterdam, Beer) might be a cell in a sales
cube. Each cell contains data, which are the values
of the measures
y like, for example,
211
sales (2000,
Amsterdam, Beer). The measure’s upper indices
EXTENSIONS TO THE OLAP FRAMEWORK FOR BUSINESS ANALYSIS
241