The quality of a schema has been addressed by
the definition of a set of constraints at the model
level. That is, several researchers (cf. (Hurtado,
2002), (Lechtenbörger, 2003), (Ghozzi, 2003)) have
defined a set of rules that a schema must respect in
order to produce either a syntactically correct
schema, or a sound schema with respect to data
instances. The rules are defined on the structures and
structural elements of a schema. Nevertheless, these
works have not proposed a mechanism to validate
and verify these rules.
In this paper, we present the first steps towards
the development of a formal framework for DM/DW
modeling and verification. On the one hand, this
framework relies on the precise definition of the
constraints ensuring the syntactic and semantic
correctness of a DM/DW schema. On the other
hand, it exploits the formal definition in order to
provide a means to verify both types of
correctness. More specifically, in this paper, we first
present the formal definition of the Hierarchy
concept at the meta-model level in the Z language
(Spivey, 1992); secondly, we illustrate how the
constraints can be instantiated for a particular model
and verified using the Z/EVES theorem prover
(Saaltink, 1999).
The remainder of this paper is organized as
follows. In Section 2, we first overview current
proposals of constraints for DM/DW schemas;
secondly, we present our approach to constraint
definition and verification. In Section 3, we present
the set of constraints pertinent to the hierarchy
concept and their formalization in Z. In Section 4,
we show how to instantiate the constraints for a
particular model and how to verify the correctness of
the constrained model through Z/EVES. Finally,
Section 5 summarizes our contributions and outlines
ongoing work.
2 RELATED WORKS
During our survey of previous work in this
domain, we studied the hierarchy concept and
the constraints related to it. Defining the
classification hierarchies of certain dimension
attributes is crucial because these classification
hierarchies provide the basis for the subsequent data
analysis. Since a dimension attribute can also
roll up to more than one other attribute, multiple
classification hierarchies and alternative path
hierarchies are also relevant (Trujillo, 2001).
According to (Lehner, 1998), in the context of
statistical databases and on-line analytical
processing as well, classification hierarchies provide
a basis for defining aggregate data. (Part, 2006)
confirms that hierarchies are crucial to
multidimensional modeling since they are used in
conjunction with aggregation functions to aggregate
(“rollup”) or detail (“drill-down”) measures. These
statements underline the importance of the
hierarchy concept. In (Malinowski, 2004), the
authors present a conceptual classification of
hierarchies and propose graphical notations for them
based on the ER model. With respect to dimensions,
every hierarchy classification level is specified by a
class. An association of classes specifies the
relationships between two levels of a hierarchy
classification. The only prerequisite is that these
classes must define a Directed Acyclic Graph
(DAG) rooted in the dimension class (constraint
{dag} placed next to every dimension class). The
DAG structure can represent both alternative path
and multiple classification hierarchies (Lujàn, 2002).
In the GMD model (Franconi, 2004), the authors
describe the hierarchy by an order function between
the different dimension attributes. (Abello, 2006)
presents an object-oriented multidimensional model
and defines a hierarchy as an aggregation relation
between the different dimension attributes.
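The {dag} constraint discussed above can be illustrated with a small check. The following is a minimal sketch in Python, not the notation or tooling of any of the cited works: the function name is_rooted_dag, the edge representation (each level mapped to the levels it rolls up to) and the Time example are all assumptions made for illustration.

```python
from collections import deque


def is_rooted_dag(edges, root):
    """Check that `edges` (level -> levels it rolls up to) is acyclic
    and that every level is reachable from `root` (hypothetical helper)."""
    nodes = {root} | set(edges) | {p for ps in edges.values() for p in ps}

    # Kahn's algorithm: a graph is acyclic iff every node can be removed
    # by repeatedly deleting nodes that have no incoming edges.
    indegree = {n: 0 for n in nodes}
    for parents in edges.values():
        for p in parents:
            indegree[p] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for p in edges.get(n, ()):
            indegree[p] -= 1
            if indegree[p] == 0:
                queue.append(p)
    acyclic = removed == len(nodes)

    # Depth-first traversal from the dimension class to test rootedness.
    seen, stack = {root}, [root]
    while stack:
        n = stack.pop()
        for p in edges.get(n, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return acyclic and seen == nodes


# Assumed example: Day rolls up to both Month and Week (alternative paths).
time_edges = {
    "Time": ["Day"],
    "Day": ["Month", "Week"],
    "Month": ["Year"],
    "Week": ["Year"],
}
# Same hierarchy with an invalid roll-up from Year back to Day.
cyclic_edges = dict(time_edges, Year=["Day"])

print(is_rooted_dag(time_edges, "Time"))    # True
print(is_rooted_dag(cyclic_edges, "Time"))  # False
```

The alternative paths Day-Month-Year and Day-Week-Year are accepted because a DAG, unlike a tree, allows a level to have several parents.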
Only a few works formally define hierarchies,
and they mainly discuss the
summarizability conditions and offer some solutions
for correcting measure aggregations in the presence of
so-called heterogeneous hierarchies (Hurtado, 2001).
In (Hurtado, 2002), the authors propose a set of
constraints to solve the aggregation problem. These
constraints are related to the hierarchical structuring
of the dimension attributes and the dimension
instances. We find an explicit and complete
definition of the hierarchy concept in the work of
(Ghozzi, 2003), which forms the basis of
our formal specification.
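To make the aggregation problem concrete, the sketch below shows one simplified way in which instance-level constraints can guard a roll-up. It is an illustrative assumption, not the constraint language of (Hurtado, 2002) nor the definitions of (Ghozzi, 2003): it only checks that every member of the lower level has exactly one parent before summing a measure. The names rollup_violations and roll_up and the City/Country data are hypothetical.

```python
from collections import defaultdict


def rollup_violations(members, pairs):
    """Return (orphans, multi): members with no parent and members
    with more than one parent in the (child, parent) pairs."""
    parents = defaultdict(set)
    for child, parent in pairs:
        parents[child].add(parent)
    orphans = sorted(m for m in members if len(parents[m]) == 0)
    multi = sorted(m for m in members if len(parents[m]) > 1)
    return orphans, multi


def roll_up(measures, pairs):
    """Sum a measure from the child level to the parent level,
    refusing to aggregate when the roll-up mapping is unsafe."""
    orphans, multi = rollup_violations(measures, pairs)
    if orphans or multi:
        raise ValueError(f"unsafe roll-up: orphans={orphans}, multi={multi}")
    parent_of = dict(pairs)
    totals = defaultdict(int)
    for child, value in measures.items():
        totals[parent_of[child]] += value
    return dict(totals)


# Assumed example data: sales per City rolled up to Country.
sales = {"Tunis": 10, "Sfax": 5, "Paris": 7}
city_country = [("Tunis", "Tunisia"), ("Sfax", "Tunisia"), ("Paris", "France")]

print(roll_up(sales, city_country))  # {'Tunisia': 15, 'France': 7}

# A city without a country makes the hierarchy heterogeneous at this level.
print(rollup_violations(["Tunis", "Monaco"], city_country))  # (['Monaco'], [])
```

Without such a check, an orphan member would silently drop out of the totals and a member with two parents would be counted twice, which is the kind of incorrect aggregation the summarizability conditions are meant to rule out.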
The constraints expressed in these works differ
from one model to another. The difference lies, on
the one hand, in the level at which a constraint is
expressed (meta-model or model) and, on the other
hand, in the level at which it is checked or
enforced. Moreover, there is no consensus on the
full set of constraints to take into account. This
disagreement about the level at which constraints
are expressed poses a real problem for the
coherence of the data to be integrated. In other
words, it can lead to incoherent analysis results.
The goal of our work is to arrive at a consistent
formal specification of a multidimensional
constellation meta-model. Thus, we offer designers
a means to check their models.
ICEIS 2008 - International Conference on Enterprise Information Systems