spaces can be seen as “intelligent object containers”.
They make use of a subset of conceptual dimensions,
a specification of relevant abstraction levels and a
set of atomic data combination functions in order to
(re)construct appropriate generalized instances. Different
statistical indexes (e.g. frequencies), associated with
the conceptual dimensions, can be used to characterize
patterns in the conceptual spaces. The most
suitable data analysis technique for carrying out this
proposal is data warehousing.
The present work is not the first attempt to formal-
ize a data warehouse for the Semantic Web. Within
the Data Warehouse Quality (DWQ) project (Hacid
and Sattler, 1998) a formalization for the multidimen-
sional modeling based on an extension of the con-
structors of description logic is proposed. In this way,
new object classes could be described by specify-
ing aggregability operations, and the traditional rea-
soning over ontological instances could be applied.
However, the demonstration of the undecidability of
minimal languages that operate with aggregate operators
(Baader and Sattler, 2003) makes the proposal of
the DWQ project unfeasible.
On the other hand, the ideas of the traditional data
warehouse (and OLAP techniques) have been extended
to object-oriented modeling (Buzydlowski et al.,
1998; Trujillo et al., 2001; Nguyen et al., 2000; Binh
and Tjoa, 2001; Abelló, 2002). Considering that description
logic was designed as an extension to frames
and semantic networks, the basis of the object-oriented
data warehouse could be applied in order to define
a data warehouse for the Semantic Web. However,
the flexibility of object-oriented formalization causes
a sparser structure in object-oriented databases
than in traditional ones. Moreover, the restrictions of
OLAP implementations drastically reduce the useful
set of objects to be used in the analysis.
Unlike these previous works, this paper proposes
a multidimensional model for the analysis of ontological
instances that merges both approaches. The idea
is the creation of meta-ontologies in order to enrich
the knowledge of ontologies with data analysis information.
This data analysis information focuses on the
description of interesting object classes and on the aggregation
process. The reasoning of description logic
is used in a preliminary phase to 1) recover the satisfiable²
object classes that can be used in analysis processes,
2) discover the hierarchical and aggregate orders
between the classes, and 3) assign each instance
to the set of object classes to which it belongs.
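As a rough illustration of the three preliminary steps listed above, the following Python sketch runs them over a toy class hierarchy. It is only a sketch under strong assumptions: the class names, the `DISJOINT` set and the instance assertions are hypothetical, satisfiability is approximated by checking for disjoint ancestors, and only the hierarchical order (not the aggregate order) is computed. A real description logic reasoner performs far more general inference.

```python
from itertools import combinations

# Hypothetical toy ontology: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "Place": set(),
    "Town": {"Place"},
    "Capital": {"Town"},
    "River": {"Place"},
    "RiverTown": {"River", "Town"},  # inherits from two disjoint classes
}
# Hypothetical disjointness axioms and instance assertions.
DISJOINT = {frozenset({"River", "Town"})}
ASSERTED = {"madrid": "Capital", "ebro": "River"}


def ancestors(cls):
    """Return every (direct or indirect) superclass of cls."""
    seen = set()
    stack = list(SUBCLASS_OF[cls])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(SUBCLASS_OF[parent])
    return seen


def is_satisfiable(cls):
    """Toy check: a class is rejected if it falls under two disjoint classes."""
    lineage = ancestors(cls) | {cls}
    return not any(frozenset(pair) in DISJOINT
                   for pair in combinations(lineage, 2))


# 1) recover the satisfiable object classes
satisfiable = {c for c in SUBCLASS_OF if is_satisfiable(c)}
# 2) discover the hierarchical order between those classes
hierarchy = {c: ancestors(c) for c in satisfiable}
# 3) assign each instance to every class it belongs to
membership = {inst: {cls} | ancestors(cls)
              for inst, cls in ASSERTED.items() if cls in satisfiable}
```

Here `RiverTown` is discarded in step 1, and `madrid` ends up in `Capital`, `Town` and `Place`, so it can later be aggregated at any of those abstraction levels.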
This paper describes our proposal in detail. First,
the data analysis information is introduced. Then, the
proposed model is described, starting from the definition
of dimensions and their operators (section 3) and
following with the specification of the multidimensional
conceptual space (section 4). The two following
sections focus on the extraction of interesting
conceptual spaces and their use, respectively. The
last section gives some conclusions and future work.
² A concept (or object class) is satisfiable if it is consistent
and there exists an interpretation in which at
least one instance of this concept appears.
2 ANALYSIS METADATA
Information descriptions useful for the analysis are
those available in the ontologies in the form of instances.
However, they are not enough to analyze data and dis-
cover patterns. New interesting concepts and partic-
ular issues related to the generalization process are
essential in order to generate descriptions that repre-
sent relevant and realistic visions of the application
domains of the analyzed ontologies. We call all this
information analysis metadata, which comprises the
following elements:
• description of new concepts, which is used to introduce
additional levels of abstraction in the concept
hierarchies expressed in an ontology, and/or
to link concepts from different ontologies. New
concepts may be obtained by extending old ones with
paths to previously unrelated concepts. They can
also semantically represent hierarchical clusters
obtained using clustering algorithms.
• description of the combination functions (see definition
below), which is used to specify ways of generalizing
sets of data of the same type during the
instance generalization process. The data analyst
is responsible for deciding the combination func-
tions that are semantically suitable for a given data
set. For example, the combination function which
computes the average of a set of values is seman-
tically suitable for a temporal sequence of temper-
atures of a town, but not for a set of temperatures
of different towns.
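The role of the analyst in declaring which combination functions are semantically suitable can be sketched as follows. This is a minimal illustration, not the paper's formal definition: the `CombinationFunction` class, the `generalize` helper and the data-set kind labels are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, FrozenSet, Sequence


@dataclass(frozen=True)
class CombinationFunction:
    """A combination function plus the data-set kinds it is suitable for."""
    name: str
    combine: Callable[[Sequence[float]], float]
    suitable_for: FrozenSet[str]  # hypothetical labels chosen by the analyst


def generalize(values, kind, fn):
    """Combine values with fn, but only if fn is declared suitable for kind."""
    if kind not in fn.suitable_for:
        raise ValueError(f"{fn.name} is not semantically suitable for {kind}")
    return fn.combine(values)


# The analyst declares the average suitable for a temporal sequence of
# temperatures of one town, but not for temperatures of different towns.
average = CombinationFunction(
    name="average",
    combine=mean,
    suitable_for=frozenset({"temporal_sequence_one_town"}),
)

town_series = [10.0, 12.0, 14.0]
town_average = generalize(town_series, "temporal_sequence_one_town", average)
```

Calling `generalize` with the kind `"temperatures_of_different_towns"` raises a `ValueError`, which mirrors the idea that suitability is a semantic decision recorded once and then enforced during instance generalization.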
Although it is perfectly plausible to define such
descriptions for every new multidimensional concep-
tual space, a better solution is to keep this semantic
information always available and to apply it according
to the requirements of each case. This goal can be
achieved by building a meta-ontology containing the sort
of information described above, again using Description
Logic. In this way, analysts can proceed more efficiently,
as they can reuse the analysis metadata. Even
more importantly, the coherence of different
studies is guaranteed, providing an ontology with an
intrinsic robustness toward analysis processes. Thus,
ICSOFT 2007 - International Conference on Software and Data Technologies