A large number of papers have investigated various
facets of mapping, such as mapping discovery,
mapping definition, or mapping usage (for a survey,
see Rahm and Bernstein, 2001).
In such a distributed setting, we believe that an
a priori agreement on knowledge and knowledge
exchange is very hard to achieve. Indeed, if we try
to integrate or interoperate large and disparate
information systems, the current standard approach
of creating large-scale shared knowledge will hardly
scale up to the size of the (semantic) Web. It is also
conceptually problematic because, in our opinion,
knowledge is never context-free (Yaacoubi
and BenAhmed, 2003; Stoœmer and Stecher, 2005)
and can thus never be perfectly shared.
In this work, our objective is to propose a complete
approach for the semantic integration of generalization
hierarchies. We adapt previous results on schema and
ontology integration (ontology fusion, ontology mapping,
ontology alignment; for a survey, see Wache et al., 2001)
to tackle the different kinds of heterogeneity one might
encounter during the interoperation of information
systems. Indeed, we think that the semantics of schema
models is not explicit but is hidden in their structures
and in the labels of their concepts. Given a set of
generalization hierarchies, our approach emphasizes the
added value of semantics by making the intended informal
meaning of their concepts emerge, both by mapping them to
the WordNet¹ ontology and by interpreting their structural
position.
The aim of this paper is to describe an algorithm
that analyses this implicit knowledge in order to
provide correct mappings between concepts. First, we
propose a logical formalization of class hierarchies.
This gives us a rigorous logical framework for
representing and automatically reasoning on
generalization hierarchies regardless of their formalism
(UML, ER diagram, etc.). The SEM-INTEROP algorithm then
performs two main steps: semantic interpretation and
semantic comparison.
Compared to other related works, our proposal falls
within the scope of approaches that aim at defining a
formalism or methodology to specify and use
inter-schema correspondences. We assume that an initial
set of inter-schema correspondences may be given by the
designer; however, we do not consider query
reformulation, which is outside the scope of this
paper. The proposal makes the following original
contributions:
• We propose a semantic interpretation approach that
combines linguistic, structural and contextual knowledge
in order to compare concept hierarchies semantically.
• We propose a mapping algebra that can be of interest
for realizing schema transformations.

¹ WordNet is available at http://wordnet.princeton.edu.

The paper is structured as follows: Section 2 presents
logical constructs for generalization hierarchies. In
Section 3, we present our semantic-based approach for
interoperability and describe the first version of the
SEM-INTEROP algorithm. Finally, Section 4 concludes
the paper and identifies future work.
2 BASICS OF THE APPROACH
Let us first clarify our terminology. In the literature,
we identify four levels of abstraction. At the bottom
level we have actual data (or instances) organized
according to a variety of (semi-)structured formats
(relational tables, XML documents, HTML files,
scientific data, and so on). At the second level we have
schemes, which describe the structure of instances (a
relational schema, a DTD, an XML schema or one of
its dialects, etc.). Then, we have different formalisms
for the description of schemes, which we call models
(e.g. conceptual models such as the ER model or the UML
class diagram). Finally, we use the term metamodel
to mean a general formalism for the definition of
various models. Specifically, a metamodel is made
of a set of metaprimitives. Each metaprimitive
captures a class of constructs of different data
models that share a common characteristic or, more
precisely, that implement, possibly under different
names, the same basic abstraction principle (Torlone
and Atzeni, 2001). Examples of metaprimitives are:
class, attribute, definition domain, relationship,
generalization, disjoint union, key, foreign key, and so on.
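
To make the role of metaprimitives concrete, the following minimal Python sketch lists the four levels and records, for a few metaprimitives, the construct that implements each of them in several models. The table contents and the names `Level`, `METAPRIMITIVES`, `constructs_for` and `same_abstraction` are illustrative assumptions of ours, not part of any cited metamodel.

```python
from enum import Enum


class Level(Enum):
    """The four abstraction levels, from most concrete to most abstract."""
    DATA = 1       # instances: relational tables, XML documents, ...
    SCHEMA = 2     # structure of instances: relational schema, DTD, ...
    MODEL = 3      # formalism describing schemes: ER model, UML class diagram
    METAMODEL = 4  # formalism defining models, built from metaprimitives


# Hypothetical lookup table (illustration only):
# metaprimitive -> name of the implementing construct in each model.
METAPRIMITIVES = {
    "class": {"ER": "entity type", "UML": "class", "relational": "table"},
    "attribute": {"ER": "attribute", "UML": "attribute", "relational": "column"},
    "generalization": {"ER": "ISA relationship", "UML": "generalization"},
}


def constructs_for(metaprimitive):
    """Return the model-specific constructs that implement a metaprimitive."""
    return METAPRIMITIVES.get(metaprimitive, {})


def same_abstraction(model_a, construct_a, model_b, construct_b):
    """True if both constructs implement one and the same metaprimitive."""
    return any(
        names.get(model_a) == construct_a and names.get(model_b) == construct_b
        for names in METAPRIMITIVES.values()
    )
```

Under these assumptions, `same_abstraction("ER", "entity type", "UML", "class")` holds because both constructs are listed under the metaprimitive `class`, even though their names differ.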
Here, we introduce the terms of our problem more
specifically and formally. As conceptual model, we
opt for generalization hierarchies (read in the inverse
direction, specialization hierarchies). We propose a
logical formalism that allows us to represent
heterogeneous hierarchies uniformly.
Definition 1 (Generalization hierarchy) We define
a class hierarchy $H$ as a triple $\langle C, E, \Phi \rangle$:
• $C$ is a finite set of classes, $C = \{c_i\}$. Each class $c_i$
is characterized by a name and a set of attributes,
$c_i = \langle n_{c_i}, A(c_i) \rangle$. Each attribute $a_h \in A(c_i)$, with
$h = 1, \dots, n$, is defined as a pair $a_h = \langle n_h, d_h \rangle$, where
$n_h$ is a name and $d_h$ is the domain associated with $a_h$.
• $E$ is a set of arcs on $C$; for instance, $E$ is a set of
subsumption relationships (ISA relationships) between
classes.
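
As an illustration of the components $C$ and $E$ described so far, the following minimal Python sketch encodes classes, attributes and ISA arcs. It is a sketch under our own assumptions, not the formalization itself: the names `Attribute`, `HClass`, `Hierarchy`, `add_isa` and `ancestors` are hypothetical, domains are represented as plain strings, and the component $\Phi$ is left out because it is not described in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str    # n_h
    domain: str  # d_h, represented here as a plain string (assumption)


@dataclass(frozen=True)
class HClass:
    name: str                            # n_{c_i}
    attributes: frozenset = frozenset()  # A(c_i), a set of Attribute


@dataclass
class Hierarchy:
    classes: set  # C
    isa: set      # E, stored as (subclass, superclass) pairs

    def add_isa(self, sub, sup):
        """Add the arc 'sub ISA sup'; both ends must belong to C."""
        if sub not in self.classes or sup not in self.classes:
            raise ValueError("both classes must belong to C")
        self.isa.add((sub, sup))

    def ancestors(self, c):
        """Return every class reachable from c by following ISA arcs."""
        seen = set()
        stack = [c]
        while stack:
            current = stack.pop()
            for sub, sup in self.isa:
                if sub == current and sup not in seen:
                    seen.add(sup)
                    stack.append(sup)
        return seen


# Usage example with hypothetical classes.
person = HClass("Person", frozenset({Attribute("name", "string")}))
student = HClass("Student", frozenset({Attribute("student_id", "integer")}))
phd = HClass("PhDStudent")

h = Hierarchy(classes={person, student, phd}, isa=set())
h.add_isa(student, person)
h.add_isa(phd, student)
```

Storing $E$ as a set of pairs keeps the encoding close to the definition; the transitive closure computed by `ancestors` is what a reasoner would need in order to interpret the structural position of a class.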
ICSOFT 2006 - INTERNATIONAL CONFERENCE ON SOFTWARE AND DATA TECHNOLOGIES