Note that the second sub-query needs the Gene
Name (Gn) from the first one. Thus, the evaluation
must be sequential. Once the results of the first
sub-query are received, the system eliminates
possible inconsistencies and duplicates, and feeds
these results into the second sub-query. Finally, the
system translates the results into ontology instances
and returns them to the user application.
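
The sequential evaluation described above can be sketched as follows. This is a minimal illustration rather than the system's actual implementation: the function names, the record layout and the stand-in data services are all hypothetical, and the final translation to ontology instances is omitted.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def deduplicate(records: Iterable[Record], key: str) -> List[Record]:
    """Keep the first record seen for each distinct, non-empty key value."""
    seen = set()
    unique: List[Record] = []
    for rec in records:
        value = rec.get(key, "").strip()  # normalise stray whitespace
        if not value or value in seen:    # drop empty values and duplicates
            continue
        seen.add(value)
        unique.append({**rec, key: value})
    return unique


def evaluate_sequentially(
    first: Callable[[], List[Record]],
    second: Callable[[str], List[Record]],
    key: str = "Gn",
) -> List[Record]:
    """Run the first sub-query, clean its results, then run the second
    sub-query once per gene name and merge the matching records."""
    cleaned = deduplicate(first(), key)
    merged: List[Record] = []
    for rec in cleaned:
        for hit in second(rec[key]):
            merged.append({**rec, **hit})
    return merged


# Hypothetical stand-ins for two wrapped data services.
def first_subquery() -> List[Record]:
    return [{"Gn": "geneA"}, {"Gn": "geneA "}, {"Gn": "geneB"}, {"Gn": ""}]


def second_subquery(gene_name: str) -> List[Record]:
    table = {
        "geneA": [{"structure": "s1"}],
        "geneB": [{"structure": "s2"}, {"structure": "s3"}],
    }
    return table.get(gene_name, [])


results = evaluate_sequentially(first_subquery, second_subquery)
```

The second sub-query cannot start until the first has returned, because its only input is the cleaned list of gene names.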
4 DISCUSSION
In this paper we describe an architecture for
conceptual mediation based on P2P and Web
Services that presents advantages with
respect to traditional mediator system approaches.
The semantics introduced in the Semantic
Directories allow users to pose more expressive
queries, viz. semantic queries that other mediators
cannot solve, which greatly increases the query
capabilities of this type of mediator. In addition, the
mappings between schemas and the domain
ontology of a semantic directory provide support for
solving these queries over the available resources.
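
As a rough illustration of how such mappings could support a semantic query, the sketch below resolves an ontology concept to the resources that can answer it. The concept names, resource names and schema paths are hypothetical, and expanding a concept to its sub-concepts is an assumed form of semantic reasoning, not necessarily the one used by the directories.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical domain ontology fragment: child concept -> parent concept.
SUBCLASS_OF: Dict[str, str] = {"Enzyme": "Protein", "Protein": "Molecule"}

# Hypothetical mappings: ontology concept -> (resource, schema element).
MAPPINGS: Dict[str, List[Tuple[str, str]]] = {
    "Protein": [("source_1", "/entries/protein")],
    "Enzyme": [("source_2", "/records/enzyme")],
}


def concept_and_descendants(concept: str, subclass_of: Dict[str, str]) -> Set[str]:
    """Return the concept together with all of its direct and indirect sub-concepts."""
    result = {concept}
    changed = True
    while changed:  # repeat until no new sub-concept is found
        changed = False
        for child, parent in subclass_of.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def resources_for(
    concept: str,
    subclass_of: Dict[str, str],
    mappings: Dict[str, List[Tuple[str, str]]],
) -> List[Tuple[str, str]]:
    """List every mapped schema element that can contribute to a query on the concept."""
    hits: List[Tuple[str, str]] = []
    for name in sorted(concept_and_descendants(concept, subclass_of)):
        hits.extend(mappings.get(name, []))
    return hits


protein_sources = resources_for("Protein", SUBCLASS_OF, MAPPINGS)
```

In this sketch a query on "Protein" also reaches the resource mapped to "Enzyme", which a purely schema-level mediator would miss.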
The use of a P2P-style architecture introduces a
high degree of decoupling between wrappers and
mediators, or any other application that could
involve wrappers. The directories supply an easy
way to integrate data sources and open new
directions in dynamic integration. In addition, our
proposal brings further benefits, such as the reuse
of wrapper components, access to data services for
other applications, and flexibility of use. It provides
elements for achieving greater interoperability among
integration systems that cooperate in the same
application domain or that belong to other domains
with which they have certain relationships.
In several domains whose users are not
technology specialists, such as biology, dynamic
integration is a very important issue. In this context,
it is necessary to give users a simple environment for
integrating data without modifying
the mediator’s code or the
integration schema. Using a domain ontology, users
can design queries starting from specific knowledge
that belongs to their field of research. However, the
proposed architecture requires that somebody
implement wrappers and publish them in the
semantic directories.
From our point of view, CBSD technology
(Szyperski, 1998) can help automate the generation of
wrappers, allowing us to configure data source
access as well as to choose appropriate
algorithms for each task. By applying this
technology, users can generate wrappers just by
knowing the resources in which the information can
be found. Note that these resources are well known
to users, since they make use of them in their daily
work.
The use case in biology is evidence of the
suitability of this kind of architecture for the
bioinformatics domain. In particular, the usefulness
of a mediator system is demonstrated by a diverse
set of applications aimed at combining expression
data with genomic, sequence-based and structural
information, so as to provide a general, transparent
and powerful solution that goes beyond traditional
gene expression data clustering.
Our architecture opens new ways to address
interesting issues, such as query decomposition
algorithms, result integration, data service location,
searches in data directories, etc.
5 FUTURE WORK
As future work, we propose several lines of
mediator development. The first is to increase the
level of automation in wrapper creation and to add
to this process the data service generation described
previously, which will further reduce development cost.
Another interesting line stems from studying the
possibility of giving more semantics to these data
services, taking into account service quality,
relations with other domain ontologies, etc. Using
this additional information, we could generate
alternative query execution plans, allowing
applications to choose the most suitable one,
or generate them based on certain features
(e.g., using resources local to the application).
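
One way to picture the choice among alternative execution plans is the sketch below. It is purely illustrative: the quality score, the locality bonus and the averaging rule are assumptions introduced here, not part of the proposed architecture.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ServiceInfo:
    """Hypothetical semantic description of a data service."""
    name: str
    quality: float   # assumed score in [0, 1], higher is better
    is_local: bool   # True if the service is local to the application


def plan_score(plan: Sequence[ServiceInfo], local_bonus: float = 0.2) -> float:
    """Average service quality, with an assumed bonus for local services."""
    if not plan:
        raise ValueError("a plan must use at least one service")
    total = sum(s.quality + (local_bonus if s.is_local else 0.0) for s in plan)
    return total / len(plan)


def choose_plan(
    plans: Sequence[Sequence[ServiceInfo]], local_bonus: float = 0.2
) -> Sequence[ServiceInfo]:
    """Return the alternative plan with the highest score."""
    return max(plans, key=lambda p: plan_score(p, local_bonus))


# Two hypothetical plans that answer the same query.
remote_plan: List[ServiceInfo] = [
    ServiceInfo("remote_a", 0.9, False),
    ServiceInfo("remote_b", 0.8, False),
]
local_plan: List[ServiceInfo] = [
    ServiceInfo("local_a", 0.75, True),
    ServiceInfo("remote_b", 0.8, False),
]

preferred = choose_plan([remote_plan, local_plan])
```

With the default bonus the plan that uses a local service wins; setting `local_bonus=0.0` makes the choice depend on service quality alone.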
In addition, the scalability of this architecture will
make it possible to integrate not only data
services but also semantic directories, enabling
full semantic integration of resources and
interoperability between applications. Thus, we
will provide elements to achieve interoperability
between semantic integration systems that cooperate
in the same application domain or have certain
relations (Semantic Fields). Furthermore, we will
introduce a solution to integrate semantic fields and
obtain better query capabilities.
We plan to study automatic mapping between
schemas and ontologies, building on
previous work (Madhavan, 2002). Such mapping can be
applied to establish correspondences between
retrieved document schemas and directory
ontologies in new systems developed using our
architecture. Finally, we are interested in
establishing a semantic model to define the query
capabilities of data services, which would improve query
planning by adding inferences about query