questions are answered by combining several data
sources and by performing calculations within the
queries.
The available data sources can be categorized
into the following four domains: political regions,
demography, company data and branch data for
categorizing companies. The original data sources
are collected in six tables. Additionally, political
maps are available as shape-files (ESRI, 1998). The
tables are imported from CSV files, where each row
defines the subject URI, each column the predicate URI
and the cell value the object literal. A cell value can
also be converted to a URI of its own.
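
The import rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the importer used in the project: the namespace `http://example.org/`, the function `rows_to_triples`, the idea of taking the subject from a key column, and the sample row are all hypothetical.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; the paper does not state the URIs it uses.
BASE = "http://example.org/"


def rows_to_triples(csv_text, subject_column, uri_columns=()):
    """Turn each CSV row into (subject, predicate, object) triples.

    Assumption: the value in `subject_column` identifies the row and is
    used to build the subject URI. Every other column header becomes a
    predicate URI, and each cell becomes a literal unless its column is
    listed in `uri_columns`, in which case it becomes a URI of its own.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + "resource/" + quote(row[subject_column])
        for column, value in row.items():
            if column == subject_column or value == "":
                continue  # skip the key column and empty cells
            predicate = BASE + "predicate/" + quote(column)
            if column in uri_columns:
                obj = ("uri", BASE + quote(column) + "/" + quote(value))
            else:
                obj = ("literal", value)
            triples.append((subject, predicate, obj))
    return triples


# Invented sample data, for illustration only.
SAMPLE = "id,name,region\nc1,Acme,R07\n"
TRIPLES = rows_to_triples(SAMPLE, "id", uri_columns={"region"})
```

In this sketch the `region` cell is turned into a URI, while `name` stays a literal, which mirrors the two options described for cell values.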
In this paper, the dynamic data model is explained
using an example from the domain “Company
data”. Companies have provided financial data such as
profit, assets and revenue. The example is based on
querying some attributes of that company
data. In the original input table, revenue is available
as pairs of “year1” and “revenue1”, “year2” and
“revenue2” and so on. “Asset” and “Profit” refer to
the latest year of revenue, which is “year1”.
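
To make the paired-column layout concrete, the following Python sketch collects the numbered “yearN” and “revenueN” columns of one input row into (year, revenue) pairs. The function name `unpivot_revenue`, the lower-case column names and all values are assumptions made for illustration; the paper does not describe such a preprocessing step.

```python
import re


def unpivot_revenue(row):
    """Collect numbered "yearN"/"revenueN" columns into (year, revenue) pairs."""
    indexed = []
    for column, year in row.items():
        match = re.fullmatch(r"year(\d+)", column)
        if match is None:
            continue  # not a "yearN" column
        revenue = row.get("revenue" + match.group(1))
        if revenue is not None:
            indexed.append((int(match.group(1)), year, revenue))
    indexed.sort()  # order by the column index N, so pair 1 comes first
    return [(year, revenue) for _, year, revenue in indexed]


# Invented values, for illustration only.
ROW = {
    "name": "Acme",
    "year1": 2010, "revenue1": 500,
    "year2": 2009, "revenue2": 450,
    "asset": 120, "profit": 30,
}
PAIRS = unpivot_revenue(ROW)
# "asset" and "profit" refer to the latest year of revenue, i.e. "year1".
LATEST_YEAR = ROW["year1"]
```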
2.2 Potential Usage of Ontologies
The source data structure uses several predicates and
subjects for the same thing, in this case pairs of
attributes like “year1” and “revenue1”. One of the
advantages of ontology-based models is the potential
use of reasoners for classifying data. In projects like
HarmonISA (HarmonISA, n. d.), the task is to
classify land types (grassland, forest, sea). The main
task for the reasoner is to classify new data from
different sources into a skeleton ontology based on
its attributes (see Peedell et al., 2005). This is
used to merge data sources and models and to query
the whole system, which contains several models,
with a single query. The query retrieves everything
that fulfills certain criteria, independent of the
reference model.
Another application of reasoners is presented in
(Fallahi et al., 2008, p. 354). In this service-oriented
architecture, the reasoner is used to match
requests to services. Each available service is
modeled in an ontology. The requests, which are
also modeled in the ontology, are more specialized
than the services and are classified into classes by
the inference engine. From the potential services
that could fulfill the request, the best match is
used for the task.
In our system and in the example with company
data, the reasoner could be used to classify
companies into, e.g., small, medium-sized and large
companies depending on certain defined criteria. A
company could be defined as anything that has some
values from the classes “address”, “employees” and
“revenue”. However, in order to do that, the
“address” of a company must not be a literal of the
class “company”, which is what happens when flat
tables are imported unprocessed in their original
form. Instead, it has to be assigned its own class
“address”. A new layer of
hierarchy has to be inserted between the company
URI and the actual address values. The flat data
structure in the database would have to be
normalized as in relational databases, i.e. literals
would have to be transformed to URIs. Otherwise,
an ontology model would consist of
only a few subjects and several predicates
connecting them to objects or in this case to literals.
As long as the data is not processed, this type of
classification does not make much sense here. In
order to still be able to combine the heterogeneous
data sources in flat tables, the dynamic data model
was developed.
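
The normalization step discussed above, which the dynamic data model is meant to avoid, can be sketched in Python as follows. This is a minimal illustration, not part of the described system: the function `normalize_address`, the predicate names `hasAddress`, `street` and `city`, and the shortened identifiers are all hypothetical.

```python
RDF_TYPE = "rdf:type"


def normalize_address(triples, address_predicates):
    """Move address literals from the company onto a separate address node.

    A new layer is inserted between the company and its address values:
    company --hasAddress--> address node --street/city--> literal.
    """
    result = []
    linked = set()  # companies that already have an address node
    for subject, predicate, obj in triples:
        if predicate not in address_predicates:
            result.append((subject, predicate, obj))
            continue
        address = subject + "/address"
        if subject not in linked:
            linked.add(subject)
            result.append((subject, "hasAddress", address))
            result.append((address, RDF_TYPE, "address"))
        result.append((address, predicate, obj))
    return result


# Flat import: every cell is a literal attached directly to the company.
FLAT = [
    ("company/c1", "name", "Acme"),
    ("company/c1", "street", "Main Street 1"),
    ("company/c1", "city", "Springfield"),
]
NORMALIZED = normalize_address(FLAT, {"street", "city"})
```

After this step the address is a resource with its own class, which is the form a reasoner would need. Doing this for every attribute group of every flat table is the processing effort the paragraph above refers to.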
3 CONCEPT
This data model is based on two main concepts: the
creation of artificial classes and the creation of
database queries by combining elements of subject-
predicate relations. If the content of the model is
queried, a new query is generated from the concepts
of the model and run against the actual database. The
main function of the data model is to provide a
flexible way to automatically generate queries for
the database based on the requests of the user.
This is done by defining a meta-ontology, which
consists of the following classes: Class,
SubjectClass, QueryConcept, SubQueryConcept,
Group, AtomQuery. The classes and instances in the
meta-model are completely separated from the
classes and instances in the actual database. The
only thing they share is the RDF-database as a
storage medium. In Figure 1, the database is shown
on the left with a class “Class1”, an instance
“Instance1” and two literals “Literal1” and “Literal2”,
which are connected to “Instance1” via the
predicates “Predicate1” and “Predicate2”.
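
As a rough illustration of this separation, the database side of Figure 1 and the meta-ontology classes can be written as triples held in one store but in two disjoint partitions. This is a sketch under assumptions: the partition names, the `rdf:type` statements (including that “Instance1” is typed as “Class1”) and the `STORE` layout are invented here, and the paper does not say how the two parts are kept apart inside the RDF database.

```python
META_CLASSES = (
    "Class", "SubjectClass", "QueryConcept",
    "SubQueryConcept", "Group", "AtomQuery",
)

# One storage medium, two partitions that do not reference each other.
STORE = {
    # Database side of Figure 1.
    "database": [
        ("Instance1", "rdf:type", "Class1"),  # assumed typing
        ("Instance1", "Predicate1", "Literal1"),
        ("Instance1", "Predicate2", "Literal2"),
    ],
    # Meta-ontology side: only the class names listed in the text.
    "meta": [(name, "rdf:type", "rdfs:Class") for name in META_CLASSES],
}


def subjects(partition):
    """Return the set of subjects used in one partition of the store."""
    return {subject for subject, _, _ in STORE[partition]}


# No subject appears in both partitions.
SHARED = subjects("database") & subjects("meta")
```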
Within the meta-ontology, on the right side of
Figure 1, instances of Class are created (meta-ontology
classes are written in italics). They represent
subdomains in the database and are usually derived
from the predicates in the database, i.e. the
predicates in the actual database are transferred into
instances of Class in the model. The instances of
Class are independent of the real classes in the
database (on the left in Figure 1). In Figure 1, the real
ICAART 2012 - International Conference on Agents and Artificial Intelligence