ple strategy that exploits the large quantity of main
memory that is now provided by almost any personal
computer. Main-memory indices can be constructed
on the fly to support query evaluation. The construction
of indices is based on standard index selection rules
available from any database textbook. For instance, a
hash-based index is used whenever a larger table has
to be joined. In this way we achieve fast performance
of the query processor for relatively large quantities
of data. Furthermore, the user does not need to be
concerned with the details of the query evaluation process.
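The idea of building a main-memory hash index on the fly for a join can be sketched as follows. This is only an illustration in Python, not the Qios implementation: the function names `build_hash_index` and `hash_join`, the representation of tables as lists of dictionaries, and the choice to index the second input are all assumptions made for the example.

```python
from collections import defaultdict


def build_hash_index(rows, key):
    """Build an in-memory hash index mapping each key value to its rows."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def hash_join(left, right, left_key, right_key):
    """Equi-join two tables given as lists of dicts.

    The index over `right` is created on the fly (assumption: the caller
    decides which input to index) and probed once per row of `left`.
    """
    index = build_hash_index(right, right_key)
    joined = []
    for l_row in left:
        # .get avoids inserting empty buckets into the defaultdict.
        for r_row in index.get(l_row[left_key], []):
            joined.append({**l_row, **r_row})
    return joined


# Hypothetical sample data.
orders = [
    {"order": 1, "cust_id": 10},
    {"order": 2, "cust_id": 11},
    {"order": 3, "cust_id": 10},
]
customers = [{"id": 10, "name": "Ana"}, {"id": 11, "name": "Bob"}]

result = hash_join(orders, customers, "cust_id", "id")
```

Each probe is a dictionary lookup, so the join touches every input row once instead of comparing all pairs of rows.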
The paper is organized as follows. The following
section presents the data model used for the representation
of data from Web data sources. Section 3 introduces
the basic operations of the algebra and presents
examples of queries. The main portion of the paper
presents the implementation of the algebra. Section
4 describes the storage manager, parser, query
representation, query optimization and evaluation. Section 5
gives an overview of the empirical results of query
execution in Qios. Finally, concluding remarks are given
in Section 6.
2 ALGEBRA
The data model used for the representation of data
stored at different types of data sources must meet the
following requirements. First, the data sources provide
various types of data including semi-structured
data, XML data, (flat) relational tables, and objects
represented by object-relational database models.
Second, we expect that besides the extensional
data, the data sources will include large amounts of
intensional data describing the structure and the
contents of the extensional databases.
The F-Logic data model was used as the formal
basis of the system (Kifer et al., 1995): it was shown
(Savnik et al., 1999) that it can serve as the basis for
the representation of semi-structured data, it provides
a convenient representation of the relational and
the object-relational database models, it can be used
for the representation of complex objects, and it can
be used very naturally to represent intensional data.
The operations for inquiring about the basic properties
of objects that relate to the representation of
objects are called model operations. Besides the standard
comparison operations =, >, >=, <, <=, the set
operations ∈ and ⊆, and the component selector
operator ".", which are defined in relational algebra, the
algebra includes the following model operations. The
operations ext and exts map class objects to the set
of their members or the set of their instances, respectively.
Next, the operation class_of maps ground
objects to their parent classes¹.
Further, the poset comparison operations ≺_o, ⪯_o,
≻_o and ⪰_o are used to relate objects with respect to the
partial ordering relationship defined among objects.
The operations subcl and supcl map class objects
to the set of their subclasses or superclasses, respectively.
Finally, the operation =~ is defined for searching
text using regular expressions; it behaves
as in the Perl programming language.
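The model operations can be illustrated with a small Python sketch over a toy class hierarchy. This is not the Qios code, and several points are assumptions made only for the example: the dictionaries `SUPER` and `CLASS_OF` and the sample classes are hypothetical; subcl and supcl are taken to return direct subclasses and superclasses; the members of a class are taken to be the ground objects whose parent class it is, while its instances also include the members of all its subclasses; and the helper `matches` stands in for the regular-expression match operator using Python's `re` syntax rather than Perl's.

```python
import re

# Hypothetical schema: class -> set of its direct superclasses.
SUPER = {
    "person": set(),
    "student": {"person"},
    "employee": {"person"},
    "phd_student": {"student"},
}

# Hypothetical ground objects, each with a single parent class.
CLASS_OF = {"ana": "student", "bob": "employee", "eva": "person", "ivo": "phd_student"}


def class_of(obj):
    """Map a ground object to its parent class."""
    return CLASS_OF[obj]


def supcl(cls):
    """Direct superclasses of a class (assumption: not transitive)."""
    return set(SUPER[cls])


def subcl(cls):
    """Direct subclasses of a class (assumption: not transitive)."""
    return {c for c, sups in SUPER.items() if cls in sups}


def ext(cls):
    """Members: ground objects whose parent class is cls (assumed reading)."""
    return {o for o, c in CLASS_OF.items() if c == cls}


def exts(cls):
    """Instances: members of cls and of all its subclasses (assumed reading)."""
    result = ext(cls)
    for sub in subcl(cls):
        result |= exts(sub)
    return result


def matches(text, pattern):
    """Unanchored regular-expression search, like a Perl pattern match."""
    return re.search(pattern, text) is not None
```

For example, under these assumptions `ext("person")` contains only `"eva"`, whereas `exts("person")` contains all four ground objects.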
The declarative operations of the algebra are used for
the manipulation of sets of objects. A detailed
presentation of the model and declarative operations
can be found in (Savnik et al., 1999). The following
groups of declarative operations are defined in the
algebra.
Relational Operations. These operations include the
standard relational operations select, project,
union, differ and join, which are extended for the
manipulation of objects.
Nested-relational Operations. The operation
group(s,a,b) is defined similarly to the SQL
group-by construct. It groups objects from s by the
values of the attributes from the set a. The values of the
attributes that are not in a are stored as the relation
that is the value of the attribute b. The operation
unnest(s,a) is used to unnest a set-valued attribute
a of objects from the argument set s.
Object-restructuring Operations. The operation
collapse(s,a) collapses the tuple-structured
attribute a nested in the objects from the set s. The
operation flatten(s) is used for collapsing the set
of sets s.
Operation apply (Buneman and Frankel, 1979). The
functional operation apply(s,f) is used for the
application of a query expression f to a set of objects s.
This operation is useful for the application of a query
to sets of objects that can be located at different
sites.
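The nested-relational, object-restructuring and apply operations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Qios implementation: objects are modelled as dictionaries, sets of objects as lists (dictionaries are not hashable), unnest is assumed to replace the set-valued attribute by one of its elements per output object, collapse is assumed to merge the fields of the nested tuple into the enclosing object, and the sample data are hypothetical.

```python
def group(s, a, b):
    """Group objects of s by the attributes in list a; the remaining
    attributes of each group are stored as a nested relation under b."""
    groups = {}
    for obj in s:
        key = tuple(obj[attr] for attr in a)
        rest = {k: v for k, v in obj.items() if k not in a}
        groups.setdefault(key, []).append(rest)
    return [{**dict(zip(a, key)), b: rows} for key, rows in groups.items()]


def unnest(s, a):
    """Emit one object per element of the set-valued attribute a."""
    return [{**obj, a: elem} for obj in s for elem in obj[a]]


def collapse(s, a):
    """Merge the fields of the tuple-structured attribute a into each object."""
    result = []
    for obj in s:
        rest = {k: v for k, v in obj.items() if k != a}
        result.append({**rest, **obj[a]})
    return result


def flatten(s):
    """Collapse a set of sets into a single set."""
    return [x for inner in s for x in inner]


def apply(s, f):
    """Apply the query expression f to every object of s."""
    return [f(obj) for obj in s]


# Hypothetical sample data.
emps = [
    {"dept": "db", "name": "ana"},
    {"dept": "db", "name": "bob"},
    {"dept": "ai", "name": "eva"},
]

grouped = group(emps, ["dept"], "staff")
sizes = apply(grouped, lambda o: len(o["staff"]))
restored = collapse(unnest(grouped, "staff"), "staff")
```

In this sketch, unnest followed by collapse undoes group, so `restored` contains the same objects as `emps`.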
3 QUERY EXECUTION SYSTEM
Qios (v0.9) is a system for the manipulation of data
from Internet data sources. The system is intended to
serve as the lightweight kernel of a data manipulation
server. The main aims in the design of Qios are to
provide: capabilities to manipulate collections of data
quickly, various data manipulation functions
from classical querying to data restructuring, and the
¹ A ground object can have a single parent class.
ICEIS 2008 - International Conference on Enterprise Information Systems