operations can lead from facts to information,
like classification (e.g., mapping a blood pressure
value to classes like “high”, “normal”, or “low”) or
aggregation. Prediction functions use application
knowledge (pre-modeled or learned by data mining
algorithms) and derive probable future states from
present facts. Such trends and predictions are
application-relevant information, too.
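The three kinds of operations named above can be illustrated with a minimal Python sketch. The function names, the blood pressure thresholds, and the simple linear extrapolation are assumptions made for illustration only; they are not taken from the described system, and a real prediction function would use a pre-modeled or learned model.

```python
from statistics import mean

# Hypothetical systolic thresholds in mmHg, chosen only for illustration.
def classify_systolic(value, low=90, high=140):
    """Classification: map a raw fact to a symbolic class."""
    if value < low:
        return "low"
    if value >= high:
        return "high"
    return "normal"

def aggregate(values):
    """Aggregation: condense many facts into one piece of information."""
    return mean(values)

def predict_next(values):
    """Prediction: extrapolate the average step between readings by one step.

    Expects at least one reading; a stand-in for a learned model.
    """
    if len(values) < 2:
        return values[-1]
    avg_step = (values[-1] - values[0]) / (len(values) - 1)
    return values[-1] + avg_step

readings = [118, 124, 131, 137]  # facts: successive systolic readings
info = {
    "class": classify_systolic(readings[-1]),
    "mean": aggregate(readings),
    "predicted_class": classify_systolic(predict_next(readings)),
}
```

Here the latest reading is still classified as “normal”, while the extrapolated next value already falls into the “high” class, which is the kind of trend information an application may want to receive.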
When information is only generated within the
application layer, it can hardly be shared between
applications. Thus, we propose to introduce a shared
information layer in the data management system
that can be directly queried by applications, similar
to views in databases.
Situation Layer: On the top layer, we define
situations as a relevant combination of facts and
information that needs to be communicated to
subscribed applications. Typically, only the change
of state is communicated, i.e., at the moment when
the situation occurs. Situations could be modeled as
complex events (i.e., patterns evaluated on basic
events) or as continuous queries (i.e., a query that is
continuously evaluated). When archived, a situation
would become part of the information layer.
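The notification behavior of a situation can be sketched as follows. This is an illustrative sketch under stated assumptions: the class `Situation`, its methods, and the example predicate are hypothetical names, and the sketch reports both the start and the end of a situation as a change of state.

```python
class Situation:
    """Notifies subscribers only when the situation's state changes."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate   # maps current facts/information to a bool
        self.active = False
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def evaluate(self, state):
        """Re-evaluate on new data; notify only on a change of state."""
        now_active = bool(self.predicate(state))
        if now_active != self.active:
            self.active = now_active
            for callback in self.subscribers:
                callback(self.name, now_active)

notifications = []
alarm = Situation(
    "critical",
    lambda s: s["bp_class"] == "high" and s["pulse"] > 100,
)
alarm.subscribe(lambda name, active: notifications.append((name, active)))

for state in [
    {"bp_class": "normal", "pulse": 80},   # not active, nothing sent
    {"bp_class": "high", "pulse": 110},    # situation occurs, notify
    {"bp_class": "high", "pulse": 115},    # still active, nothing sent
    {"bp_class": "normal", "pulse": 90},   # situation ends, notify
]:
    alarm.evaluate(state)
```

Although four states are evaluated, the subscriber receives only two messages, one for each change of state.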
The concept of semantic information layers is
somewhat similar to the well-known concept of
views in databases and can thus help us in similar
ways: First, since application-specific higher-level
information is explicitly modeled and expressed, it is
easier for the application developer to communicate
with the domain expert and to implement new
queries. Second, the layers act as abstraction
levels within the system design, so that the lower-
level processing can be changed without changing
higher-level processing and models.
However, there are also significant extensions to
simple views: we see the need for a much richer set
of operations to express the derivation of higher-
level information, like prediction of future states,
classification, or aggregation. Such operations also
encode application-specific knowledge, represented
in models that are either specified by the domain
expert or are learned by data mining techniques over
persistent data.
3 FEDERATED ARCHITECTURE
From the discussion of the scenario, we see that
there is a need for data management systems that
support both the efficient management of high volumes
of stored data, and the processing support of
high-performance streaming systems. To leverage
the benefits of both systems, we propose a federated
architecture. Note that in future data management
systems, both sides may be integrated into one processing
engine; however, for this, the challenges of a
dual system have to be resolved, too.
Figure 1 shows an overview of the proposed
architecture. Applications can issue continuous
queries or define information models (needed for
classification and aggregation) at the federation
layer.
Here, these queries are transformed into
executable query plans in the underlying systems,
which are a DBMS and a DSMS. At each of the data
processing layers, queries and data can be
exchanged between the two systems.
Note that this is a streaming system; thus, the
query plans are not executed just once, but
deployed/registered to the underlying systems.
Whenever new data arrives, the queries are executed
again with this new data. If the query represents a
continuous query, the new result set is
communicated to the application. If it is a complex
event pattern, the new data is treated as new basic
events, and the systems check whether a new complex
event evaluates to true. Both cases are covered by
the concept of “situations”; thus, the result of every
application query belongs to the uppermost
semantic information layer.
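The register-then-re-evaluate behavior described above can be sketched in a few lines of Python. This is a toy illustration, not the proposed federation layer: `MiniEngine` and its methods are hypothetical names, the data is kept in an unbounded in-memory list, and a complex event pattern is simplified to a Boolean function over the events seen so far.

```python
class MiniEngine:
    """Toy engine: queries are registered once and re-run on every arrival."""

    def __init__(self):
        self.data = []         # all items seen so far (unbounded, sketch only)
        self.continuous = []   # (query, callback) pairs
        self.patterns = []     # (pattern, callback) pairs

    def register_continuous(self, query, callback):
        self.continuous.append((query, callback))

    def register_pattern(self, pattern, callback):
        self.patterns.append((pattern, callback))

    def on_arrival(self, item):
        self.data.append(item)
        # Continuous query: the new result set is always communicated.
        for query, callback in self.continuous:
            callback(query(self.data))
        # Complex event pattern: communicated only when it evaluates to true.
        for pattern, callback in self.patterns:
            if pattern(self.data):
                callback(item)

results, events = [], []
engine = MiniEngine()
engine.register_continuous(lambda d: [x for x in d if x > 100], results.append)
engine.register_pattern(
    lambda d: len(d) >= 3 and d[-3] < d[-2] < d[-1],  # three rising values
    events.append,
)
for value in [95, 101, 99, 104, 110]:
    engine.on_arrival(value)
```

The continuous query delivers a result set after every arrival, whereas the pattern fires only once, when the last three values are strictly increasing.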
Since the registered queries are typically long-
running, query sharing plays a crucial part for
optimizing the performance. For every new query,
an ideal optimizer at the federation component
would recognize which already running query plans
could be re-used. However, since cross-platform
optimizations over complex query plans might be
too expensive or not possible, the semantic
information layers provide another benefit: they
already represent sharable query plans, since every
modeled concept at the information layer comes with a
query plan to derive it. If multiple queries use the
same information concept, the system can re-use this
query plan for both situations.
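One way to picture this kind of sharing is a registry that deploys the plan of an information concept on first use and hands the same plan to every later query. The sketch below is an assumption-laden illustration: `InformationLayer`, `define`, and `use` are hypothetical names, and "deployment" is reduced to a dictionary entry.

```python
class InformationLayer:
    """Keeps one deployed plan per information concept and shares it."""

    def __init__(self):
        self.definitions = {}   # concept name -> function that derives it
        self.deployed = {}      # concept name -> ids of consuming queries

    def define(self, concept, derive):
        self.definitions[concept] = derive

    def use(self, concept, query_id):
        """Return the plan for `concept`, deploying it only on first use."""
        if concept not in self.deployed:
            self.deployed[concept] = []   # stand-in for actual deployment
        self.deployed[concept].append(query_id)
        return self.definitions[concept]

layer = InformationLayer()
layer.define("bp_class", lambda systolic: "high" if systolic >= 140 else "normal")

plan_a = layer.use("bp_class", "q1")   # first use: plan is deployed
plan_b = layer.use("bp_class", "q2")   # second use: existing plan is shared
```

Both queries receive the same plan object, and only one plan is deployed, so the derivation of the concept is computed once and consumed twice.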
4 IMPLEMENTATION
In order to realize the federation layer depicted in
the architecture from Figure 1, we can leverage
existing techniques provided by the underlying systems.
The arrows between the two subsystems indicate
the data transfer that should be supported for
each identified semantic layer. To this end, compatible
operators and DB techniques have to be identified
that allow for resuming the data processing task
coming from the DSMS or the DBMS subsystem, respectively.
DATA International Conference on Data Technologies and Applications