
processing, reducing the resources required for
holding correlated data in memory or on disk. We
use a container-managed environment that
automatically manages correlation sessions, which
enables scalability and significantly decreases the
effort required for developers to utilize them.
The remainder of this paper is organized as
follows. In Section 2, we give an overview of
approaches to event correlation developed so far in
the research community and in industry, and we
discuss our contribution. In Section 3, we introduce
and define correlation sessions. In Section 4, we
present an architecture for a container-based
environment that allows events from various sources
to be integrated and correlated and that
automatically manages the lifecycle of correlation
sessions. Section 5 shows an example of an
application that utilizes correlation sessions for
continuously calculating the cycle time of business
processes. Finally, Section 6 concludes the paper.
2 THE ROLE OF CORRELATION FOR EVENT MANAGEMENT
The emergence of e-business has forced many
organizations to improve operational efficiency,
turning their attention toward real-time business
activity monitoring (BAM) solutions. These
initiatives require enterprise data to be captured and
analyzed in real time from a wide variety of
business applications, operational data stores and
data warehouses. While traditional data integration
approaches, built on top of core ETL (extraction,
transformation, loading) solutions, are well suited for
building historical data warehouses for strategic
decision support initiatives, they do not go far
enough toward handling the challenge of
continuously integrating data with minimal latency
and implementing a closed loop for business
processes. Traditional solutions are optimized for
batch-oriented data integration and make the
assumption that large data sets can be extracted from
various source systems in order to transform and
integrate them into a data warehouse environment.
The correlation of data for these scenarios is a minor
problem since the data integration tools are always
able to access the entire data sets.
However, when it comes to continuous data
integration, where events are propagated and
processed continuously from various source
systems, event data has to be correlated in order to
generate business metrics. A single event carries
very little information about the business process
and is therefore too fine-grained for monitoring
purposes. What is needed is higher-level business
information in the form of representative business
metrics derived from a set of raw events. To
transform raw events into valuable business metrics,
a correlation mechanism is needed that captures the
event data required to calculate a
single business metric. The events of business
processes often have inherent dependencies that
have to be considered during the event processing.
For example, an order process might include
processing steps whose processing times are of
interest to the business. Correlating the events
that indicate when a process activity started and
completed makes it straightforward to calculate
these processing times, as sketched below.
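To make this concrete, the following minimal Java sketch illustrates such a correlation. All class, method, and event names are hypothetical and chosen for illustration only; they are not part of the architecture presented later in this paper. Start and completion events are correlated by order id and activity name, and the processing time is derived as soon as the matching completion event arrives.

import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: correlate START/COMPLETE events of the same activity
// instance (identified by orderId + activity) and derive its processing time.
public class ProcessingTimeCorrelator {

    // A raw event as it might arrive from a source system (hypothetical shape).
    public record ActivityEvent(String orderId, String activity,
                                String type, Instant timestamp) {}

    // Open correlation state: start timestamps keyed by orderId + activity.
    private final Map<String, Instant> openActivities = new HashMap<>();

    // Feed one event; returns the processing time once the matching
    // COMPLETE event arrives, or null while the activity is still open.
    public Duration onEvent(ActivityEvent e) {
        String key = e.orderId() + "/" + e.activity();
        if ("START".equals(e.type())) {
            openActivities.put(key, e.timestamp());
            return null;
        }
        if ("COMPLETE".equals(e.type())) {
            Instant started = openActivities.remove(key);
            if (started != null) {
                return Duration.between(started, e.timestamp());
            }
        }
        return null; // unmatched or unknown event type
    }
}

The map of open activities holds partial state between the arrival of correlated events; this is the kind of state that the correlation sessions introduced in Section 3 are intended to capture and that the container-managed environment of Section 4 manages on the application's behalf.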
Detecting and handling exceptional events also
plays a central role in network management
(Feldkuhn and Erickson, 1989). Alarms indicate
exceptional states or behaviors, for example,
component failures, congestion, errors, or intrusion
attempts. Often, a single problem will be manifested
through a large number of alarms. These alarms
must be correlated to pinpoint their causes so that
problems can be addressed effectively. Many
existing approaches for correlating events originate
from the network management domain. Event
correlation tools help to condense many events,
which individually hold little information, into a few
meaningful composite events.
Rule-based analysis is a traditional approach to
event correlation: rules of the form “conclusion if
condition” are matched against incoming events,
often by an inference engine. Based on the result of
each test and the combination of events in the
system, the rule-processing engine analyzes data
until it reaches a final state (Wu et al., 1989).
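As a minimal illustration of this style (the rule representation and all names below are our own Java simplification, not the mechanism of any of the cited systems), a “conclusion if condition” rule can be modelled as a predicate over the events observed so far together with an action that emits a composite event or conclusion:

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Minimal sketch of rule-based correlation: a rule fires its conclusion
// whenever its condition holds over the set of events observed so far.
public class SimpleRuleEngine {

    public record Rule(String name,
                       Predicate<List<String>> condition,
                       Consumer<List<String>> conclusion) {}

    private final List<String> events = new ArrayList<>();
    private final List<Rule> rules = new ArrayList<>();

    public void addRule(Rule rule) { rules.add(rule); }

    // Each incoming event is added to the working memory and all rules
    // are re-evaluated; matching rules fire their conclusion.
    public void onEvent(String event) {
        events.add(event);
        for (Rule rule : rules) {
            if (rule.condition().test(events)) {
                rule.conclusion().accept(events);
            }
        }
    }

    public static void main(String[] args) {
        SimpleRuleEngine engine = new SimpleRuleEngine();
        // Hypothetical rule: conclude a link failure once both alarms were seen.
        engine.addRule(new Rule("link-failure",
            seen -> seen.contains("ALARM_IF_DOWN") && seen.contains("ALARM_NO_ROUTE"),
            seen -> System.out.println("Composite event: link failure suspected")));
        engine.onEvent("ALARM_IF_DOWN");
        engine.onEvent("ALARM_NO_ROUTE"); // the rule fires here
    }
}

A real inference engine would, of course, avoid re-evaluating every rule against the full event history and would track which rules have already fired; the sketch only illustrates the “conclusion if condition” structure.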
Another group of approaches incorporates an
explicit representation of the structure and function
of the system being diagnosed, providing
information about dependencies of components in
the network (Katzela and Schwartz, 1995) or about
cause-effect relationships between network events.
The fault discovery process explores the network
model to verify correlation between events.
NetFACT (Houck et al., 1995) uses an object-
oriented model to describe the connectivity,
dependency and containment relationships among
network elements. Events are correlated based on
these relationships. Nygate (1995) models the cause-
effect relationships among events with correlation
tree skeletons that are used for the correlation.
InCharge (Yemini et al., 1996) represents the
causal relationships among events with a causality
graph and uses a codebook approach to quickly
correlate events to their root causes. The codebook
approach uses a network model to derive a code – a