NexusDSS: A System for Security Compliant Processing of Data Streams

Nazario Cipriani

, Christoph Stach

, Oliver Dörler

and Bernhard Mitschang

Universität Stuttgart, Institute of Parallel and Distributed Systems, Universitätsstraße 38, 70569 Stuttgart, Germany

Steinäcker 54, 73773 Aichwald, Germany

Keywords:

Accessibility of Data, Privacy Policies, Data Stream Processing.

Abstract:

Technological advances in microelectronic and communication technology are increasingly leading to a highly

connected environment equipped with sensors producing a continuous ﬂow of context data. The steadily grow-

ing number of sensory context data available enables new application scenarios and drives new processing

techniques. The growing pervasion of everyday life with social media and the possibility of interconnecting

them with moving objects’ traces, leads to a growing importance of access control for this kind of data since

it concerns privacy issues. The challenge in twofold: First mechanisms to control data access and data us-

age must be established and second efﬁcient and ﬂexible processing of sensible data must be supported. In

this paper we present a ﬂexible and extensible security framework which provides mechanisms to enforce

requirements for context data access and beyond that support safe processing of sensible context data accord-

ing to predeﬁned processing rules. In addition and in contrast to previous concepts, our security framework

especially supports ﬁne-grained control to contextual data.

1 INTRODUCTION

With the rapidly increasing density of mobile phones

equipped with GPS sensors and mobile Internet con-

nections, the usage of data stream processing is in-

creasing in many application areas. A GPS sensor,

for example, continuously produces a potentially un-

bounded stream of measuring points which makes the

use of data stream processing necessary. Application

areas for data stream processing can be found in so-

cial media applications—such as Facebook, Twitter

and Google+—as well as in location-based services

(LBSs). These applications often augment position

information of mobile devices with personal informa-

tion of the user in real time. The beneﬁts of a LBS is

undisputed and already included in many of today’s

smartphone applications.

More and more of our social and private life is per-

vaded by this kind of applications, which on the one

hand delivers a real beneﬁt in everyday life providing

location-based information which one might be inter-

ested in. On the other hand, however, this raises the

question on how to protect this information against

unauthorized access. For the data owner it is of great

importance to express ﬁne-grained access conditions,

deﬁning which data can be accessed by certain enti-

ties and how this data might be processed by, e.g., da-

ta stream processing systems. A majority of users of

LBSs can also be found in social networks. Social

networking is easy to use and information including

personal details and the current position is available

to a wide audience. This creates a variety of usage

scenarios, but at the same time exposes possibly sen-

sitive information to the public.

The upper part of Figure 1 (Application View)

depicts such a LBS, called Friend Finder (FF). This

sample Application reveals the current location of a

user to all of his friends. Mike broadcasts his GPS

data to FF in our scenario. FF then combines his data

with additional information acquired by third-party

data providers, e.g. Google Maps. Similar services

are offered by many of todays social networks such

as foursquare. However, in these services Mike can

share all of his private information with a user or no

information at all. In contrast, our approach goes a

step further: Although, both of his friends Bob and Al-

ice have access to parts of Mike’s data, Alice receives

ﬁltered information only. E.g., while Bob gets Mike’s

accurate location, Alice gets the country where Mike

is at the moment.

In the lower part of Figure 1 (Stream-Processing

View) the participants of this scenario are mapped to

the nodes of a data stream processing system. Mike’s

GPS and the third-party data providers act as data

175

Cipriani N., Stach C., Dörler O. and Mitschang B..

NexusDSS: A System for Security Compliant Processing of Data Streams.

DOI: 10.5220/0004051401750185

In Proceedings of the International Conference on Data Technologies and Applications (DATA-2012), pages 175-185

ISBN: 978-989-8565-18-1

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

Figure 1: Application scenario illustrating the necessity of

access control in data streaming applications.

sources (S

, S

) while Bob’s and Alice’s mobile de-

vices are the data sinks (T

, T

). The FF LBS pro-

cesses this data (e.g. by combining different sources

or by visualizing the data O

) and ensure that

Mike’s privacy rules are respected. In order to enable

these features, the data stream processing system that

performs data integration and data processing must

provide access control to data (Anderson, R. J., 2008).

Furthermore, it must provide a way in deﬁning a ﬁne-

grained process control to data.

A prerequisite to support a wide range of stream-

based applications—including the application sce-

nario sketched above—is that the data stream pro-

cessing system must be open to further applica-

tion scenarios and provide an integration mechanism

for domain-speciﬁc extensions (Cipriani, N. et al.,

2011a). This particularly means that the required

domain-speciﬁc data and processing techniques —in

terms of operators— should be integrated into an ex-

isting system to exploit already existing functionali-

ties and extend the system only where necessary, re-

ducing functional redundancy. For this purpose we de-

veloped NexusDS (Cipriani, N. et al., 2009), an open,

ﬂexible, and extensible data stream processing system

for distributed processing of streamed context data.

NexusDS bases on Nexus (Nicklas and Mitschang,

2004) as it builds upon its particular data management

and extends it by stream processing capabilities. As

context-data is highly privacy-related, also security

control patterns are essential to control data access

and data processing. Thus, we extended NexusDS by

security control patterns to meet the particular require-

ments of context-aware applications.

The openness of NexusDS in combination with the

requirement of a ﬁne-grained data access and a ﬁne-

grained data processing, as described beforehand, is a

big challenge. Appropriate mechanisms must be de-

veloped to allow for both a controlled access to sensi-

tive context data and a controlled processing of it. For

this purpose we extended NexusDS to NexusDS Se-

cure (NexusDSS). We developed an access control

framework for data stream processing systems, which

allows a ﬁne-grained tuning on which data is acces-

sible and how it might be further processed. The

security framework retains the openness and ﬂexibil-

ity of the original data stream processing system and

allows—depending on the desired level of security—

to determine the settings for ﬁne-grained data access

and data processing.

The remainder of this paper is organized as fol-

lows: Section 2 deﬁnes protection goals and provides

a classiﬁcation for access control and processing con-

trol mechanisms. Related work is discussed in Sec-

tion 3 comparing data stream processing systems with

respect to the control mechanism classiﬁcation from

the previous section. In Section 4 the architecture of

the security framework with all its main components

is described and its mode of operation. After that Sec-

tion 5 presents the security framework which consti-

tutes an open and extensible mechanism to integrate

custom access control and process control. The se-

curity policies in a second step are handled by the

data stream processing system at deployment time

and may change at runtime, as described in Section 6.

In Section 7 a summary and an outlook to future work

concludes this paper.

2 PROTECTION GOALS

The deﬁnition of an access control framework for a

distributed data stream processing system is preceded

by a deﬁnition of safety and security requirements to

data. Depending on this set the actual safe process-

ing issues are designed. Section 2.1 ﬁrst describes

the necessary terminology. After that in Section 2.2

the access control classiﬁcation is shown to ensure—

in the context of this work—safe processing by es-

tablishing a security architecture that allows to deﬁne

access conditions and process conditions for context

data.

2.1 Clariﬁcation of Terms

The access control framework distinguishes two dif-

ferent types of participants: Subjects and objects. Sub-

jects represent entities such as users of a system or a

DATA2012-InternationalConferenceonDataTechnologiesandApplications

176

process running in a system. In contrast, objects rep-

resent entities such as ﬁles, database entries, or exe-

cutable code within a system. Subjects access objects.

However, imagine a subject which wants to change

the access conditions of another subject, such as e.g.

the user from Figure 1 limiting the position sharing

to only family members (assuming subjects can be

grouped according to their family membership). In

this case, the former user modifying the access condi-

tions is the subject whereas the latter user represents

the object. Throughout this work we will refer to the

respective entity type being either a subject or an ob-

ject.

2.2 Classiﬁcation of Protection Goals

Information can be protected under the consideration

of different protection goals, which leads to the so-

called protection targets. These protection targets in-

ﬂuence the actual design and functioning of the sys-

tem or process concerned. The classiﬁcation is build

out of four protection target classes which in turn

consist of a variety of targets. The protection target

classes are: Authentication, Access Control, Process

Control, and Granularity Control.

• Authentication: Covers all targets for the reli-

able identiﬁcation of the relevant subjects and ob-

jects which are participating in the system. These

include the authenticity of subjects and objects

which must have the necessary rights to join the

system as well as the action liability, which as-

signs each action to a speciﬁc subject. To make

these actions traceable a storage area for trace in-

formation must be provided.

• Access Control: Covers all targets that play an es-

sential role for access control. Here a target that is

of crucial importance is the data integrity, as this

ensures that objects cannot be changed uncontrol-

lably and therefore guarantee that only subjects al-

lowed to make changes will be able to access this

data. Furthermore, the conﬁdentiality of informa-

tion must be ensured in order to hide information

from subjects who may not be allowed to read this

information.

• Process Control: Covers all targets that inﬂuence

the processing of data. This includes acceptance

of computation environments which are going to

process the data. Assuming a distributed environ-

ment, this protection target deﬁnes the computa-

tion nodes the data might be processed on. Be-

sides this, also data extent is of importance, i.e.,

to deﬁne the amount of data that is available at

one time instant for data processing. By limiting

the data extent a limited view of the current data

window is provided.

• Granularity Control: Granularity control cov-

ers targets which play a role in obfuscating the

original data (object) in order to, e.g., prevent

conclusions to the subject the data originates

from with techniques such as anonymization and

pseudonymisation. Other techniques which be-

long to this protection target class are methods

that add some fuzziness to data in order to hide

detailed information on e.g., the current position,

or aggregate a certain amount of data elements be-

fore delivering it to subsequent operations.

These protection targets build the basis for the

comparison of related work in the following section.

The protection targets also deﬁne the main functional-

ities of our security framework which is the basis of

our system implementation, as shown in Section 5.

3 RELATED WORK

This section introduces some well-known security

concepts in the context of data stream processing sys-

tems (DSPS) and provides a comparison according to

the protection targets raised in Section 2.2. A DSPS is

characterized by an asynchronous and distributed exe-

cution of long running queries. This represents a ma-

jor challenge since, e.g., access policies might change

at runtime which require the use of appropriate mea-

sures in order to ensure changed security policies be-

ing enforced. These changes on the one hand should

not inﬂuence ongoing operations as this would affect

currently running queries negatively. On the other

hand the new policies must be enforced as quickly as

possible while avoiding centralized structures as they

constitute a single point of failure. Table 1 provides a

comparison of well-known concepts in this area w.r.t.

the protection targets presented in Section 2.2. These

concepts are described in detail below.

In the year 2005, Secure Borealis (Lindner, W. et

al., 2005)—which extended Borealis (Abadi, D. J. et

al., 2005)—was one of the ﬁrst DSPS which had an in-

tegrated security concept to control data access. The

security concept is based on a general DSPS architec-

ture out of which additional components that enable

access control were derived. The query processing

in Secure Borealis is performed in a distributed fash-

ion. Communication between the single computation

nodes is encrypted to ensure data integrity and to pre-

vent it from being read by third parties. In contrast to

the query processing, the security concept of Secure

Borealis is based on a centralized structure to enforce

NexusDSS:ASystemforSecurityCompliantProcessingofDataStreams

177

Table 1: System comparison w.r.t. the protection targets authentication, access control, process control, the possibility of

controlling the access granularity of context data, and how the protection targets are enforced by the respective system.

System

Authentication Access Control Process Control Granularity Enforcement

Authenticity Liability Integrity Conﬁdentiality Acceptance Extent

Control

Secure

Role Subject Encryption

Centralized

— — "All or Nothing"

Centralized

Borealis

Control Supervisor

ACStream

Subject Subject Predicate Predicate —

Time-based

"All or Nothing" Rewrite Queries

Windows

FENCE Subject Subject Predicate SS+ Operator — — "All or Nothing"

Rewrite Queries/

Security Punct.

NexusDSS

Role Subject

Encryption/

SI Filter

Computation Parametrizable

LoD Filters

Augment Queries/

Certiﬁcates Node Set Windows Security Punct.

the security policies. This circumstance is a potential

bottleneck and represents a possible single point of

failure since all data ﬁrst has to pass through this com-

ponent before it can be forwarded to subsequent op-

erations or the target. This centralized component en-

forces access control and consists of two parts, Object

Level Security (OLS) and Data Level Security (DLS)

components. The OLS component is active before

runtime of queries and the DLS component is active

during query runtime. The OLS component is linked

to a role model that assigns to each subject (e.g., user)

a speciﬁc role that holds its associated access permis-

sions. Subjects hereby are identiﬁed by a username

and password combination. Based on this informa-

tion the OLS component decides whether a subject is

allowed to access objects (e.g., data). All objects a

subject is not allowed to access are hidden. The DLS

component enforces the security policies at query run-

time and consists of ﬁlters that are applied to the ﬁnal

result of the queries to remove objects (data elements)

from the resulting data streams to which the subject

has no access. This in turn means that the access con-

trol enforcement is performed after the ﬁnal data ele-

ments are determined, i.e. the entire query processing

is done. This strategy might discard costly calculated

data resulting in a waste of resources.

The access controls in ACStream (Cao, J. et al.,

2009) can be deﬁned on a data stream level for data

elements and their attributes. ACStream builds upon

Aurora (Abadi, D. J. et al., 2003). Access control re-

strictions are deﬁned by expressions that describe an

explicit assignment of access rights to certain subjects.

To illustrate this, imagine a data stream holding posi-

tions where each data element has an ID-attribute and

a location-attribute. An expression in ACStream can

deﬁne read access for a subject A, if the ID-attribute

has a certain value or the position is within a deﬁned

quadrant. A special feature of the concept is the pos-

sibility to deﬁne temporal constraints. A temporal re-

striction allows access to data elements which are in a

certain time interval. The start time and end time can

be explicitly deﬁned and the time interval is provided

by deﬁning the absolute time interval size and the in-

terval step size. ACStream enforces access control

by rewriting queries. The rewrite process possibly se-

lects security operators instead of regular operators to

carry out the deﬁned access control constraints. Four

different types of security operators are available: Se-

cure View processes an input stream by applying the

access restrictions and returning a view of the data

stream with only data elements meeting the access re-

strictions. Secure Read operators ﬁlter data elements

and remove attributes, Secure Join operators ﬁlter the

output data streams composed of multiple input data

streams, and Secure Aggregate operators control ag-

gregate functions.

FENCE (Nehme, R. V. et al., 2010; Nehme, R. V.

et al., 2008) exploits security punctuations to enforce

access control to data streams. A security punctuation

is a data element within a data stream that deﬁnes the

access policies on the respective data stream. I.e., the

security punctuations are woven into the original data

streams when access restrictions are to be supported.

E.g., if at some point in time the GPS position stream

is restricted to a certain subset of subjects (users) a

corresponding security punctuation is generated and

is woven into the output stream to tell subsequent op-

erations about the changed access restriction setting.

Security punctuations are implemented by two punc-

tuation types. The ﬁrst type is represented by a data

security predicate which controls access to data ele-

ments. The second type is represented by a query se-

curity predicate which controls access to queries. To

control data access two possible approaches are pro-

posed. The ﬁrst approach is a security ﬁlter approach

which provides the use of the so-called security shield

DATA2012-InternationalConferenceonDataTechnologiesandApplications

178

plus operator (SS+ operator). SS+ operators are inte-

grated into the original query and ﬁlter data elements

according to security punctuations. Here ﬁltering for

security punctuations is directly integrated within the

query processing. The second strategy consists of

rewriting the query and relying on existing operator

implementations. To support the ﬁltering of security

punctuations the query predicates must be rewritten

such that—beside the original predicate condition—

the selection operator ﬁlters out data elements which

do not meet the access restrictions deﬁned by the se-

curity punctuations.

3.1 Discussion

The approaches presented propose interesting fea-

tures and give valuable directions. However, the ap-

proaches are not suitable for NexusDS, our open and

distributed data stream processing system (presented

in Section 5.1). A major problem is query rewrite

since a complete rewrite functionality supposes the

query processor to semantically know all operators

available in the system. NexusDS is open and extensi-

ble, thus allowing arbitrary operators. Even only con-

sidering the rewrite of existing predicates might end

up in unpredictable overhead since changes to secu-

rity policies usually condition a restart of the affected

query. The centralized approach in Secure Borealis

guarantees that the security policies are enforced, but

it is not feasible since it represents a potential bottle-

neck limiting the amount of queries that can run in

parallel. Secure Borealis and ACStream in principle

allow to integrate custom operators into the system.

However, no precautions are taken to prevent uncon-

trolled outﬂow of data. Also the data access granular-

ity is not adjustable to domain-speciﬁc needs. Some

data provider generally permits other subjects to use

its objects (context data such as GPS positions). But

eventually a subject may also want to restrict the ac-

cess and processing of the exact objects to a certain

set of subjects and provide the data to others only less

accurate. Finally, the presented approaches only con-

sider access control mechanisms. However, also the

way data is being processed is of importance. E.g.,

certain data should not cross certain administrative

boundaries and thus processing should be limited to

a certain set of processing nodes.

Our augmentation approach is an extension on

a query graph level—denoted as stream processing

graph (SP-graph) throughout this work—and shows

some important advantages compared to the other ap-

proaches presented:

• The semantics of the operators must not be known

in order to adapt the SP-graph to meet the security

restrictions.

• The presented operator model is ﬂexible in the

way it embeds the operator within a box. Thus

also operators which are not designed to work

with the security framework are still usable.

• Our approach allows to deﬁne and integrate arbi-

trary granularity ﬁlters to adapt the data details

and still allow to process them at the cost of some

reduction in data quality.

In the following sections we present the architec-

ture and the characteristics of our security framework

as well as the augmentation technique, that is realized

as part of the NexusDS stream processing system.

4 SECURITY FRAMEWORK

The security framework in NexusDS builds upon the

approaches presented so far and extends them to meet

the special requirements, given by an open, i.e. ex-

tensible, DSPS. In this section we present the secu-

rity framework for security compliant processing of

streamed data. Therefore, ﬁrst basic assumptions for

the security framework are presented. Thereafter, we

detail on the functional capabilities.

4.1 Basic Assumptions

Each subject interacting with the data stream process-

ing system must be assigned a unique identity. This

means that for each user, operator

, and compute node

a unique identity must be deﬁned. There already ex-

ist a variety of solutions, such as the identiﬁcation

by a combination of username and password. Hence,

we assume here subjects can identify themselves by

a valid username and password combination. Further-

more, all services and operators communicate using

a asymmetric key algorithm. This means data is en-

crypted with the public key of the receiver and only

the receiver is able to read the data by using its pri-

vate key. To ensure liability, it is important to track

all actions a certain subject does. Therefore, the cor-

responding log informations must be stored in a re-

stricted and fail-safe area.

Subjects are associated with a set of security poli-

cies which are managed by a policy management com-

ponent. These policies deﬁne access and process con-

ditions for the subjects and the affected objects. Poli-

cies are divided into the types Access Control (AC),

Operator is referred to as synonymous with either a

source operator, an operator or a target operator is used.

NexusDSS:ASystemforSecurityCompliantProcessingofDataStreams

179

Figure 2: Augmentation principle of the secure framework. The original query formulated by applications is translated to

an equivalent one with access control and process control patterns. In a second step the SP-graph is augmented by the

corresponding constraints deﬁned by the access control patterns and the ﬁlters for the process control patterns. Afterwards,

the augmented SP-graph is ready for deployment by an appropriate deployment algorithm.

Process Control (PC), and Granularity Control (GC),

which are detailed in the following. Each policy type

covers a certain aspect of security according to the

protection goals.

First, we start presenting the different security con-

trol patterns. Afterwards, the underlying approach

to augment the SP-graphs with security policies is

shown. This must be done before deploying the SP-

graph to ensure that all security policies are met.

4.2 Security Control Patterns

Access Control. AC-policies ensure integrity of

data and conﬁdentiality in terms of hiding data that

subjects are not allowed to access and provide a mech-

anism to trace actions of subjects accessing data. Ac-

cess control policies enforce an all or nothing seman-

tic. Thus it ensures that objects are only accessed by

subjects having the necessary permissions. Each AC-

policy is uniquely identiﬁed by a policyID. Optionally,

also public keys might be provided to check the signa-

tures of the related operators and services and thus

verify their authenticity. This corresponds to the pro-

cedure of digital signature (Merkle, R. C., 1989) and

is transparently done by the security framework.

Process Control. PC-policies ensure acceptance of

the execution environments and limit the extent of vis-

ible data. PC-policies apply to nodes thus deﬁning a

set of computing nodes which are allowed to execute

certain operators. Furthermore, it is important to limit

the extent of data, i.e. the currently visible window of

data, for the operators. This way it is possible to con-

trol the quality of aggregates, e.g. such as traces of

mobile objects. PC-policies inﬂuence the placement

of operators as part of SP-graph deployment. There-

fore, we developed and presented a ﬂexible operator

placement strategy (Cipriani, N. et al., 2011b) that

allows to restrict the actual placement according to

given restrictions, such as restricting the placement to

certain nodes by at the same time satisfying certain

QoS-conditions.

Granularity Control. GC-policies basically deﬁne

ﬁlters, that apply to operator slots. A slot unam-

biguously deﬁnes an operator-related input or output.

These ﬁlters might be applied before data is send to

subsequent operators or after data has been received

by this operator. This basically depends on the con-

crete conﬁguration of the DSPS.

4.3 Mode of Operation

The principal mode of operation to augment SP-

graphs is depicted in Figure 2. The starting point is

the original SP-graph shown in the left part of Fig-

ure 2. For each subject–object pairing there is a se-

curity policy deﬁned which must be met. This step

is denoted by the integration process which ends up

in an intermediate SP-graph shown in the middle of

Figure 2. The intermediate SP-graph has AC-policies,

PC-policies, and GC-policies attached to their opera-

tors. Each policy type is shown in different shades of

grey. Integration ensures that relevant security poli-

cies are integrated into the SP-graph. The integration

process consists of three steps: AC-integration, PC-

integration, and ﬁnally GC-integration.

In the AC-integration phase, it is ﬁrst checked for

AC-policies relevant to the subject running the appli-

cation (and thus issuing the SP-graph) which deﬁnes

data access to the data involved into the computa-

tion task. This is denoted by AC-A deﬁning whether

the subject is allowed to access Source A. For exam-

ple, an AC-policy might exist for the third-party data

DATA2012-InternationalConferenceonDataTechnologiesandApplications

180

provider from the application scenario depicted in Fig-

ure 1 which forbids him to receiveresults of the linked

data. Thereafter, the AC policies for AC-B (Operator

B) and AC-C (Sink C) are attached to the SP-graph.

Note that AC-policies for all subjects involved must

be considered in this phase. This especially means

that also AC-policies for operators must be integrated

into the SP-graph.

The second phase, PC-integration, consists of

checking whether PC-policies relevant to the subject

(e.g., a user) are deﬁned stating whether the subject

is allowed to execute the operators of the SP-graph.

Analogous to the AC-integration the respective PC-

policies are attached to the SP-graph. This is denoted

by the different PC-policies depicted in Figure 2. Be-

side the user-related PC-policies there might also ex-

ist operator-related PC-policies that limit the execu-

tion of a certain operator to a concrete set of nodes.

Thus, all existing PC-policies for all involved subjects

(including entities such as users, operators, or nodes)

must be attached to the SP-graph.

The GC-integration denotes the ﬁnal phase be-

fore augmentation starts. Analogously to the previ-

ous integration phases in this phase GC-policies are

attached to the SP-graph. In Figure 2 these are dis-

played as GC-A, GC-B, and GC-C. The GC-policies

describe data transformation techniques to manipulate

the original data such that the involved subjects only

access the granularity of data they are allowed to. GC-

policies represent a reﬁnement of the AC-policies and

PC-policies. At this point the intermediate SP-graph

consists of the original SP-graph with security pol-

icy information for all related subjects and objects at-

tached to it.

After the integration process the augmentation

process translates the intermediate SP-graph to an

augmented SP-graph which is shown on the right

side of Figure 2. First, the security policies must be

checked for consistency before continuing. All AC-

policies and PC-policies must be checked. This es-

pecially means that all senders must check whether

the attached receivers of their data are allowed to ac-

cess the data depending of their attached PC-policy.

GC-policies need not to be checked since they are re-

ﬁnements to the AC-policies and PC-policies to ﬁl-

ter data. The AC-policies map to interconnections

between the involved operators which semantically

mean that a certain operator is allowed to access data

from a previous one. The PC-policies condition the

AC-policy interconnections since they inﬂuence the

selection of computation nodes the operator can be

executed on. The GC-policies map to ﬁlters which al-

low a ﬁne-grained access to the single data streams.

The placement of the ﬁlters may be done in three dif-

ferent ways. Referring to Figure 2 they may be placed

on the sender side









1 , on the receiver side









2 or on

both sides









3 . The actual placement depends on the

receivers attached to a certain output stream. In Fig-

ure 2 this is shown for the combination Source A and

Operator B. In a nutshell,









1 is always preferred and

selected to limit the transferred data volume. This

is beneﬁcial if the receiver is a mobile device and

has stringent energy and bandwidth constraints.









is used if there is more than one receiver attached to

the output stream. Thus, the ﬁlters are executed on the

respective receivers. For









3 we have a work sharing

approach which is selected if Source A and Operator

B are both running on a mobile device or if the ﬁltered

output stream is used by multiple attached operators

which themselves need a dedicated ﬁltering.

5 SECURITY FRAMEWORK

INSIGHTS

After the security pattern discussions we now ex-

plain some internals about the characteristics of

our security framework as well as its implementa-

tion within our ﬂexible stream processing framework

called NexusDS. We introduce the architecture of

NexusDSS (NexusDS Secure) with support for secu-

rity compliant processing of data streams. Finally, we

give some details on the operator framework and its

mode of operation.

5.1 NexusDSS – NexusDS Secure

NexusDS (Cipriani, N. et al., 2009) is an open data

stream processing system designed for processing

context data streams with extensive capabilities for

domain-speciﬁc adaptation and support for the protec-

tion goals stated in Section 2.2. NexusDS has been ex-

tended to also enable secure processing of streamed as

well as static context data, resulting in NexusDSS. Its

enhanced organization is depicted in Figure 3. Nexus-

DSS supports the distributed processing of streamed

context data by execution of SP-graphs on a heteroge-

neous network of nodes. NexusDSS allows a ﬂexible

adaptation of the system functionality through the in-

tegration of customized services and operators. The

SP-graph thereby can be annotated by means of re-

strictions to inﬂuence the concrete deployment and ex-

ecution of the SP-graph. For example, deployment re-

strictions limiting the selection of a physical operator

can be deﬁned in this context. This restriction has an

inﬂuence on the deployment process of the SP-graph.

Besides that, also runtime-speciﬁc restrictions can be

NexusDSS:ASystemforSecurityCompliantProcessingofDataStreams

181

Figure 3: Simpliﬁed architecture of the security concept tar-

geted by this paper.

deﬁned. If an operator has speciﬁc parameters, these

parameters can be adjusted to meet certain conditions.

E.g., if we imagine a domain-speciﬁc rendering oper-

ator the resolution is such a parameter that inﬂuences

the runtime behavior of the operator.

The Core Graph Service (CGS) performs the aug-

mentation of the SP-graph, by inserting e.g. missing

ﬁlters according to the access conditions and process

conditions deﬁned. This task is performed by the help

of the Access Control Service (ACS). Beside others,

the ACS is composed of an Identity Administration

Point (IAP) and a Policy Administration Point (PAP).

The IAP is responsible for identifying all subjects

(e.g., users) and objects (e.g., data) available within

the system. The IAP also provides the Public Key In-

frastructure (PKI) to secure the communication of ser-

vices and the data streams between operators against

unauthorized access. Furthermore, certiﬁcates for op-

erators executed in the secure environment are created

with the PKI. Certiﬁcates are needed to validate if op-

erators are known by the security framework and thus

can be executed. All services and operators interact-

ing with the secure environmentneed a corresponding

key and must be certiﬁed via the IAP. The PAP holds

the policies for accessing and processing data.

The fragmentation of the SP-graph into operator

groups is done by our M-TOP approach (Cipriani, N.

et al., 2011b), a multi target operator placement ap-

proach for heterogeneous environments. M-TOP con-

siders annotations at SP-graph level. These annota-

tions in the original work focused on QoS aspects

such as "latency should not exceed a certain value".

However, the annotations might also be of another

kind such as the security policies, annotated at SP-

graph level to adapt deployment decisions. The sin-

gle fragments are distributed across appropriate Op-

erator Execution Service (OES) which are instances

of computing nodes running a certain execution envi-

ronment for the operators of NexusDSS. Each OES-

instance runs on a different computing node. These

services represent the central components of Nexus-

DSS to process data streams. The Monitoring Ser-

Figure 4: Secure operator which is part of the operator

framework supporting security policies. Dashed arrows in-

dicate control interaction with architectural components and

solid arrows indicates data consumed and produced.

vice (MS) collects runtime statistics for the computing

nodes running the operators and provides hints which

OES-instances to use for each fragment. These statis-

tics are exploited to enhance future placement deci-

sions.

5.2 Security Compliant Operator

Framework

Besides the different architectural entities necessary

for security policy management also the actual data

processing facility must support the notion of secu-

rity policies in order to make the security framework

work. For that purpose we adopt a black-box princi-

ple decoupling the deﬁnition of processing logic (in

terms of operators) and security policies. This facil-

itates the development of operators since developers

can concentrate on the actual processing logic. To

support security policies, also the corresponding AC,

PC, and GC policies must be deﬁned. The SP-graph

is augmented by the security policies valid for the sub-

jects and objects involved in the SP-graph deﬁnition.

To create an environment for safe operator execu-

tion (the same holds for source operators and sink op-

erators) the operators are embedded within a box. The

box provides the execution facilities for operators and

includes decoders, ﬁlters, and encoders. Each of these

entities is associated to an ingoing and outgoing slot.

Also the operator itself is contained in the box and

only receives data that has been manipulated comply-

ing to the security policies.

Figure 4 illustrates the embedding of an operator.

First of all, the box, not the operator, receives all in-

coming data streams









1 . The decoders decrypt all

incoming data streams and signals the encoders when

to add punctuations to the outgoing streams









2 . Then

the box applies the necessary ﬁlters to the incoming

data streams before they are transferred to the oper-

ator









3 and forwards the decrypted and ﬁltered data

to the actual operator performing the operations









4 .

When the operator has ﬁnished doing its job the box

receives the processed data from the operator and en-

DATA2012-InternationalConferenceonDataTechnologiesandApplications

182

Figure 5: Secure source which is part of the operator frame-

work supporting security policies. Solid arrows indicate the

source’s produced data streams.

crypts it by the encoders which also check if a punc-

tuation must be inserted before the data element. Fi-

nally, the data is passed to subsequent operators









5 .

For sink operators, the ﬁgure looks similar but with

the difference that for sink operators there are no out-

puts.

The source operator is depicted in Figure 5 and is

also embedded in a box that carries out all security

relevant operations. However, this does not show in-

puts, since the source produces data. An example for

such a source is the GPS sensor from the introductory

example in Section 1. For source operators after the

data generation process (performed by the embedded

Source)the registered ﬁlters (GC-policies) must be ap-

plied to the respective data streams









1 . As with the

operators described beforehand, different ﬁlters might

be deﬁned for each outgoing data stream. After the ﬁl-

tering the data must be encrypted and signed in order

to prevent manipulation from a third party









2 . The

encrypted data is forwarded to subsequent operators









3 .

5.3 Security Characteristics

It is important to note that for encryption and decryp-

tion of the data streams a symmetric key approach is

used. For each SP-graph instantiated and executed a

separate key is generated. This segregates single SP-

graph instances running, maybe, on the same machine.

A time to live (TTL) period is assigned to each gen-

erated key. Before the TTL is reached a new key is

generated and propagated to the affected SP-graph en-

tities. This might result in a slightly higher overhead

but increases security for long running queries. The

concrete TTL assignment for the keys thus strongly

depends on the runtime of SP-graphs.

All operators (including particularly the source op-

erators and sink operators), whose access should be

controlled, are provided by the Operator Repository

Service (ORS). The good behavior of an operator is

veriﬁed by its associated certiﬁcate. Certiﬁcates are

awarded by a separate certiﬁcate authority that attests

through various checks (code check, veriﬁcation or

test) the respective subject to be safe. Thus, although

no guarantee is given for good behavior nevertheless

a useful degree of control is carried out. The certiﬁ-

cate contains a public and a private key (according to

an asymmetric key approach). The subject—the cer-

tiﬁcate belongs to—is now able to create signatures

to ensure its correct provenance. Therefore, a subject-

related hash value (e.g. the hash value of the opera-

tor) is computed and encrypted with the private key.

This signature is validated each time the operator is

going to be executed. Therefore, again the hash value

is computed and encrypted. The result is compared to

the existing signature. If they are equal execution can

be carried out. Otherwise the subject has been manip-

ulated at some point in time prohibiting the execution

of this subject in the secure environment.

6 DEPLOYMENT AND RUNTIME

According to the mode of operation and the security

control patterns discussed in the sections before, in

this section the implementation of our security frame-

work in the NexusDS system is presented. In this re-

gard, we illustrate the augmentation process by start-

ing with an excerpt of the original SP-graph document

in XML notation, as shown in Listing 1. Here, the ex-

ample in Figure 1 is revived as line 5 – 11 describe

the GPS data source (S

1 <!−− namespace deﬁnitions −−>

2 < xmlns:nsas="http://www.nexus.uni−stuttgart.de/1.0/NSAS"

3 xmlns:eas="http://www.nexus.uni−stuttgart.de/1.0/SNSetup

4 Descriptor/EAS" [ . . . ] >

6 <eas:block>

7 <nsas:value>

8 <eas:blockType>source</eas:blockType>

9 <eas:blockID>ResultSetSource0</eas:blockID>

10 <eas:classURI>urn:java:de.uni_stuttgart.nexus.streamFederation.

11 sources.extended.vispipe.resultSetSource.

12 ResultSetSource</eas:classURI>

13 </nsas:value>

14 </eas:block>

16 [ . . . ]

Listing 1: SP-graph excerpt for the source operator

retrieving data from third-party servers.

The ﬁrst part consists of namespace deﬁnitions.

The second part consists of different sections, deﬁn-

ing the SP-graph structure, including operators, links

and so forth. The displayed part is an excerpt of

the operator deﬁnition section. The code deﬁnes a

source operator named ResultSetSource0 (also being

an unique identiﬁer for this source operator). The

NexusDSS:ASystemforSecurityCompliantProcessingofDataStreams

183

Figure 6: Illustration of the augmentation concept presented. The original SP-graph on the left side is augmented by the

corresponding measurements declared in the AC, PC, and GC policies respectively.

class representing this source operator is deﬁned by

the eas:classURI attribute.

6.1 Augmenting SP-graphs with

Security Policies

Figure 6 illustrates the most important aspects of the

augmentation process as implemented in NexusDSS.

The ﬁgure picks up the introductory example scenario

from Figure 1. The three-stage augmentation of SP-

graphs by AC-policies, PC-policies, and GC-policies

is described in the following:

First, the AC-policies are considered. Both Bob

) and Alice (T

) are allowed to access Mike’s pri-

vate data. Furthermore, also the FF visualization (O

)

must be able to access the data of the previous com-

bine step (O

), since also operators represent a sub-

ject which needs access permissions. Therefore, the

operator-related AC-policies attached to the SP-graph

are evaluated. Additionally, the certiﬁcates of all op-

erators are veriﬁed if a signature is provided.

Thereafter, the PC-policies inﬂuence the place-

ment of operators for SP-graph deployment. The set

of nodes that correspond to the PC-policies is deter-

mined by the operator itself via meta-data. E.g. the

visualization operator (O

) might need a Graphic Pro-

cessing Unit (GPU) to properly run. Additionally,

the PC-policies deﬁne a second set of nodes, that

are needed by an operator in order to process its SP-

graph’s predecessors data. The intersection of these

two sets constitute the set of nodes the operator can

be executed on. Furthermore, also the quality of ag-

gregates can be controlled by PC-policies by limiting

the extend of visible data.

Finally, the SP-graph is adapted according to the

GC-policies. Therefore, the GC-policies are evalu-

ated for each operator and the corresponding ﬁlters

are integrated into the SP-graph, e.g., when Alice (T

)

gets Mike’s coarse location.

At this point the augmentation phase is completed

and the deployment phase starts. The deployment

is out of this paper’s scope. We refer to this step

by pointing to our constraint-aware SP-graph deploy-

ment framework called M-TOP (Cipriani, N. et al.,

2011b). Each SP-graph fragment maps to an Oper-

ator Execution Instance which executes the contained

operators, encoder, decoder, and ﬁlters. The aug-

mented and deployed SP-graph runs and generates

data originating from the two source operators A and

B until the sink operators C and D are reached.

Listing 2 resulted from Listing 1 by adding the

three policy types. Two additional sections are inte-

grated into the SP-graph document: policies and ﬁl-

ters. In our example listing, for the source operator

ResultSetSource0 a policy is added. This source op-

erator corresponds to the source operator A from Fig-

ure 2, representing the source for the personal data on

the mobile device. Each policy has a related opera-

tor, denoted by the blockID attribute. Furthermore,

each policy also has an unique policyID attribute to

uniquely identify the corresponding policy that ap-

plies here. The policy for the source operator A de-

ﬁnes a ﬁlter. This ﬁlter applies to the output slotID

0 of the given blockID. ﬁlterS deﬁnes the signature

of this ﬁlter to be sure the executed ﬁlter ﬁlterURI is

the real one. The attribute user represents the user re-

questing the SP-graph execution. Finally, the attribute

policyID correlated this ﬁlter to the policy deﬁning it.

DATA2012-InternationalConferenceonDataTechnologiesandApplications

184

1 <!−− namespace deﬁnitions −−>

2 < xmlns:nsas="http://www.nexus.uni−stuttgart.de/1.0/NSAS"

3 xmlns:eas="http://www.nexus.uni−stuttgart.de/1.0/

SNSetupDescriptor/EAS" [ . . . ] >

5 <eas:block>

6 <nsas:value>

7 <eas:blockType>source</eas:blockType>

8 <eas:blockID>ResultSetSource0</eas:blockID>

9 <eas:classURI>urn:java:de.uni_stuttgart.nexus.streamFederation.

sources.extended.vispipe.resultSetSource.ResultSetSource</

eas:classURI>

10 </nsas:value>

11 </eas:block>

13 <eas:policy>

14 <nsas:value>

15 <eas:blockID>ResultSetSource0</eas:blockID>

16 <eas:policyID>aff337fe−abcf−4077−bb3c−743f4562b424</

eas:policyID>

17 </nsas:value>

18 </eas:policy>

20 <eas:ﬁlter>

21 <nsas:value>

22 <eas:blockID>ResultSetSource0</eas:blockID>

23 <eas:slotID>0</eas:slotID>

24 <eas:ﬁlterS>kqiR+IjnnRwui4JmPA83zG1hRmxQj [...]

Qwa0Pv02gICiJQgxv382Pw==</eas:ﬁlterS>

25 <eas:ﬁlterURI>urn:java:de.uni_stuttgart.nexus.streamFederation.ﬁlters.

secure.resultSet.ResultSetFilter</eas:ﬁlterURI>

26 <eas:role>poweruser</eas:role>

27 <eas:policyID>aff337fe−abcf−4077−bb3c−743f4562b424</

eas:policyID>

28 </nsas:value>

29 </eas:ﬁlter>

31 [ . . . ]

Listing 2: SP-graph excerpt for the source operator

retrieving data from third-party servers.

7 CONCLUSIONS

With the rapidly increasing number of mobile phones

equipped with GPS sensors and mobile Internet con-

nections, the use of data stream processing is increas-

ing in many application areas. This work presents a

security framework dealing with the requirements of

modern applications relying on the data stream pro-

cessing paradigm. The security framework proposes

different security control patterns, i.e. AC, PC, and

GC, which can be assigned to different system en-

tities. The deﬁned security control patterns are ex-

ploited to ensure a safe processing of sensible data.

This is achieved by augmenting SP-graphs with AC,

PC, and GC policies. By the proposed security frame-

work it is possible to optimally adjust the density of

information that is going to be processed as well as to

limit access to data.

As future work issues we want to further extend

our framework by additional mechanisms when se-

curity policies change during the execution of SP-

graphs. A current problem is that certain security pol-

icy changes—such as the addition of a ﬁlter or the

change of a PC policy—mean to stop current execu-

tion and restart a new modiﬁed SP-graph instance.

REFERENCES

Abadi, D. J. et al. (2003). Aurora: a new model and architec-

ture for data stream management. The VLDB Journal,

12:120–139.

Abadi, D. J. et al. (2005). The Design of the Borealis Stream

Processing Engine. In Second Biennial Conference on

Innovative Data Systems Research (CIDR 2005).

Anderson, R. J. (2008). Security Engineering: A Guide to

Building Dependable Distributed Systems. Wiley Pub-

lishing.

Cao, J. et al. (2009). ACStream: Enforcing Access Control

over Data Streams. In Data Engineering, 2009. ICDE

’09. IEEE 25th International Conference on.

Cipriani, N. et al. (2009). NexusDS: A Flexible and Extensi-

ble Middleware for Distributed Stream Processing. In

Proceedings of the 13th International Symposium on

Database Engineering & Applications.

Cipriani, N. et al. (2011a). Design Considerations of a Flex-

ible Data Stream Processing Middleware. In Proceed-

ings of the 15th Conference on Advances in Databases

and Information Systems.

Cipriani, N. et al. (2011b). M-TOP: Multi-Target Opera-

tor Placement of Query Graphs for Data Streams. In

IDEAS ’11: Proceedings of the 2008 International

Symposium on Database Engineering & Applications.

Lindner, W. et al. (2005). Towards a secure data stream

management system. In in TEAA 2005.

Merkle, R. C. (1989). A certiﬁed digital signature. In Pro-

ceedings on Advances in cryptology.

Nehme, R. V. et al. (2008). A Security Punctuation Frame-

work for Enforcing Access Control on Streaming Data.

In Data Engineering, 2008. ICDE 2008. IEEE 24th In-

ternational Conference on.

Nehme, R. V. et al. (2010). FENCE: Continuous access

control enforcement in dynamic data stream environ-

ments. In Data Engineering (ICDE), 2010 IEEE 26th

International Conference on.

Nicklas, D. and Mitschang, B. (2004). On building location

aware applications using an open platform based on

the NEXUS augmented world model. Software and

System Modeling, 3(4):303–313.

NexusDSS:ASystemforSecurityCompliantProcessingofDataStreams

185