INTRUSION TOLERANCE AS A SERVICE

A SLA-based Solution

Massimo Ficco and Massimiliano Rak

Dipartimento di Ingegneria dell’Informazione, Seconda Universit

a di Napoli, Aversa, Italy

Keywords:

Intrusion Tolerance, Service Level Agreements, SLA, Cloud, Denial of Services.

Abstract:

Among the incredible number of challenges in Cloud Computing two of them are considered of great rele-

vance: Service Level Agreement management and Security management. In this paper we will try to show

how it is possible, using a cloud-oriented API derived from the mOSAIC project, to build up an SLA-oriented

cloud application which enables the delivery of security solutions as a service. We will focus on intrusion tol-

erance solutions, i.e., systems which grant that a system maintain a (limited) availability even when a security

attack take place.

1 INTRODUCTION

Cloud Computing is an emerging reality. Following

the NIST deﬁnition, Cloud computing is “a model for

enabling ubiquitous, convenient, on-demand network

access to a shared pool of conﬁgurable computing

resources (e.g., networks, servers, storage, applica-

tions, and services) that can be rapidly provisioned

and released with minimal management effort or Ser-

vice Provider interactions [...]”. The key idea is the

delegation to the network of every kind of resources,

that can be obtained and managed on a self-service

based approach and with the perception of inﬁnite

amount of available resources.

At the state of art, the most diffused actual solu-

tions focus on the so-called Infrastructure as a Ser-

vice (IaaS) Providers, which mainly offer virtual ma-

chines (and storage systems) following the as a ser-

vice paradigm. This kind of services are offered

now by both big players, like Amazon, IBM and Mi-

crosoft, and by little providers, like Rackspace and

GoGrid. Such resources are offered to Cloud users on

the basis of a pay-per-use model: the resources are

payed only for their effective usage (CPU active time,

amount of data stored, etc).

In such environment, a well-known diffused prob-

lem is related to the security features that can be of-

fered (and granted) by such Cloud Providers: are the

offered virtual machines protected against security at-

tacks (like Denial of Services)? Due to the pay-per-

use business model these problems are really relevant:

who pays for resources consumption due to a security

attack?

In order to deﬁne a clear agreement between re-

sponsibility assignment, a lot of interests are assum-

ing the Service Level Agreements (SLAs). An SLA is

an agreement between a Service Provider and a cus-

tomer, that describes the Service, the service level tar-

gets, and speciﬁes the responsibilities of the Provider

and the customer. SLAs aim at offering a simple and

clear way to build up an agreement between the ﬁ-

nal users and the Service Providers in order to es-

tablish what is effectively granted in terms of qual-

ity. From user point of view, a SLA is a contract that

grants him/she about what he/she will effectively ob-

tain from the service. From Cloud Providers point of

view, SLAs are a way to have a clear and formal def-

inition of the requirements that the application must

respect. SLAs are a way to formalize the agreements,

but no stable solutions exists today in order to monitor

and enforce the respect of such agreements, even if a

lot of research effort is spent today in such problems.

In this paper we will try to show how it is pos-

sible, using a Cloud-oriented API derived from the

mOSAIC project (mOSAIC Project, 2010; van Sin-

deren F. Leymann et al., 2011; Massimiliano Rak,

2011), to build up a SLA-oriented Cloud application,

which enables an IaaS Cloud Provider to offer secu-

rity service customized on user needing on the top of

the resources delivered.

The problem on which we focus is to offer, in a

transparent way, a service that is ables to offer the typ-

ical IaaS services (mainly Virtual Machines delivery,

starting and stopping) enriching them with ad-hoc so-

375

Rak M. and Ficco M..

INTRUSION TOLERANCE AS A SERVICE - A SLA-based Solution.

DOI: 10.5220/0003941003750384

In Proceedings of the 2nd International Conference on Cloud Computing and Services Science (CLOSER-2012), pages 375-384

ISBN: 978-989-8565-05-1

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

lutions for protecting the delivered resources against a

set of security attacks. The goal is to enable the Cloud

user to negotiate with the provider the level of secu-

rity offered, so that the service offered (for example

a Web server) will be protected against a given set of

attacks. The user pays for the additional security ser-

vice, but he/she is granted that the services is tolerant

to a given set of Denial of Services attacks (i.e., con-

tinues to work even under attack) and the additional

load generated is not charged.

The remainder of this paper is organized as fol-

lows: next section is dedicated to clarify the approach

adopted to solve the problem. Section 3 is dedicated

to intrusion tolerance technologies and offers a basis

for the description of the Intrusion Tolerant solution

adopted in our proposal, described in Section 4. Sec-

tion 5 describes the technology we adopted in order

to build up the solution, while Section 6 describes our

Cloud application that enables the offering of SLA ne-

gotiation and enforcement. Section 7 describes the

ﬁnal integrated solution and the offered SLAs. Sec-

tion 8 summarizes the related work and Section 9 con-

cludes the paper.

2 THE PROPOSED APPROACH

The problem we focus on this paper is the design and

implementation of a system able to offer Intrusion

Tolerance (IT) solutions as a Service. The key idea

is that the approach of offering security solutions as a

Service can be better obtained following a SLA-based

approach: ﬁnal user invokes the services always in the

same way, but, due to the integration of the system

together with an SLA-based application, the services

offered can be enriched with security mechanisms.

In order to better illustrate the problem, Figure 1

shows the approach we proposed: the Cloud Provider

offers typical Infrastructure as a Service (IaaS) func-

tionalities, offering virtual machines following a well

known interface. In the example presented, the Cloud

Provider offers, among the list of available images,

a set of images offering pre-conﬁgured Web Servers:

the user, invoking the provider’s services, is ables to

startup a Web Server, that he is able to customize

and/or enrich with given functionalities.

Thanks to the adoption of the mOSAIC Frame-

work (see section 5), the provider is ables to offer

more complex services on the top of the virtual ma-

chines. In the proposed example, the provider offers

SLA-based application, which enables the ﬁnal user

to negotiate, trough the WS-Agreement standard de-

scription, the quality of services delivered.

In this paper, we will show that it is possible even

Figure 1: Intrusion Tolerance as a Service with SLA-based

approach.

to negotiate security-oriented parameters. In our case

study, we will show that the delivered machine will

be enriched, after the negotiation, with IT techniques,

that grant the user against some of possible Denial of

Services (DoS) attacks.

As shown in Figure 1, the user invokes exactly the

same service, but, due to the SLA negotiation process,

instead of the binding to the unprotected Web Server,

he will receive the binding to a VM, which is enriched

with IT functionalities. Service invocation does not

change and from ﬁnal user, obtaining a standard ma-

chine or a protected one is completely transparent.

Moreover, trough the SLA-based application, it is

possible to help Cloud Provider and user to agree on

the details of the security granted, identifying the fea-

tures that should be offered.

3 INTRUSION TOLERANCE:

DEFINITIONS, METHODS AND

TECHNIQUES

Intrusion Tolerance is the ability of a system to con-

tinue providing (possibly degraded) adequate service,

despite the presence of malicious faults, i.e., delib-

erate attacks on the security of the system by both in-

siders and outsiders. In order to enforce the IT several

techniques can be used.

Replication is the technique most commonly used

to perform IT. It consists to use more replicas of the

same component and use speciﬁc voting algorithms,

which are used to resolve any difference in redundant

responses and to arrive at a consensus result based on

the responses of perceived non-compromised compo-

nents in the system. It has two complementary goals:

masking of intrusions, thus tolerating them, and pro-

viding integrity of the data. Examples of algorithms

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

376

Figure 2: Intrusion tolerant architecture.

are Byzantine replication algorithms (P. Kouznetsov

and Druschel, 2006).

Using a rejuvenation approach, critical compo-

nents are periodically rejuvenated to remove the ef-

fects of malicious and intelligent attackers that ﬁnd

ways to compromise them. Example of rejuvena-

tion procedures could aim at loading a clean version

of the application or change the cryptographic keys

(Marsh and Schneider, 2004; N. F. Neves and Veris-

simo, 2006).

Redundancy is an approach different from replica-

tion, which is just one type of redundancy. Replicated

components are pure replicas of each other. If the

attacker has found a technique to subvert one com-

ponent and all are pure replicas, it is likely that all

components are likewise vulnerable. To combat this,

another common technique used is ‘diversity’. Di-

versity is the property such that the redundant com-

ponents should be substantially different in one or

more aspects, from hardware diversity and operat-

ing system diversity, to software implementation di-

versity. Therefore, through the use of diversity, the

probability of a replica being compromised is inde-

pendent of the occurrence of intrusion in other repli-

cas (A. K. Caglayan and Eckhardt, 1989; R. Mista and

Medidi, 2002).

Indirection allows designers to insert protection

barriers and fault logic between clients and servers/-

components that provide services. Since the indirec-

tion is hidden outside of the black box system, clients

see only what looks like a COTS server. There are at

least four main types of indirection used by IT sys-

tems: proxies, wrappers, virtualizations, and sand-

boxes.

Several previous techniques, commonly named

proactive, aim at preventing system components be-

ing compromised. Reactive techniques aim at mitigat-

ing and reacting to intrusion. For example, they aim

at minimizing stolen resources and disable inappro-

priate information ﬂows (e.g., through roll back and

roll forward) to react/mitigate intrusion impact on the

system. Intrusion detection and correlation mecha-

nisms can be used to detect intrusion and identify the

speciﬁc recovery action (Ficco, 2010) . For example,

in replication, it could force the recovery of a replica

that is detected or suspected of being compromised.

Reconﬁguration can be proactive or reactive and

can help in prevention, elimination as well as toler-

ance. A wide variety of reconﬁguration strategies are

employed (D. Heimbigner. and Wolf, 2002). A chal-

lenge in devising the reconﬁguration mechanisms is

to protect them from being (mis)used by the attacker.

It is important that the reconﬁguration process be not

very predictable by the attacker. Therefore, a major

challenge is to make them unpredictable and resilient

to oscillations due to transient and malicious effects

that may lead to reconﬁgurations that drive the sys-

tem to an inconsistent state.

4 AN INTRUSION TOLERANT

SOLUTION

As case study, in this paper we adopt a Intrusion Tol-

erant reactive technique. The proposed IT architec-

ture is composed of two subsystems, with distinct

properties (Figure 2).

The ﬁrst subsystem is the Application VM that

hosts the application to protect, whereas the second

subsystem, named ITmOS, is the VM that hosts the

IT mechanisms. The two subsystems are connected

through a secure channel isolated from other connec-

tions.

The interaction of the application with the outside

world is done only through the network, using a Proxy

(based on running Squid Web proxy (Squ, )) hosted

on the ITmOS VM.

A VM Monitor monitors the Application VM. It

is a Java-based component based on Ganglia-Gmond,

which is a real-time monitoring system (Gan, ). It is

used to collect system resources consumption (includ-

ing CPU, memory, disk).

INTRUSIONTOLERANCEASASERVICE-ASLA-basedSolution

377

Figure 3: CPU consumption with different message fre-

quencies.

An Intrusion Detector module collects data from

the Proxy and alerts the Decision Engine component

whether an anomalous behavior is observed. The De-

cision Engine is a centralized engine, which receives

and correlates security data. It determines whether

the monitored data are malicious behaviors, and what

the effects on the monitored subsystem. It is respon-

sible to identify the best reaction to take, in order

to mitigate the attack effects on the target applica-

tion. In particular, it analyzes the received data and

performs the reactions by the Proxy in response to

the attacks, ﬁltering messages to the guest system

as needed. The Intrusion Detector is connected to a

Cloud Agent, which provide the necessary facilities to

interact with the VM Monitor through a secure com-

munication channel.

4.1 An Example of Intrusion Tolerance

Approach for Denial of Service

Attacks to Web Server

In our previous work (Ficco and Rak, 2011), we

present an IT approach for Denial of Service attacks

to Web Server, which exploits XML vulnerabilities.

We focus on the Deeply-Nested XML DoS attacks

(XDoS), which exhaust the computational resources

of the target system by forcing the XML parser within

the server application to process numerous deeply-

nested tags. In particular, the attack consists of in-

serting of a large number of nested XML tags in the

Web messages.

Figure 3 shows the CPU consumption depending

on the number of nested XML tags and the frequency

with which the malicious Web messages are injected.

We perform different attack scenarios. Each scenario

consists of a sequence of messages injected with a

ﬁxed frequency and a ﬁxed number of nested tags.

An attack scenario takes about 30 seconds. In par-

ticular, Figure 3 represents the average value of CPU

for the different scenarios, which are performed using

different message frequencies and with a number of

tags nested to different depths. The experiment shows

that it is sufﬁcient to inject messages with about 3000

nested tags every 100 ms, to make unavailable the Ap-

plication VM (i.e., to exhaust the CPU resource).

4.1.1 XDoS Attack Reaction

The implemented solution allows to trigger a speciﬁc

reaction to mitigate the effects of the XDOS attacks.

In order to detect the attack, an anomaly-based

monitoring approach is adopted that assigns a weight

to the anomalous events detected. The weight reﬂects

the anomaly levels with regard to an established pro-

ﬁle. If this weight does not exceed a threshold esti-

mated during a training phase, the event is discarded

(i.e., it is considered as a normal behavior), otherwise

an alarm is triggered and a reaction is activated.

In the proposed architecture (Fig. 2) the Deci-

sion Engine correlates the events generated by both

the VM Monitor hosted on the Application VM and

the Intrusion Detector.

As presented in (Ficco and Rak, 2011), on the

occurrence of an excessive CPU consumption, if an

anomalous number of nested XML tags with respect

to a normal proﬁle is monitored, a reaction in trig-

gered. In particular, the Decision Engine alerts the

Proxy, which ﬁlters each Web request that contains a

number of nested tags greater than a ﬁxed threshold.

The purpose of this action is to reduce the CPU load

on the Application VM, thus reducing the period in

which the Web Server is unavailable.

In order to evaluate the proposed solution, a

workload based on TPC Benchmark W (TPC-W) is

adopted (TPC, ). It is a transactional Web bench-

mark. The workload is performed in a controlled

environment that simulates the activities of a busi-

ness oriented transactional Web Server. It simulates

the execution of multiple transaction types that span

a breadth of complexity. Multiple Web interactions

are used to simulate the activity of a retail store, and

each interaction is subjects to a response time con-

straint. The performance metric reported by TPC-W

is the number of Web interactions processed per sec-

ond (WIPS). It is used to simulated stress load and to

assess the effectiveness of the proposed solution by

the WIPS measurements.

An example of recovery effect is shown in Figure

4. It represents the WIPS and CPU variations with re-

spect to the time, during an interval time of tree min-

utes.

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

378

The experiment consists of tree temporal win-

dows. During the ﬁrst two windows, the Decision

Engine is disabled. In particular, the ﬁrst 60 seconds

show the values of the WIPS and the CPU load in

absence of the attack. The second 60 seconds show

the attack effects. We injected malicious messages

with 3000 nested tags every 200 ms. During this pe-

riod, the CPU load is about 100% and the number of

TPC-W interactions processed is very low. Finally,

during the last 60 seconds, the Decision Engine is en-

abled. When the Decision Engine detects the condi-

tion of a reaction (i.e., the attack is in progress and

the CPU consumption exceed the 90%), it alerts the

Proxy, which ﬁlters all the suspicious messages. The

reaction is applied until the CPU load falls below the

90%.

Figure 4: WIPS evaluation during a intrusion recovery pro-

cess.

5 mOSAIC: DEVELOPMENT OF

DISTRIBUTED CLOUD

APPLICATIONS

mOSAIC aims at offering a simple way to develop

Cloud application. The target user for the mOSAIC

solution is a developer (mOSAIC User). In mOSAIC

a Cloud application is modeled as a collection of com-

ponents that are able to communicate each other and

consume Cloud resources (i.e., resources offered as a

service from a Cloud Provider). Cloud applications

often are offered in the form of So ftwareasaService

and can be accessed from users different from the

mOSAIC developer. Users different from the mO-

SAIC User, which uses a Cloud application are de-

ﬁned FinalUsers.

The mOSAIC solution is a framework composed

of three independent components: Platform, Cloud

Agency and Semantic Engine. The ﬁrst one (mOSAIC

Platform) enables the execution of application devel-

oped using mOSAIC API; the second one (Cloud

Agency) acts as a provisioning system, brokering re-

sources from a federation of Cloud Providers; the last

one (Semantic Engine) offers solution for reasoning

on the resources needing and the application needing.

For the needing of this paper, we need concepts re-

lated to Platform and Cloud Agency, while Semantic

Engine will not be focused in this context.

mOSAIC solution can be adopted in three dif-

ferent scenarios: (i) when a developer wants to de-

velop an application, which runs independently from

the Cloud Providers, (ii) when an IaaS Provider aims

at offering enhanced services, and (ii) when a single

user (like a scientist) is interested to start his own ap-

plication in the Cloud for computational purposes. In

this paper, we will focus on the second scenario (the

one dedicated to a Provider).

The approach adopted is pretty simple in this case:

the Provider starts up a set of its resources dedicated

to management of the Cloud application and hosting

the mOSAIC Platform. On the top of the Platform,

the Provider runs its own mOSAIC Application that

directly interacts with the IaaS service offered in order

to grant added value services.

5.1 Programming with mOSAIC

A mOSAIC Application is deﬁned as a collection of

interconnected mOSAIC Components. Components

may be offered by the mOSAIC Platform (i) as COTS

(Commercial off-the Shelf) solutions, i.e., common

technologies embed in a mOSAIC component, (ii)

as Core Components, i.e., tools offered by mOSAIC

Platform in order to perform predeﬁned operations,

or (iii) new components developed using mOSAIC

API. In last case, a component is a Cloudlet run-

ning in a Cloudlet Container (van Sinderen F. Ley-

mann et al., 2011). mOSAIC Components are inter-

connected trough communication resources, queues

or other communication system (i.e., socket, web ser-

vices, etc.). The mOSAIC Platform offers some queu-

ing system (rabbitmq and zeroMQ) as COTS compo-

nents, in order to enable component communications.

Moreover, mOSAIC Platform offers some Core Com-

ponents in order to help Cloud application to offer

their functionalities as a service, like an HTTP gate-

way, which accepts HTTP requests and forward them

on application queues. mOSAIC Components run on

a dedicated virtual machines, named mOS (mOSAIC

INTRUSIONTOLERANCEASASERVICE-ASLA-basedSolution

379

Operating System), which is based a very small Linux

distribution. The mOS is enriched with a special mO-

SAIC Component, the Platform Manager, which en-

able to manage set of virtual machines hosting the

mOS as a virtual cluster on which the mOSAIC Com-

ponents are independently managed. It is possible

to increase and decrease the number of virtual ma-

chines dedicated to the mOSAIC Application, which

will scale in and out automatically.

The Cloud application is described as a whole in

a ﬁle named Application Descriptor, which lists all

the components and the Cloud resources needed to

enable their communications. A mOSAIC developer

has the role of both develop new components and

write Application Descriptors, which collect them to-

gether. mOSAIC API, actually based on Java, en-

ables the development of new components (in the

form of Cloudlets), which self-scale on the above de-

scribed Platform and consume every kind of Cloud

resources like queues, NO-SQL storage system (like

KV store and columnar databases), independently

from the technologies and API they adopt, trough a

wrapping system.

5.2 mOSAIC Offering for SLA-based

Applications

mOSAIC offers a set of features dedicated to SLA

management. In (Massimiliano Rak, 2011), we pro-

posed a ﬁrst outline of the full architecture, while in

(Rak et al., 2011) we proposed a similar case study in

which we presented an application that offers security

access conﬁgurations to a GRID environment in terms

of SLA. The components offered in the SLA architec-

ture should help the application developer to imple-

ment an SLA-based architecture. The main modules

offered are :

• SLA Agreement Management: This module con-

tains all the Cloudlets and components that man-

age the SLA documents and their formal manage-

ment, i.e., negotiation protocols, auditing, and so

on.

• SLA Monitoring System: This module contains all

the Cloudlets and components needed to detect

the warning conditions and generates alerts about

the difﬁculty to fulﬁll the agreements. It should

address both resource and application monitoring.

It is connected with the Cloud Agency.

• SLA Enforcement System: This module includes

all the Cloudlets and components needed to man-

age the elasticity of the application, and modules

that are in charge of making decisions in order to

grant the respect of the acquired needed to fulﬁll

the agreements.

In this context, we are interested mainly in the

SLA management module, which offers the func-

tionality to automatize the SLA negotiation process.

The main component dedicated to SLA management

is the SLAgw, offered by the mOSAIC framework

as a standalone component. SLAgw component of-

fers an incredibly simple way to interact with ﬁnal

users in a SLA-based manner: they will negotiate the

agreement. The SLAgw sends out a message to the

mOSAIC Application each time a new agreement re-

quests take place. At the state of art, we support the

asynchronous WS-Agreement negotiation (i.e., the

users agreement requests always receive a wait reply,

then it is up to the user to query for the agreement

status and obtain an accept or a refuse). In future

work, we will support even the synchronous negoti-

ations and more evolved protocols.

The SLAgw component does not assume deci-

sions about the submitted SLA, it just forward them

internally to a mOSAIC Application and store the

SLA in a shared KV store.

6 SLA-BASED CLOUD

APPLICATION FOR IT

MANAGEMENT

In our case study, the Provider aims at developing

a Cloud application that offers SLA negotiation fea-

tures, following the SLA model proposed in mO-

SAIC, in order to offer enriched services to Cloud

users. The negotiation has the effect of enabling the

IT techniques adopted.

The application we aims at offering offers mainly

two different use cases:

• SLA Negotiation: that enables an Final User to ne-

gotiate a SLA identifying the security of the ser-

vices offered. Trough the negotiation process, it

is possible to enforce the security mechanisms de-

scribed in Section 4.

• VM Delivery: that just delivers a virtual machine

to the Final User. In this case, the application just

forwards the request to the underlying IaaS infras-

tructure in order to offer the service to the Final

User. Note that, depending on the SLA negoti-

ated, the application may start different number

of VMs for the same request done from the Final

User.

The SLA negotiation follows the SLA model pro-

posed in mOSAIC, brieﬂy described in Section 5 and

based on SLAgw, while the VM delivery is done sim-

ply forwarding the requests to the underlying IaaS

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

380

Figure 5: Overal architecture of the IT SLA-based services.

Provider in the case of non-protected requests, instead

performing a set of operation when an IT system is re-

quired.

Figure 5 describes the global architecture of the

proposed solution. As shown, the IaaS offers its ser-

vices as usual to its own Final Users (the upper re-

quests), but it is possible to offer the same services

even trough the newly developed mOSAIC Applica-

tion. In the latter case, the requests may have or not

the same format of the underlying provider. In the ex-

ample, the requests are done with a very simple Rest-

ful interfaces trough a JSON request attached to an

HTTP POST invocation. When the request arrives, it

is interpreted by the mOSAIC Application that per-

forms the local requests in order to start all the VMs

needed (i.e., the standard Web Server or the solution

presented in Section 4, which includes both the Web

Server and the IT Proxy connected trough a dedicated

virtual network).

The mOSAIC Application is fully described in

Figure 6, where all the components involved in the

SLA negotiation and SLA enforcement are proposed.

Figure 6 shows the main components of the mOSAIC

Application, following the architecture proposed in

(Massimiliano Rak, 2011). Our mOSAIC Applica-

tion consists of four components:

• SLAgw that receives the WS-Agreement, stores it

in the local storage (signing it in state pending),

and forwards it to the decision Cloudlet;

• SecDecision that evaluates if the SLA is accept-

able (i.e., if the user can assume that role and ac-

cess that container). The Cloudlet has the role of

updating the SLA status (as an example, signing it

Figure 6: SLA-based mOSAIC Application.

as accepted or refused) and in case it is accepted,

it forwards a request to SecConﬁgurer;

• SecConﬁgurer receives the requests from SecDe-

cision and update a KV store in which there is

signed the security level agreed for each user. In

this paper, we just assume hat we offer two secu-

rity level for a Web Server: unprotected and pro-

tected. However, it is possible to enrich the offer

with a lot of different solutions

• Request Interpreter receives the user requests,

evaluate the requests, extracting the service re-

quested, checks the KV store in order to identify

the SLA agreed, and then performs the request to

the local IaaS Provider. In case of unprotected re-

quest, just start the Application VM, while in case

of protected Web Servers, it start both the Appli-

cation VM and then the ITmOS VM, writing in

the proxy conﬁguration ﬁle the right conﬁgura-

tion informations. In both cases returns an array

of IP and port pairs, but in the ﬁrst case the array

contains just one value, in the second case two of

them.

INTRUSIONTOLERANCEASASERVICE-ASLA-basedSolution

381

7 OFFERING INTRUSION

TOLERANCE TROUGH SLA

The above proposed application enables to negoti-

ate with application users a SLA, described in WS-

Agreement, in order to adapt the requests for deliver-

ing VMs from the underlying IaaS Provider. In this

section, we focus on how such requests can be de-

scribed in Ws-Agreement and how to enforce the SLA

in a VM request invocation.

It is important to point out that a SLA implies that

the offering are granted, and if not respected some

penalties are applied to the peer of the agreement.

Following the above approach, we need to understand

what the proposed IT system is able to grant. More-

over, the solution agreed should be veriﬁable from

the Final User, in order to check the effective respect

of the SLA. An additional consideration is important

now, the adoption of SLA in offering services as the

side effect of imposing to Cloud Provider to clear

identify the advantages offered in a measured way.

Identifying the real grants offered in the context

of security is a very hard task, being at the state of

art very few available solutions able to quantitatively

measure the security level of a system in an incon-

testable way.

The approach we propose to such a problem is

pretty simple and can be easily moved to many dif-

ferent contexts: we identify the set of security threats

we are able to face and try to model with quantita-

tive parameters such threats. As an example, we can

model a ﬂood attack as a possible threat and model it

in terms of the number of ﬂooding messages received

by the system. Our SLA will be built starting from

the list of all the threats we are able to face and the

level will be based on the quantitative parameters we

have identiﬁed to model the security threat. Such an

approach can be adopted in each case, in which secu-

rity threats can be modeled in terms of an attack, and

it is possible to build up a quantitative model of such

an attack.

For simplicity’s sake, in the following we will fo-

cus on a single attack against which our IT system

work, in order to clarify the approach with a simple

example. Being our proxy able to face a larger set of

attacks, the real SLA is much more complex than the

one proposed here.

The attack we focus, as described in Section 4 is

an XDoS attack based on tag nesting. This attack

founds on the simple idea that XML validators will

be heavily CPU intensive when they have to check a

(valid) XML ﬁle with a very high number of nested

tags.

When an attack takes place the CPU consump-

tion increases even if only few malicious messages

are processed by the Web Server. Our solution de-

tects the attack using the following set of information:

(< MeanNumbero f tags >, < MeanCPUUsage >, <

TimeRange >). The detection takes place on inter-

val of time of duration < TimeRange >, in such time

interval we evaluate if both mean CPU consumption

and mean number of tags are over ﬁxed thresholds.

Such a model to detect the attack, that we call Sim-

pleThreshold, can be model by using the following

simple parameters:

• CPU Threshold

• TAG Threshold

• Time Range

Our IT model is able to grant that the Web

Server is protected against an Deeply-Nested XML

DoS attacks, detectable with a SimpleThreshold

technique with parameters <CPU Threshold>, <TAG

Threshold>, <Time Range>. Our solution grants

that if such an attack takes place, there will be no

additional CPU usage on the Web Server. The user

knows exactly the conditions under which its own

Web Server is protected and he is able to adapt the

IT Proxy parameters.

It is important to point out that, such an SLA is

correct, but very hard to manage for ﬁnal users, on

the other side target users are Web Server adminis-

trator with great experience. In future work, we will

offer tools that helps in managing such information in

a more easy way, using semantic technologies.

In order to offer the SLA in a formal way, we

translate such informations in a WS-Agreement tem-

plate, that can be ﬁlled by users in order to negotiate

the parameters. We deﬁned a simple schema for man-

agement of our security tags, which enable to list the

attacks against which the system is protected. The

code listed in 1 shown the example of guarantee term

for the Web Server’s service (that is described in an

OCCI compliant way).

Listing 1: XMLDos Attack WS-Agreement.

< ws : Ser v ic e Des c rip t ion T er m ws:Na m e

=" WEB S E R V E R R EQ U ES T "

ws : S erv i c eNa m e =" SET V A RI AB LE " >

< Co mp ut e >

< arc h i tec t u re > x86 </

arc h itec t u re >

< cpuCo r e s >4 </ cpu C o r es >

[...]

< ti t l e > WebSe r v er </

title >

</ Com p u t e >

</ w s :S e rvi c eDe s cr i p ti o nTe r m >

[...]

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

382

< ws a g :G u a ran t eeT e rm w s a g :Name = " ITS

" w s ag: S erv i c eSc o pe = " W EB

SERVER REQ U E S T " Ob l i gate d : "

Prov i d e r " >

< ws a g: S e rv i ceL e ve l O bj e cti v es >

< wsa g :KP I T arg e t >

< wsa g : KPI N a me > XML Do S < /

wsa g :KPI N a me >

< wsa g : Targ e t >

< its a g :At t a ck na m e = " Nested

TA G " / >

< it a s g:D e t ect i on nam e = "

Si m p leT h resh o ld " >

< it s a g:P a r ame t er nam e = "

CPU T hres h o ld " v a lue = "

90 " unit =" p e rcen t a ge "

/ >

< it s a g:P a r ame t er nam e = "

TAG T hres h o ld " v a lue = "

20 " unit =" n u m b e r "/ >

< it s a g:P a r ame t er nam e = "

Tim e R a nge " v a lue = "5 "

uni t = " minu t e s "/ >

</ i t asg: D ete c t ion >

< its a g:R e a cti o n t i m e = " 1 20 "

uni t = " minu t e s "/ >

< it s a g:D e scr i pti o n lin k = "

http: // www . mosa ic - cloud

. eu / ITS / Att a c k s /

Nes t e d Tag " / >

</ w s a g:Ta r g et >

</ w s ag:K P ITar g et >

</ w s ag : Ser v ice L ev e lOb j ec t i ve s >

</ w s ag: G uar a nte e T erm >

Our solution enables description of attacks, fol-

lowing the proposed approach just in terms of few pa-

rameters:

• Attack just contains the name of the attack.

• Description has several attributes, including a link

to a full, the used language, the description of the

attack and of the possible (supported) detection

systems.

• Detection has an attribute that identiﬁes the sup-

ported detection method and contains the list

of parameters needed to evaluate the detection

model.

• Reaction has an attribute, the time needed to react,

which means that the system may have some side

effects of the attacks for that interval of time.

Such a guarantee term grants the user that each at-

tack listed as KPITarget and detectable with the listed

detection methods will not affect the performances of

the target system. It is up to the IaaS Provider iden-

tify the penalties to be paid in the case in which the

condition is not respected.

8 RELATED WORK

To the best of our knowledge not much work has been

done in the area of conﬁguring security requirements

speciﬁed through WS-Agreement documents. Kar-

joth et al. (Karjoth et al., 2006) introduce the con-

cept of Service-Oriented Assurance (SOAS). SOAS

is a new paradigm deﬁning security as an integral

part of service-oriented architectures. It provides a

framework in which services articulate their offered

security assurances as well as assess the security of

their sub-services. Products and services with well-

speciﬁed and veriﬁable assurances provide guarantees

about their security properties. SOAS enables dis-

covery of sub-services with the right level of secu-

rity. SOAS adds security providing assurances (an

assurance is a statement about the properties of a

component or service) as part of the SLA negotia-

tion process. Smith et al. (Smith et al., 2007) present

a WS-Agreement approach for a ﬁne grained secu-

rity conﬁguration mechanism to allow an optimiza-

tion of application performance based on speciﬁc se-

curity requirements. They present an approach to op-

timize Grid application performance by tuning ser-

vice and job security settings based on user supplied

WS-Agreement speciﬁcation. WS-Agreement de-

scribes security requirements and capabilities in addi-

tion to the traditional WS-Negotiation attributes such

as computational needs, quality-of-service (QoS) and

pricing. Brandic et al. (Brandic et al., 2008) present

advanced QoS methods for meta-negotiations and

SLA-mappings in Grid workﬂows. They approach the

gap between existing QoS methods and Grid work-

ﬂows by proposing an architecture for Grid workﬂow

management with components for meta-negotiations

and SLA-mappings. Meta-negotiations are deﬁned

by means of a document, where each participant

may express, for example, the pre-requisites to be

satisﬁed for a negotiation, the supported negotiation

protocols and document languages for the speciﬁ-

cation of SLAs. In the pre-requisites there is the

element ¡security¿ that speciﬁes the authentication

and authorization mechanisms that the party wants

to apply before starting the negotiation. With SLA-

mappings, they eliminate semantic inconsistencies

between consumer’s and provider’s SLA template.

They present an architecture for the management of

meta-negotiation documents and SLA-mappings and

incorporate that architecture into a Grid workﬂow

management tool.

INTRUSIONTOLERANCEASASERVICE-ASLA-basedSolution

383

9 CONCLUSIONS AND FUTURE

WORKS

In this paper, we have shown how it is possible, us-

ing a Cloud-oriented API derived from the mOSAIC

project, to build up a SLA-oriented Cloud applica-

tion, which enables the management of security fea-

tures related to Intrusion Tolerance against XMl De-

nial of Services attacks to an Infrastructure as a Ser-

vice (IaaS) Cloud Provider. The application that en-

ables SLA management is built in order to receive a

WS-Agreement ﬁle containing a description of the se-

curity features. We proposed a simple schema for de-

scription of the guarantees offered by the system to

the users against DoS attacks. Once the user has ob-

tained an agreement with the SLA management sys-

tem, his requests will be fulﬁlled following the re-

quired SLA and the services will be transparently en-

riched with security features. In our case study, we

enrich a Web Server with an Intrusion Tolerance sys-

tem that grants against a well deﬁned set of attacks.

This work is one of the steps we are doing in the direc-

tion of offering security features in terms of Service

Level Agreement, trough the adoption of the mO-

SAIC SLA architecture. In future steps, we will en-

rich the set of attacks our solutions will be able to face

and try to offer tools to help users to automatically

setup a detailed SLA ﬁlled for his own needs.

ACKNOWLEDGEMENTS

This research is partially supported by FP7-ICT-2009-

5-256910 (mOSAIC) and by MIUR funded project

“Cloud@Home: A new enahnced paradigm”

REFERENCES

Ganglia, a scalable distributed monitoring system for high-

performance computing systems.

Squid: an open source fully-featured http/1.0 proxy.

Tpc benchmark w (tpc-w), a transactional web benchmark.

A. K. Caglayan, P. R. L. and Eckhardt, D. E. (1989). A the-

oretical investigation of generalized voters for redun-

dant system. In The Nineteenth International Sympo-

sium on Fault-Tolerant Computing, pages 444–451.

Brandic, I., Music, D., Dustdar, S., Venugopal, S., and

Buyya, R. (2008). Advanced qos methods for

grid workﬂows based on meta-negotiations and sla-

mappings. 2008 Third Workshop on Workﬂows in Sup-

port of LargeScale Science.

D. Heimbigner., J. K. and Wolf, A. (2002). The willow

architecture: Comprehensive survivability for large-

scale distributed applications. In The Intrusion Toler-

ant System Workshop, pages 71–78.

Ficco, M. (2010). Achieving security by intrusion-tolerance

based on event correlation. International Journal of

Network Protocols and Algorithms, 2, num. 3:70–84.

Ficco, M. and Rak, M. (2011). Intrusion tolerant approach

for denial of service attacks to web services. In The

1st International Conference on Data Compression,

Communications and Processing (CCP 2011), pages

285–292.

Karjoth, G., Pﬁtzmann, B., Schunter, M., and Waidner, M.

(2006). Service-oriented assurance, comprehensive

security by explicit assurances. In Gollmann, D., Mas-

sacci, F., and Yautsiukhin, A., editors, Quality of Pro-

tection, volume 23 of Advances in Information Secu-

rity, pages 13–24. Springer US.

Marsh, M. A. and Schneider, F. B. (2004). Codex: A ro-

bust and secure secret distribution system. In IEEE

Trans. on Dependable and Secure Computing, vol-

ume 1, pages 34–47.

Massimiliano Rak, Salvatore Venticinque, R. A. B. D. M.

(2011). User centric service level management in mo-

saic application. In Press, I., editor, Europar 2011

Workshop.

mOSAIC Project (2010). mosaic: Open source api and

platform for multiple clouds. http://www.mosaic-

cloud.eu.

N. F. Neves, P. S. and Verissimo, P. (2006). Proactive re-

silience through architectural hybridization. In The

ACM Symp. on AppliedComputing (SAC’06).

P. Kouznetsov, A. H. and Druschel, P. (2006). The case for

byzantine fault detection. In The 2nd Workshop on

Hot Topics in System Dependability.

R. Mista, D. Bakken C., D. A. and Medidi, M. (2002). Mr-

fusion: A programmable data fusion middleware sub-

system with a tunable statistical proﬁling service. In

The Int. Conference on Dependable Systems and Net-

work (DSN-2002), pages 273–278.

Rak, M., Liccardo, L., and Aversa, R. (2011). A sla-based

interface for security management in cloud and grid

integrations. In Abraham, A. et al., editors, Proceed-

ings of the 2011 7th International Conference on In-

formation Assurance and Security (IAS). IEEE Press.

Smith, M., Schmidt, M., Fallenbeck, N., Schridde, C., and

Freisleben, B. (2007). Optimising Security Conﬁgura-

tions with Service Level Agreements. In Proceedings

of the 7th International Conference on Optimization:

Techniques and Applications (ICOTA 2007), pages

367–381. IEEE Press.

van Sinderen F. Leymann, I. I. M., – Science, B. S. S., and

Publications, T., editors (2011). Towards a cross plat-

form Cloud API. Components for Cloud Federation.

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

384