Improving Cloud Survivability through Dependency based Virtual Machine Placement

Min Li, Yulong Zhang, Kun Bai, Wanyu Zang, Meng Yu and Xubin He

Computer Science, Virginia Commonwealth University, Richmond, U.S.A.

IBM T.J. Watson Research Center, Cambridge, U.S.A.

Electrical and Computer Engineering, Virginia Commonwealth University, Richmond, U.S.A.

Keywords:

Cloud Computing, Virtual Machine Placement, Security, Survivability.

Abstract:

Cloud computing is becoming more and more popular in computing infrastructure and it also introduces

new security problems. For example, a physical server shared by many virtual machines can be taken over

by an attacker if the virtual machine monitor is compromised through one of the virtual machines. Thus,

collocating with vulnerable virtual machines, or “bad neighbours”, on the same physical server introduces

additional security risks. Moreover, the connections between virtual machines, such as the network connection

between a web server and its back end database server, are natural paths of attacks. Therefore, both virtual

machine placement and connections among virtual machines in the cloud have a great impact on the overall

security of the cloud. In this paper, we quantify the security risks of cloud environments based on virtual machine

vulnerabilities and placement schemes. Based on our security evaluation, we develop techniques to generate

virtual machine placement that can minimize the security risks considering the connections among virtual

machines. According to the experimental results, our approach can greatly improve the survivability of most

virtual machines and the whole cloud. The computing costs and deployment costs of our techniques are also

practical.

1 INTRODUCTION

Cloud computing is becoming predominant in com-

puting infrastructure since it provides the ﬂexibil-

ity and cost-effectiveness hardly found in traditional

computing platforms. The key technique of cloud

computing is resource sharing and dynamic resource

allocation of the cloud. In an Infrastructure as a Ser-

vice (IaaS) cloud like Amazon EC2, multiple virtual

machines (VMs) share a physical server. Thus, the security of VMs depends not only on the security of the operating systems and applications they run, but also on the security of the Virtual Machine Monitor (VMM, or hypervisor) running below the VMs.

There are many attacks developed against cloud

environments. In this paper, we are interested in two

types of attacks since they are related to how VMs

are placed in a cloud. Type I attacks, such as (CVE-

2007-4993, 2007; CVE-2007-5497, 2007), exploit

the vulnerabilities of hypervisors, e.g., Xen and KVM. Once they succeed, attackers can compromise the physical server running the hypervisor and all VMs running above the hypervisor. Alternatively, Type II attacks compromise other VMs on the same physical server by mounting side channel attacks (Ristenpart et al., 2009; Hlavacs et al., 2011), instead of compromising the hypervisor.

Figure 1: The State Transition Graph of Attacks. States: one VM compromised on a new server; hypervisor compromised; dependency information collected; new target server selected. Transitions: traditional attack; compromise hypervisor (Type I attack); side channel attack (Type II attack); analyze other VMs above the hypervisor; select a target server through dependencies.

As shown in Figure 1, attackers can utilize both types of attacks to compromise as many VMs as possible in the cloud. In the first step, the attacker can compromise one guest VM (Dom U) or the management VM (Dom 0) through vulnerabilities in the operating system or services. Subsequently, the attacker can launch side channel attacks to collect information about other VMs on the same physical server. The information can also be collected using Type I attacks, as shown in Figure 1. Because network connections, e.g., from a web server to a back end database server, may leak security information such as authentication information on the database server, the attacks can cause cascading effects, or domino effects, in the cloud. Also, the connections among VMs become paths of attacks.

Li M., Zhang Y., Bai K., Zang W., Yu M. and He X.
Improving Cloud Survivability through Dependency based Virtual Machine Placement.
DOI: 10.5220/0004076003210326
In Proceedings of the International Conference on Security and Cryptography (SECRYPT-2012), pages 321-326
ISBN: 978-989-8565-24-2
© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

For example, the attack “Hey, you, get off of my

cloud” (HYG) (Ristenpart et al., 2009) is one kind of

type II attack. The initial stage of the HYG attack is

to locate the target VM. Once success is achieved, the

attacker will try to launch a VM on the same physical

server. It is a placement based attack and the success

of the attack depends on the placement strategies of

the cloud, or the conﬁguration policy of the cloud.

In this paper, we present an approach of virtual

machine placement based on the security evaluation

of VMs and the dependencies among them. Our tech-

nique not only increases the survivability of cloud but

also is compatible with performance requirements. In

security evaluation, we use Discrete Time Markov

Chain (DTMC) to analyze the possibility of being

compromised for each VM. In performance evalua-

tion, we calculate both migration cost and the dis-

tance of placement. Based on performance and se-

curity analysis, we generate a new VM placement.

To the best of our knowledge, this paper is the

ﬁrst effort to develop the following mechanisms and

techniques to enhance cloud security through chang-

ing cloud placement. The contributions of this paper

are summarised as below:

• We propose a systematic approach to evaluating

security risks of a cloud placement plan.

• Based on our security evaluation, we propose an

innovative and practical approach to generate a

safe placement plan.

• Our placement generation is compatible with per-

formance requirements since it is also based on

dependencies among VMs.

• The experimental results conﬁrm that our place-

ment plan can signiﬁcantly improve the surviv-

ability of VMs in cloud.

In Section 2, we review related work about VM

placement and indicate our unique contributions. Sec-

tion 3 deﬁnes the attack model and describes our goals

of placement. In Section 4, we explain the architec-

ture of our implementation, dependency based VM

placement strategy, and other implementation details.

In Section 5, we show how well our technique performs on real data.

2 RELATED WORK

A good VM placement plan can immensely improve

the performance. For example, (Sindelar et al., 2011)

demonstrated that memory share based placement can

save the memory resources of a server. In their pro-

posed placement scheme, VMs with the most shared

pages are collocated in the same physical server.

To improve efficiency, (Yusoh and Tang, 2010) developed a genetic algorithm to create a placement plan that reduces the Estimated Total Execution Time (ETET). The work of (Lucas Simarro et al., 2011) provided

a scheduling model to optimize virtual cluster place-

ment through cloud offers. The cloud prices and user

demand have been considered in the model. The ex-

perimental results on the real data show that dynamic

placement plan can bring more beneﬁts on reducing

users’ costs than the ﬁxed one.

Unfortunately, none of the above work considered

the security issue. Our previous work (Zhang et al., 2012) proposed periodically migrating VMs based on game theory, making it much harder for adversaries to locate the target VMs in terms of survivability measurement. However, our previous work did

not discuss how to evaluate the security of a cloud

placement and how to generate a placement plan to

improve the cloud security. This paper proposes an

innovative and effective placement strategy based on

dependency relations among VMs.

3 SYSTEM OVERVIEW

In this section, we describe our basic assumptions and

the goals of VM placement.

3.1 Characteristics

An example of virtual machine placement is shown

in Figure 2. The most important components are the Virtual Machine and the Node. Each virtual machine runs

different services and some of the VMs are depen-

dent on others. Node represents a physical machine

which runs a few to many VMs, given the limit of

hardware resource. In Figure 2, Node 1 has three

VMs while Node 2 holds four VMs. Besides, Cloud

Provider has necessary privileges to scan vulnerabil-

ities of VMs and obtain information of network con-

nections among VMs.

SECRYPT2012-InternationalConferenceonSecurityandCryptography


Figure 2: Cloud Placement Example.

In addition, we assume the following about an attacker.

1. The attacker can exploit the vulnerability of a hy-

pervisor or a VM.

2. The attacker follows the state transition graph in

Figure 1 to compromise VMs step by step.

3. The attacker always chooses the easiest target to

compromise in each step, in terms of the vulnera-

bilities in VMs and the attacker’s skills.

4. The attacker has no global view of the cloud at the

beginning of the attacks. However, the attacker

may acquire more knowledge after compromising

more nodes in the cloud.

Since the success of attacks highly depends on the placement strategy of the cloud, our purpose is to present a systematic solution which reduces the security risks of both Type I and Type II attacks while not sacrificing performance.

We can defeat Type I attacks because our mechanism changes the VM placement after a specific time. Hence, if an adversary plans to compromise a specified VM by compromising the hypervisor and the whole node, the VM can survive if it is migrated to another node before the attacker compromises the node. In addition, we can also resist Type II attacks because we try to place dependent VMs on the same node. In this situation, it is difficult for the adversary to compromise other nodes, so we increase the survivability of VMs on other nodes. In this work, we try to both reduce the number of compromised VMs and increase the survivability of services.

To verify the improvement in the survivability of services, we first define the survivability of a service. A service is accomplished by a set of connected VMs, which are defined as VMs that have data transmission between them. Next, we define the compromised possibility of a VM as the possibility of the VM being compromised at a specific associated attack step. Based on these definitions, we give a theorem on how to evaluate the survivability of a service.

Figure 3: Architecture. Blocks: Dependency Exploration; Security Evaluation; Exploitable Possibility; Markov Chain Analysis; Generate Placement Strategy; Performance Evaluation; Migration Cost; Administrator Preference.

Theorem 1 (Survivability of a Service). Given a service $S$ (VM chain) including some related VMs $= \{VM_1, VM_2, \ldots, VM_m\}$ and node set $N = \{N_1, N_2, \ldots, N_n\}$, if the survivability in a specific attack step for the nodes which hold the VMs belonging to $S$ is $\{PN_1, PN_2, \ldots, PN_n\}$, then the survivability ($PS$) of service $S$ is:

$$PS = \prod_{j=1}^{n} PN_j \qquad (1)$$

The question of how to evaluate a service thus becomes the question of how to evaluate a node: calculating the node survivability in Equation 1 is the critical problem. To solve it, we define the survivability of a node.

Theorem 2 (Survivability of a Node). Given a node $N$ and a set of VMs $= \{VM_1, VM_2, \ldots, VM_k\}$ located at node $N$, where the compromised probabilities of these VMs are $\{P_1, P_2, \ldots, P_k\}$, the survivability ($PN$) of node $N$ is:

$$PN = \prod_{j=1}^{k} (1 - P_j) \qquad (2)$$

Equation 2 follows because we assume that if an adversary takes over a VM, the physical node will very likely be compromised. In other words, the survivability of a physical node is the possibility that all VMs it hosts survive the attack. Similarly, the survivability of a service is the possibility that all VMs which constitute the service survive the attack. In addition, because we assume that if a node is compromised none of the VMs on it can work, the survivability of a VM is equal to the survivability of its physical node, which gives Equation 1.
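The two theorems can be illustrated with a minimal sketch. The function names and the probability values below are hypothetical and chosen only for illustration; the sketch assumes, as the theorems do, that compromise events are treated as independent.

```python
from math import prod

def node_survivability(compromise_probs):
    """Equation 2: a node survives only if every VM it hosts survives."""
    return prod(1.0 - p for p in compromise_probs)

def service_survivability(node_survivabilities):
    """Equation 1: a service survives only if every node holding its VMs survives."""
    return prod(node_survivabilities)

# Hypothetical compromise probabilities, for illustration only.
node_a = node_survivability([0.1, 0.2])   # (1 - 0.1) * (1 - 0.2)
node_b = node_survivability([0.5])        # (1 - 0.5)
service = service_survivability([node_a, node_b])
print(node_a, node_b, service)
```

Placing a service's VMs on fewer, safer nodes shortens the product in Equation 1, which is why co-locating dependent VMs can raise service survivability.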

4 IMPLEMENTATION

The structure of our implementation is shown in Fig-

ure 3. In order to improve the overall security level

ImprovingCloudSurvivabilitythroughDependencybasedVirtualMachinePlacement


while not sacrificing performance, our design includes three components: security evaluation, strategy generation, and performance evaluation. Periodically, the cloud provider changes the VM placement to defend against placement based attacks. First, a dependency exploration mechanism identifies the service dependency relations among VMs. In our design, we identify the dependency relations through network connections. Since the operating system maintains the information about network ports and IP addresses, we collect the network information through the operating system. Next, we score each VM's security level according to the National Vulnerability Database (NVD). Afterwards, we map the vulnerabilities to the possibility of compromise and leverage Discrete Time Markov Chain (DTMC) analysis to predict the possibility of successful attacks in each attack step. Finally, we design an algorithm to create the placement plan. Combined with migration cost and migration time analysis, this yields the final placement solution.

4.1 Security Evaluation

The Security Evaluation component consists of ex-

ploitable possibility assessment and Markov chain

analysis. First, we explore dependency relations

among all VMs and construct a graph based on the

dependencies. Second, we scan each VM against the National Vulnerability Database to generate an estimated value of the VM's security level. Third, we use a Markov chain to predict the possibility of being attacked for each VM.

4.1.1 Dependency Exploration

The dependency relations among VMs are the basis of security evaluation. There is already much research on how to discover dependency relations between VMs. For example, LWT (Apte et al., 2010) identifies cross-domain dependencies by CPU utilization. The authors claim that dependent VMs show the same spikes in CPU utilization. Hence, how to identify the dependencies of VMs is out of the scope of our work. In this paper, we simply use network topology information, such as the IP addresses and network port numbers reported by netstat, to identify dependency relations. After all dependency relations are obtained, we can construct VM Dependency Graphs like the one shown in Figure 2.
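As a rough illustration of this step, the sketch below builds an undirected dependency graph from netstat-style text. It assumes the column layout printed by `netstat -tn` on Linux (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State) and IPv4 addresses; the function names, the sample output and the IP addresses are hypothetical and are not taken from our implementation.

```python
def parse_connections(netstat_output):
    """Return (local_ip, remote_ip) pairs for established TCP connections."""
    edges = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, listening sockets and anything that is not TCP.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        local_ip = fields[3].rsplit(":", 1)[0]    # strip the port number
        remote_ip = fields[4].rsplit(":", 1)[0]
        edges.add((local_ip, remote_ip))
    return edges

def build_dependency_graph(edges, vm_ips):
    """Keep only connections whose two endpoints are both VMs in the cloud."""
    graph = {ip: set() for ip in vm_ips}
    for a, b in edges:
        if a in graph and b in graph and a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Hypothetical netstat output collected on the VM with address 10.0.0.1.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.1:45678          10.0.0.2:3306           ESTABLISHED
tcp        0      0 10.0.0.1:22             192.168.1.5:50000       ESTABLISHED
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
"""

vm_ips = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
graph = build_dependency_graph(parse_connections(SAMPLE), vm_ips)
print(graph)
```

Connections to addresses outside the set of known VMs are dropped here, since only dependencies between VMs in the cloud affect placement.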

Figure 4: An example based on Markov Chain Analysis. (a) Step One. (b) Step Two.

4.1.2 Exploitable Vulnerability

In order to quantify the exploitable vulnerability, we use the Common Vulnerability Scoring System (CVSS) (CVSS, 2012). CVSS includes three metric groups: base, temporal, and environmental. Each of the groups represents different characteristics of vulnerabilities.

Now we have the quantified vulnerability for each VM. We need to map the quantified vulnerability to the possibility of compromise for each connected VM. Here we use a linear mapping function. Given a VM $VM_i$, where the vulnerability scores of the VMs connected with $VM_i$ are $\{V_1, V_2, \ldots, V_n\}$, the possibility of compromise for $VM_j$, denoted by $P_j$, is given by Equation 3.

$$P_j = \frac{V_j}{\sum_{k=1}^{n} V_k} \qquad (3)$$

The linear mapping in Equation 3 is very simple; however, more complex mapping functions can be used and the following discussion will remain the same.
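A minimal sketch of such a linear mapping is given below. It assumes that the mapping normalizes each connected VM's vulnerability score by the sum of the scores of all connected VMs, so that the resulting values can serve as transition probabilities; the function name, the VM labels and the scores are hypothetical.

```python
def linear_mapping(scores):
    """Map vulnerability scores of connected VMs to compromise possibilities.

    Assumption: each score is divided by the sum of all scores, so the
    resulting possibilities add up to 1.
    """
    total = sum(scores.values())
    if total == 0:
        # No known vulnerabilities on any connected VM.
        return {vm: 0.0 for vm in scores}
    return {vm: score / total for vm, score in scores.items()}

# Hypothetical CVSS-style scores of two VMs connected to the same VM.
probs = linear_mapping({"vm_b": 7.5, "vm_c": 2.5})
print(probs)
```

A non-linear mapping could be swapped in by replacing the body of `linear_mapping`, as long as the outputs remain valid probabilities.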

4.1.3 Markov Chain Analysis

We run Markov Chain analysis over an Attack Dependency Graph, which is defined as follows.

Definition 1 (Attack Dependency Graph). Given the Attack Dependency Graph $G\langle V, E\rangle$ of an attack and two virtual machines $v_i$ and $v_j$, where $V$ is the set of virtual machines and $E$ is the set of transitions among attacks, if there exists an edge $e \in E$ from $v_i$ to $v_j$, then $v_j$ is attack dependent on $v_i$, denoted by $v_i \to v_j$. Besides, the possibility of transition is determined by Equation 3.

All virtual machines in the cloud and the attack dependencies among them can be represented by one or a set of attack dependency graph(s) (ADG). Besides, if we associate a probability with each edge, an ADG can

SECRYPT2012-InternationalConferenceonSecurityandCryptography

324

be modelled by a Discrete Time Markov Chain (Sahner et al., 1997). Given n nodes in the ADG, the initial probability distribution over the nodes (the probability that a particular node is compromised) is $\pi(0) = (1, \underbrace{0, 0, \ldots, 0}_{n-1})$. After the k-th step, the probability that the attacker can reach other nodes in the ADG can be calculated by Equation 4.

$$\pi(k) = \pi(0) P^{k} \qquad (4)$$

where $P$ is the state-transition probability matrix of the DTMC and $P^{k} = \underbrace{P \cdot P \cdots P}_{k}$. $P$ is given by $\{a_{ij}\}$, where $a_{ij}$ is the probability associated with edge $v_i \to v_j$.

The initial $P$ can be determined by the attacker's first choice. If the attacker can compromise $v_j$ from $v_i$ based on the compromise possibility scored by CVSS and mapped by Equation 3, then we assign $a_{ij}$ this score. In order to eliminate loops in the ADG, we remove the compromised node and its edges. Here, to simplify our work, we just convert the example (Figure 2) to ADG format and assume a probability for each transition from $v_i$ to each of its m successors. However, it is not hard to assign the initial $P$ using CVSS (CVSS, 2012).

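In Figure 2, we assume that the first compromised VM is the one corresponding to the second entry of the state vector, so $\pi(0) = (0, 1, 0, 0, 0, 0, 0)$. Hence we obtain the ADGs for step one and step two shown in Figure 4. Finally, under the above assumption, the attack possibilities for the distributions $\pi(0)$ to $\pi(5)$ are:

$$
\begin{pmatrix} \pi(0) \\ \pi(1) \\ \pi(2) \\ \pi(3) \\ \pi(4) \\ \pi(5) \end{pmatrix}
=
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.2 & 0 & 0 & 0.8 & 0 \\
0 & 0.26 & 0 & 0.56 & 0 & 0 & 0.18 \\
0.56 & 0 & 0.26 & 0 & 0 & 0.18 & 0 \\
0 & 0.026 & 0 & 0 & 0 & 0 & 0.414 \\
0 & 0 & 0.026 & 0 & 0 & 0.414 & 0
\end{pmatrix}
\qquad (5)
$$

where $\pi(k)$, $0 \le k \le 5$, is the probability distribution in step k. In the above example, according to $\pi(4)$, in step 4 the probability that the VM in the second position is compromised is 2.6% and the probability that the VM in the seventh position is compromised is 41.4%.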
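The step-by-step propagation of Equation 4 can be sketched as repeated vector-matrix multiplication. The three-VM transition matrix below is hypothetical and much smaller than the example above; it does not reproduce the numbers of Equation 5, and the removal of compromised nodes is not modelled.

```python
def step(dist, P):
    """One DTMC step: multiply the row vector dist by the transition matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def distributions(pi0, P, steps):
    """Return [pi(0), pi(1), ..., pi(steps)], where pi(k) = pi(0) * P^k."""
    result = [list(pi0)]
    for _ in range(steps):
        result.append(step(result[-1], P))
    return result

# Hypothetical ADG with three VMs: VM 0 can reach VM 1 (0.2) or VM 2 (0.8),
# VM 1 can only reach VM 2, and VM 2 has no outgoing edge (absorbing).
P = [
    [0.0, 0.2, 0.8],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]

pis = distributions([1.0, 0.0, 0.0], P, 2)
for k, pi_k in enumerate(pis):
    print(k, pi_k)
```

Each row of the result corresponds to one row of a table such as Equation 5, so the entries of `pis[k]` give the probability that the attacker is at each VM after k steps.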

4.2 Placement Generation

From the above discussion, we have obtained the possibility of being attacked for each VM. We therefore design an algorithm to generate placement plans. The core of the algorithm is to separate VMs with high risks from VMs with low risks. The algorithm also tries to assign connected VMs to the same node.

In the first part of the algorithm, we separate high-risk VMs from the others. The VMs with high risks, called dangerous VMs, are identified by DTMC analysis. In DTMC analysis, if the probability of a VM being compromised at a specific step is larger than zero, the VM has a high security risk in this step. A VM compromised in earlier steps is considered to be at a higher risk level than those compromised in later steps. To satisfy our goal functions, we sort the VM set in descending order of attack possibility. Then, we assign each node one VM, starting from the first one in the VM set. Next, we choose the node with minimal attack possibility to hold the rest of the VMs. If the current node is full, we choose the next node, until the VM set is empty. To minimize the number of migrated VMs, our placement plan takes the previous plan into account. For example, if a VM resided on a physical node without dangerous VMs in the preceding placement and, in the new placement plan, is again assigned to a node without dangerous VMs, we do not migrate it, because migrating such a VM would not increase its security but would add migration overhead.
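One possible reading of the steps above is sketched below. It is a simplified illustration, not our exact algorithm: it covers only the risk-based separation, ignores both dependency co-location and the previous placement, and assumes every node has capacity of at least one. The function name, VM labels, risk values and capacities are hypothetical.

```python
def generate_placement(risk, capacities):
    """Greedy sketch: spread the riskiest VMs, then fill the safest nodes first.

    risk       -- dict mapping VM name to its attack possibility
    capacities -- list with the maximum number of VMs per node (each >= 1)
    """
    if len(risk) > sum(capacities):
        raise ValueError("not enough capacity for all VMs")
    # Sort VMs in descending order of attack possibility.
    ordered = sorted(risk, key=lambda vm: risk[vm], reverse=True)
    nodes = [[] for _ in capacities]
    # Give each node one VM from the front of the list, so that the most
    # dangerous VMs end up on different nodes.
    seeds, rest = ordered[:len(nodes)], ordered[len(nodes):]
    for node, vm in zip(nodes, seeds):
        node.append(vm)
    # Visit nodes from lowest to highest seed risk when placing the rest.
    by_risk = sorted(range(len(nodes)),
                     key=lambda i: risk[nodes[i][0]] if nodes[i] else 0.0)
    for vm in rest:
        for i in by_risk:
            if len(nodes[i]) < capacities[i]:
                nodes[i].append(vm)
                break
    return nodes

# Hypothetical attack possibilities and node capacities.
risk = {"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.0, "e": 0.0}
plan = generate_placement(risk, [2, 3])
print(plan)
```

In this example the riskiest VM `a` is isolated as far as capacity allows, while the low-risk VMs are packed onto the node whose seed VM is least likely to be attacked.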

4.3 Performance Evaluation

In this section, we discuss the overhead of our place-

ment mechanism. We will look at the following types

of costs: the computing costs of placement, the migration time of a VM, and the total number of migrations needed to achieve a new placement, or migration path.

4.3.1 Computing Costs

The computing costs include the costs of DTMC and the algorithm. The algorithm complexity is polynomial. The cost of DTMC, denoted by $C_{DTMC}$, is defined as follows.

Definition 2 (Cost of DTMC). Given a series of K attack steps, where the cost for step i is $PDTMC_i$, the performance cost for this series of attack steps is

$$C_{DTMC} = \sum_{i=1}^{K} PDTMC_i \qquad (6)$$

According to our experiments, the cost $PDTMC_i$ of the matrix multiplication for each attack step i in DTMC has polynomial time complexity in i. Our DTMC experiment shows that the calculation cost for a cloud with 2048 VMs over 7 steps is $C_{DTMC} = \sum_{i=1}^{7} PDTMC_i = 38.36$ s. In general, for a cloud with N VMs over M steps, the total calculation cost for DTMC is $\sum_{i=1}^{M} PDTMC_i$.

4.3.2 Migration Time and Migration Path

When migrating a VM, the VM is usually shut off first. Hence, migration time is one of the most significant factors we should consider in order to improve the system performance. Figure 5 presents the migration delay of a Web server on our platform.

Figure 5: Migration impact on response delay of web server (x-axis: elapsed time; y-axis: delay in ms). As illustrated in the graph, the web service downtime due to migration is 967ms.

Figure 6: Comparison of survivability (x-axis: service #; y-axis: survivability possibility; series: Random Placement, New Placement).

The cost of the migration path is incurred when a new placement plan is deployed. The cost may differ if the VMs are migrated in a different order, because an incoming VM must wait until the target node has enough space. In addition, we should choose suspended VMs to migrate first because migrating suspended VMs does not cause performance loss. Therefore, we should try to choose a migration path with minimum cost. Calculation of the optimal migration path is out of the scope of this paper due to limited space.
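Although the optimal migration path is not computed here, the two ordering constraints just described can be illustrated with a simple greedy sketch. It is not an optimal algorithm and not part of our implementation: it merely prefers suspended VMs and only performs a migration once the target node has a free slot. The function name, the tuple layout and the node names are hypothetical.

```python
def order_migrations(moves, free_slots):
    """Greedy ordering of migrations.

    moves      -- list of (vm, source_node, target_node, suspended) tuples
    free_slots -- dict mapping node name to its number of free VM slots
    """
    free = dict(free_slots)
    # Suspended VMs come first: `not suspended` is False for them, and
    # False sorts before True.
    pending = sorted(moves, key=lambda m: not m[3])
    ordered = []
    while pending:
        for move in pending:
            vm, src, dst, _ = move
            if free.get(dst, 0) > 0:
                free[dst] -= 1                      # VM occupies a slot on the target
                free[src] = free.get(src, 0) + 1    # and releases one on the source
                ordered.append(vm)
                pending.remove(move)
                break
        else:
            # No remaining VM can move: the targets are mutually full.
            raise RuntimeError("remaining migrations are deadlocked")
    return ordered

# Hypothetical swap of two VMs between two nodes; only n1 has a free slot.
moves = [("a", "n1", "n2", False), ("b", "n2", "n1", True)]
order = order_migrations(moves, {"n1": 1, "n2": 0})
print(order)
```

Here `b` must move first because its target has room, which in turn frees the slot that `a` is waiting for.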

5 EXPERIMENTAL RESULTS

We apply our placement algorithm to a data set to generate placement plans. The data set includes 81 VMs on 10 nodes. The capacities of the 10 nodes are 20, 15, 10, 10, 10, 5, 5, 5, 5, 5. Based on the data set, we generated a random placement plan and optimized the placement using our algorithm. We compared the new placement plan with the random one to investigate the improvement of security levels.

According to our experimental results shown in Figure 6, 91.3% of services obtained improved survivability. The maximum survivability enhancement is 74.28% and the average improvement of survivability possibility is 27.15%. Our results also show a reduced number of compromised VMs.

ACKNOWLEDGEMENTS

This work was supported in part by NSF Grants CNS-

1100221 and CNS-0905153.

REFERENCES

Apte, R., Hu, L., Schwan, K., and Ghosh, A. (2010). Look

who’s talking: discovering dependencies between vir-

tual machines using cpu utilization. In Proceedings

of the 2nd USENIX conference on Hot topics in cloud

computing, HotCloud’10, pages 17–17, Berkeley, CA,

USA. USENIX Association.

CVE-2007-4993 (2007). Cve-2007-4993: Xen guest

root can escape to domain 0 through pygrub.

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-4993, 2007.

CVE-2007-5497 (2007). Cve-2007-5497: Vul-

nerability in xenserver could result in privi-

lege escalation and arbitrary code execution.

http://support.citrix.com/article/CTX118766, 2007.

CVSS (2012). Common vulnerability scoring system.

http://www.first.org/cvss/cvss-guide.

Hlavacs, H., Treutner, T., Gelas, J., Lefevre, L., and Orgerie,

A. (2011). Energy consumption side-channel attack

at virtual machines in a cloud. In Dependable, Au-

tonomic and Secure Computing (DASC), 2011 IEEE

Ninth International Conference on, pages 605 –612.

Lucas Simarro, J., Moreno-Vozmediano, R., Montero, R.,

and Llorente, I. (2011). Dynamic placement of virtual

machines for cost optimization in multi-cloud envi-

ronments. In High Performance Computing and Sim-

ulation (HPCS), 2011 International Conference on,

pages 1 –7.

Ristenpart, T., Tromer, E., Shacham, H., and Savage, S.

(2009). Hey, you, get off of my cloud: exploring infor-

mation leakage in third-party compute clouds. In Pro-

ceedings of the 16th ACM conference on Computer

and communications security, CCS ’09, pages 199–

212, New York, NY, USA. ACM.

Sahner, R., Trivedi, K., and Puliaﬁto, A. (1997). Perfor-

mance and reliability analysis of computer systems

(an example-based approach using the sharpe software). Reliability, IEEE Transactions on, 46(3):441.

Sindelar, M., Sitaraman, R. K., and Shenoy, P. (2011).

Sharing-aware algorithms for virtual machine colo-

cation. In Proceedings of the 23rd ACM symposium

on Parallelism in algorithms and architectures, SPAA

’11, pages 367–378, New York, NY, USA. ACM.

Yusoh, Z. and Tang, M. (2010). A penalty-based genetic al-

gorithm for the composite saas placement problem in

the cloud. In Evolutionary Computation (CEC), 2010

IEEE Congress on, pages 1 –8.

Zhang, Y., Li, M. L., Bai, K., Yu, M., Zang, W., and He, X.

(4-6 June 2012). Incentive compatible moving target

defense against vm-colocation attacks in clouds. In

IFIP International Information Security and Privacy

Conference 2012.

SECRYPT2012-InternationalConferenceonSecurityandCryptography

326