EVALUATING SURVIVABILITY AND COSTS OF THREE VIRTUAL MACHINE BASED SERVER ARCHITECTURES

Meng Yu, Alex Hai Wang, Wanyu Zang and Peng Liu

Western Illinois University, Macomb, IL, U.S.A.

Pennsylvania State University, University Park, PA, U.S.A.

Keywords:

Security modeling, Survivability, Security architecture, Software security, Data center.

Abstract:

Virtual machine based services are becoming predominant in data centers and cloud computing because virtual machines can provide strong isolation and better monitoring for security purposes. While there are many promising security techniques based on virtual machines, it is not clear how significant the difference between various system architectures can be in terms of survivability.

In this paper, we analyze the survivability of three virtual machine based architectures: load balancing architecture, isolated service architecture, and BFT architecture. For each architecture, we analyze both the survivability based on availability and the survivability under sustained attacks. Furthermore, the costs of the architectures are compared. The results show that even if the same set of commercial off-the-shelf (COTS) software is used, the service architectures differ greatly in how well they survive attacks. Our results can be used as guidelines in service architecture design when survivability to attacks is important.

1 INTRODUCTION

Virtual machine technology provides strong isolation and better monitoring capability at the virtual machine monitor level. Once an attack happens, it is possible but hard for the attacker to break into the virtual machine monitor to compromise other virtual machines or to avoid being monitored. Therefore, virtual machine technology is widely used in cloud computing and data centers as a basic building block of various service architectures.

Even with the same set of commercial off-the-shelf (COTS) software and the same virtual machine technology, service architectures can be very different. Accordingly, the security characteristics of each architecture will differ too. In this paper, we use three virtual machine based architectures as examples to evaluate such differences with regard to survivability and costs. While we evaluate three specific architectures, our evaluation techniques are general enough to be applied to other architectures.

There have been many techniques for evaluating attacks or defenses, such as attack graphs (Sawilla and Ou, 2008), attack trees (Mauw and Oostdijk, 2005), stochastic activity networks (Sanders et al., 2001), stochastic Petri nets (Marsan, 1990), reliability block diagrams (RBD) (Sahner et al., 1996b), queuing networks, and Continuous Time Markov Chains (CTMC) (Tijms, 1994; Sahner et al., 1996a). These techniques have been used to evaluate some server architectures, such as web service architectures (Gokhale et al., 2006) and data flow software architectures (Padilla et al., 2008), with regard to reliability, availability, and performance. Models that can be used to evaluate dependability and security have been summarized in (Nicol et al., 2004). However, the survivability of virtual machine based architectures has not been investigated yet, especially for architectures with COTS service components.

The major contributions of this paper include: 1) evaluating the impact of various architecture designs on both static and dynamic survivability; 2) studying several survivability related metrics in each architecture; and 3) comparing the costs of these architectures to see how much we need to pay for a specific survivability level. To the best of our knowledge, our work is the first to analyze the survivability and performance of virtual machine based architectures.

The paper is organized as follows. We review related work in Section 2. In Section 3, we describe the three virtual machine based architectures to be evaluated. We analyze the survivability statically in Section 4 and analyze the dynamic behaviors of each architecture in Section 5. In Section 6, we compare the costs of different architectures. We conclude the paper in Section 7.

Yu M., Hai Wang A., Zang W. and Liu P. (2010). EVALUATING SURVIVABILITY AND COSTS OF THREE VIRTUAL MACHINE BASED SERVER ARCHITECTURES. In Proceedings of the International Conference on Security and Cryptography, pages 478-485. DOI: 10.5220/0002994604780485. SciTePress.

2 RELATED WORK

Data centers using virtual machine (VM) consolidation are taking over the old computer rooms run by individual companies. The reasons include savings on space and energy bills. According to a report from the research firm Gartner, millions of virtual machines have been or are being deployed in data centers around the world, and virtualization is becoming a dominant, indispensable technology for IT departments.

Due to resource and service consolidation, data centers are becoming the “backbone” of the infrastructure for IT operations in companies, governments, and the military. Accordingly, two top requirements for modern data centers are business continuity and information security. Although these two requirements clearly show the importance of data center protection, from the security viewpoint, consolidating services and resources does not automatically consolidate the corresponding security mechanisms. Without security consolidation, the cost of protection can be much higher than it should be, and more importantly, blindly reusing the separate security equipment and tools associated with the services and resources being consolidated could even “create” new security holes. Unfortunately, security consolidation has been lagging behind service and resource consolidation in the data center industry.

Replicated systems can provide better fault tolerance, or intrusion tolerance when the attacks are unknown attacks that can be modeled as Byzantine faults (Castro, 2001; Schneider, 1990; Bernstein et al., 1987; Seguin et al., 1979; Jajodia and Mutchler, 1990; Alvisi et al., 2001; Malkhi and Reiter, 1998). While performance was a concern with these approaches, recent research has led to more practical performance, e.g., in Zyzzyva (Kotla et al., 2007), whose peak throughput is within 35% of that of an unreplicated server that simply replies to client requests over an authenticated channel. With replication, compromised nodes can be removed through a voting mechanism. Therefore, to disable a replicated system with n replicas, the attacker has to compromise more than a threshold number of replicas, usually more than $\lfloor (n-1)/3 \rfloor$. However, an attacker can easily compromise all replicas through a common vulnerability shared by all replicas, especially when all replicas run in homogeneous environments, e.g., the same operating systems or web servers.

Figure 1: Load Balanced Server Architecture (LBSA). [Diagram: a load balancer in front of n virtual machines (OS 1, OS 2, ..., OS i, ..., OS n) on a Virtual Machine Monitor (VMM); each virtual machine hosts a web server, an application server, and a DBMS.]

The basic idea of diverse replication using virtual machines has been described in (Chun et al., 2008). The combination of diversification and replication is capable of defeating unknown attacks at practical cost. Properly configured diverse replication will be immune to attacks based on a single vulnerability and can remove compromised nodes even if fewer than $\lfloor (n-1)/3 \rfloor$ nodes are compromised. However, the method has not been implemented or evaluated with respect to effectiveness or performance.

Current evaluation techniques for attacks or defenses, such as the aforementioned attack graphs (Sawilla and Ou, 2008), attack trees (Mauw and Oostdijk, 2005), queuing networks, and Continuous Time Markov Chains (CTMC) (Tijms, 1994; Sahner et al., 1996a), have been used to evaluate some server architectures. Models that can be used to evaluate dependability and security have been summarized in (Nicol et al., 2004). However, the survivability of virtual machine based architectures has not been investigated yet, especially for architectures with COTS service components.

3 THREE VIRTUAL MACHINE BASED ARCHITECTURES

In this section, we describe three virtual machine

based service architectures - Load Balanced Server

Architecture (LBSA), Isolated Component based

Server Architecture (ICSA), and Byzantine Fault Tol-

erant (Castro and Liskov, 1999) Server Architecture

(BFTSA).

The LBSA architecture is shown in Figure 1. This is the most popular architecture for providing high availability services. In this architecture, a set of services is installed on each virtual machine of a virtual machine cluster. Users' requests are directed to live virtual machines with load balancing. In a secure configuration, the virtual machine environments are diversified, which guarantees that a single instance of attack cannot compromise all virtual machines in the cluster. Moreover, even if one virtual machine is compromised, the load balancer can detect the inconsistency and either give up the compromised virtual machine or reboot it with a different diversified configuration. Since virtual machines provide good isolation, we assume that a single instance of attack cannot compromise all virtual machines of the cluster.

Figure 2: Isolated Component based Server Architecture (ICSA). [Diagram: a web server, an application server, and a DBMS, each in its own virtual machine on a Virtual Machine Monitor (VMM).]

Figure 3: Byzantine Fault Tolerant Server Architecture (BFTSA). [Diagram: a BFT protocol layer in front of n virtual machines (OS 1, OS 2, ..., OS i, ..., OS n) on a Virtual Machine Monitor (VMM); each virtual machine hosts a web server, an application server, and a DBMS.]

The ICSA architecture is shown in Figure 2. In this architecture, each service of a service architecture is installed on its own virtual machine. The number of virtual machines is determined by the number of services. In the figure, we have three services and thus three virtual machines. Once a service is compromised, the compromise is confined to its virtual machine, and the service can be recovered through a reboot. The virtual machine isolation mechanism reduces the number of reboots needed under attack.

The BFTSA architecture is shown in Figure 3. A Byzantine Fault Tolerant (BFT) algorithm (Castro and Liskov, 1999) is required for this architecture to handle users' requests. Once a user's request is received, the BFT algorithm goes through three phases (pre-prepare, prepare, and commit) to remove replies from nodes with arbitrary faults. With n servers, the BFT algorithm can tolerate $\lfloor (n-1)/3 \rfloor$ compromised servers.
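The tolerance threshold can be illustrated with a small sketch. It assumes the usual client-side rule of the protocol of Castro and Liskov (1999), in which a reply is accepted once f + 1 replicas agree on it, where f is the number of tolerated faulty replicas. The function names `max_faulty` and `accept_reply` are hypothetical and are not taken from any BFT library.

```python
from collections import Counter
from typing import List, Optional


def max_faulty(n: int) -> int:
    """Number of compromised replicas that n replicas can tolerate: floor((n-1)/3)."""
    return (n - 1) // 3


def accept_reply(replies: List[str], n: int) -> Optional[str]:
    """Return the reply backed by at least f + 1 replicas, or None if there is none.

    Assumption: replies are compared as plain strings; a real protocol would
    also authenticate each reply before counting it.
    """
    if not replies:
        return None
    f = max_faulty(n)
    value, votes = Counter(replies).most_common(1)[0]
    return value if votes >= f + 1 else None


if __name__ == "__main__":
    print(max_faulty(4), max_faulty(7), max_faulty(10))
    print(accept_reply(["ok", "ok", "bad", "ok"], n=4))
```

With n = 4, a single faulty reply is outvoted. With n = 7, two matching replies are not enough, because f + 1 = 3.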

Note that in all architectures, virtual machines

should be conﬁgured with different diversiﬁcation,

which ensures that no more than one virtual machine

can be compromised by one instance of attack.

4 EVALUATING SURVIVABILITY BASED ON STATIC ANALYSIS

4.1 Basic Assumptions

In this paper, we assume that the defense mechanisms used in the virtual machines are based on diversification (Chun et al., 2008). Thus, a single instance of attack cannot compromise more than one virtual machine in any architecture, no matter what technique the attacker uses, such as buffer overflow, stack overflow, format string, etc. However, each instance of attack has a positive probability of compromising a virtual machine, and thus any component inside the virtual machine. Furthermore, an attack can consist of a finite number of steps on the target, so the virtual machines in an architecture may be compromised one by one until all are compromised, but not in a single step.

We also assume that the defense mechanisms have

an automated recovery procedure included. Once a

virtual machine is detected as compromised, the re-

covery mechanism can recover the virtual machine

with a different diversiﬁcation through reboot or

micro-reboot.

Survivability is defined as follows.

Deﬁnition 1 (Survivability). We deﬁne the surviv-

ability of a system under a speciﬁc condition as the

probability that the system can meet the following two

requirements.

Availability. The system can provide replies to all re-

quests, and

Service Integrity. The replies meet the functional

speciﬁcation of the system and all service com-

ponents of the system are functional.

For example, a web service architecture needs three levels of servers: web server, application server, and DBMS server. When a system is under attack, as long as the system can provide replies (availability) and the replies are generated correctly by the three levels of servers (service integrity), we say that the system can survive the attack. Otherwise, the whole system is compromised.

4.2 Analytical Results

The LBSA architecture is good at providing availability. A system under this architecture will provide services unless all virtual machines have crashed. However, service integrity is another story. If any virtual machine is compromised by the attacker, the service integrity is compromised (note that we are discussing attacks instead of “fail-stop” failures), because the load balancer cannot tell a virtual machine demonstrating arbitrary faults from other normal virtual machines. Therefore, a single compromised virtual machine leads to a compromised cluster.

SECRYPT 2010 - International Conference on Security and Cryptography

Under the ICSA architecture, once a virtual machine is compromised, the installed service will be gone. Thus, both the availability and the service integrity are compromised. As a result, the whole system is compromised.

For both LBSA and ICSA architectures, assume that a system has n virtual machines. Note that for ICSA, n is usually the same as the number of services unless multiple services are installed on the same virtual machine. If the virtual machines are not diversified, the probability of breaking into the system through a single vulnerability, denoted by $P_s$, will be the same as the probability of breaking into a single replica, denoted by $P_r$, because the vulnerability is shared by all replicas. Thus, the survivability of the system is $\mathit{Surv} = 1 - P_s$. With disjoint diversification in a space S, where no attack can compromise more than one variation at a time and S contains all possible variations of the diversification, the survivability can be calculated by Equation 1, in which $P_a = n/|S|$ is the probability that a single intrusion attempt succeeds.

$$\mathit{Surv} = 1 - \sum_{i=1}^{m} \binom{m}{i} P_a^{\,i} (1 - P_a)^{m-i} = 1 - \sum_{i=1}^{m} \binom{m}{i} \left(\frac{n}{|S|}\right)^{i} \left(1 - \frac{n}{|S|}\right)^{m-i} \qquad (1)$$

where $n \le |S|$ (n nodes cannot have more than |S| variations), and $m \ge 1$ (at least one replica must be compromised) is the total number of intrusion attempts.
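As a quick check on Equation 1, the binomial sum can be collapsed into a closed form. The derivation below is a sketch that assumes, as in the ideal case of this section, that the m attempts are independent and that each succeeds with the same probability p = n/|S|. The numbers are the parameters of Figure 4, with n = 4 chosen for illustration.

```latex
\begin{align*}
\sum_{i=0}^{m} \binom{m}{i} p^{i} (1-p)^{m-i} &= 1, \qquad p = \frac{n}{|S|} \\
1 - \sum_{i=1}^{m} \binom{m}{i} p^{i} (1-p)^{m-i} &= (1-p)^{m} = \left(1 - \frac{n}{|S|}\right)^{m} \\
m = 5,\ |S| = 50,\ n = 4: \quad \left(1 - \tfrac{4}{50}\right)^{5} &= 0.92^{5} \approx 0.659
\end{align*}
```

In other words, under these assumptions LBSA and ICSA survive only if every one of the m attempts fails.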

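With k independent diversification approaches, e.g., diversified API, memory randomization, etc., the diversification spaces will be $S_1, S_2, \ldots, S_k$ respectively. Thus, for a sequence of m intrusion attempts, the survivability $\mathit{Surv}$ is

$$\mathit{Surv} = 1 - \sum_{i=1}^{m} \binom{m}{i} \left(\frac{n}{\prod_{j=1}^{k} |S_j|}\right)^{i} \left(1 - \frac{n}{\prod_{j=1}^{k} |S_j|}\right)^{m-i} \qquad (2)$$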

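Under the BFTSA architecture, we assume a BFT (Castro, 2001; Kotla et al., 2007) server group with n replicated virtual machines. Because the BFT protocol can tolerate $\lfloor (n-1)/3 \rfloor$ compromised virtual machines, a system will be compromised if more than $\lfloor (n-1)/3 \rfloor$ virtual machines in the cluster are compromised. Therefore, the survivability $\mathit{Surv}$ can be calculated by Equation 3, where $P_a = n/|S|$ is the probability that a single intrusion attempt succeeds.

$$\mathit{Surv} = 1 - \sum_{i=\lfloor (n-1)/3 \rfloor + 1}^{m} \binom{m}{i} P_a^{\,i} (1 - P_a)^{m-i} = 1 - \sum_{i=\lfloor (n-1)/3 \rfloor + 1}^{m} \binom{m}{i} \left(\frac{n}{|S|}\right)^{i} \left(1 - \frac{n}{|S|}\right)^{m-i} \qquad (3)$$

where $n \le |S|$ (n nodes cannot have more than |S| variations), and $m > \lfloor (n-1)/3 \rfloor$ (we need to compromise at least $\lfloor (n-1)/3 \rfloor + 1$ replicas) is the total number of intrusion attempts.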

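Similarly, with k independent diversification approaches, for a sequence of m intrusion attempts, the survivability $\mathit{Surv}$ is

$$\mathit{Surv} = 1 - \sum_{i=\lfloor (n-1)/3 \rfloor + 1}^{m} \binom{m}{i} \left(\frac{n}{\prod_{j=1}^{k} |S_j|}\right)^{i} \left(1 - \frac{n}{\prod_{j=1}^{k} |S_j|}\right)^{m-i} \qquad (4)$$

where $n \le \prod_{j=1}^{k} |S_j|$, and $m > \lfloor (n-1)/3 \rfloor$.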
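Equations 3 and 4 can be evaluated directly. The sketch below assumes the ideal case of independent attempts, with each attempt succeeding with probability n divided by the product of the diversification space sizes. The function name `bftsa_survivability` and its parameter names are illustrative only.

```python
from math import comb, prod
from typing import Sequence


def bftsa_survivability(n: int, space_sizes: Sequence[int], m: int) -> float:
    """Static survivability of a BFT group of n replicas after m intrusion attempts.

    space_sizes holds |S_1|, ..., |S_k|; a single space gives Equation 3,
    several spaces give Equation 4.
    """
    total = prod(space_sizes)
    if not 0 < n <= total:
        raise ValueError("need 0 < n <= product of the space sizes")
    p = n / total                      # success probability of one attempt
    f = (n - 1) // 3                   # tolerated compromised replicas
    # The system is lost when more than f of the m attempts succeed.
    lost = sum(comb(m, i) * p**i * (1 - p) ** (m - i) for i in range(f + 1, m + 1))
    return 1.0 - lost


if __name__ == "__main__":
    print(bftsa_survivability(4, [50], 5))
    print(bftsa_survivability(4, [10, 5], 5))
```

The equations require m > ⌊(n−1)/3⌋. For smaller m the sum is empty and the function returns 1, since too few attempts have been made to exceed the threshold.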

In the above discussion, we assumed the ideal situation where 1) the success of one intrusion attempt is independent of other attempts; 2) any combination of diversification techniques is valid; and 3) BFT replication is used. Note that randomization techniques make the probability of breaking into a system independent between attacks in a sequence. However, the following challenges may significantly change the form of Equation 4. a) Due to resource limits, full BFT replication may not be feasible in a heavily loaded system, in which case the threshold, $\lfloor (n-1)/3 \rfloor$, for breaking into the system will be different. b) The probability of success will become cumulative, conditional, or otherwise dependent in a sequence of attacks if the diversification is not randomized for each attack attempt. c) The characteristics of the attacks, e.g., the steps included in an attack, may make the probability of success conditional between attempts. d) A random combination of different diversification techniques does not automatically lead to valid and independent defenses, which will change $\prod_{j=1}^{k} |S_j|$ in Equation 4. For example, a simple combination of two different stack randomization techniques may lead to incorrect stack frames. Thus, many of the combinations in $\prod_{j=1}^{k} S_j$ will not be valid for deployment. As a result, our defense space will actually be smaller. Furthermore, when the space of a diversification technique is very small, multiple replicas may need to share the same variation, which further weakens the defense.

When all diversifications are valid, the comparison of survivability is shown in Figure 4. In the figure, BFTSA shows better survivability, and the survivability gets better as the total number of nodes increases.

Figure 4: Static analysis of survivability with m = 5 and |S| = 50. [Plot: survivability versus the number of virtual machines in the system (4 to 13), for LBSA and ICSA, and for BFTSA.]

An interesting observation is that the survivability of BFTSA is not monotonic, which is due to the characteristics of the BFT protocol (Castro and Liskov, 1999). The BFT protocol can tolerate $\lfloor (n-1)/3 \rfloor$ compromised virtual machines in a cluster with n virtual machines. In the example, when we have 7, 8, or 9 virtual machines in the cluster, the cluster can tolerate the same number, $\lfloor (n-1)/3 \rfloor = 2$, of compromised virtual machines. However, an increased number of virtual machines in the cluster also increases the probability of a successful attack in one attempt. Therefore, if increasing the number of virtual machines does not increase $\lfloor (n-1)/3 \rfloor$, the survivability decreases.
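The non-monotonic behavior can be reproduced numerically. The sketch below sweeps n from 4 to 13 with the parameters of Figure 4 (m = 5, |S| = 50), assuming independent attempts that each succeed with probability n/|S|. The names `survivability`, `lbsa_icsa`, and `bftsa` are illustrative.

```python
from math import comb


def survivability(n: int, space_size: int, m: int, tolerated: int) -> float:
    """Probability that at most `tolerated` of m independent attempts succeed."""
    p = n / space_size
    lost = sum(
        comb(m, i) * p**i * (1 - p) ** (m - i) for i in range(tolerated + 1, m + 1)
    )
    return 1.0 - lost


M, SPACE = 5, 50
# LBSA and ICSA tolerate no compromised virtual machine; BFTSA tolerates floor((n-1)/3).
lbsa_icsa = {n: survivability(n, SPACE, M, 0) for n in range(4, 14)}
bftsa = {n: survivability(n, SPACE, M, (n - 1) // 3) for n in range(4, 14)}

if __name__ == "__main__":
    for n in range(4, 14):
        print(n, round(lbsa_icsa[n], 4), round(bftsa[n], 4))
```

Within each group of n values that share the same threshold (for example n = 7, 8, 9), the BFTSA value drops as n grows, and it rises again when n reaches the next threshold.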

5 EVALUATING SURVIVABILITY UNDER SUSTAINED ATTACKS

In this section, we analyze the behaviors of each ar-

chitecture under sustained attacks.

5.1 State Transition Diagrams

The state transition diagram of LBSA and ICSA is shown in Figure 5. In the figure, the attacker compromises the virtual machines one by one with rate λ. The system can recover virtual machines, or services in a virtual machine (through reboot, micro-reboot, or other approaches), with rate µ. A state $G_i$, where 0 ≤ i ≤ n, indicates that the system has i virtual machines not compromised and n − i virtual machines compromised. In LBSA or ICSA, once a virtual machine is compromised, the service integrity is gone. Thus, the only normal state is $G_n$, where no virtual machine is compromised.

The state transition diagram of BFTSA is shown in Figure 6. In the figure, the states and state transition rates are the same as those of LBSA and ICSA. However, since BFTSA has a BFT protocol to eliminate compromised nodes of the cluster, it can tolerate $\lfloor (n-1)/3 \rfloor$ compromised virtual machines. Therefore, the normal states are $\{G_n, G_{n-1}, \ldots, G_{n-\lfloor (n-1)/3 \rfloor}\}$.

5.2 The Continuous Time Markov Chain (CTMC) Model

We assume that the compromises and recoveries in Figure 5 and Figure 6 occur as Poisson processes with rates λ and µ, respectively. Based on these assumptions and the above discussion, the state transitions of our model form a finite state Continuous-Time Markov Chain (CTMC) (Tijms, 1994; Sahner et al., 1996a) that can be characterized by a state transition matrix $Q = (q_{i,j})$ and an initial state probability vector π(0), where $q_{i,j}$ is the transition rate from state i to state j and $q_{i,i} = -\sum_{j \neq i} q_{i,j}$.

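The state transition matrix of Figure 5 and Figure 6, with rows and columns ordered as $G_0, G_1, \ldots, G_{n-1}, G_n$, is as follows.

$$Q = \begin{pmatrix} -\mu & \mu & 0 & \cdots & 0 & 0 \\ \lambda & -\lambda-\mu & \mu & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & -\lambda \end{pmatrix} \qquad (5)$$

The initial state of the system is $G_n$. Thus

$$\pi(0) = \underbrace{(0, 0, \ldots, 1)}_{n+1} \qquad (6)$$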
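Equations 5 and 6 translate directly into code. The following sketch builds Q as a list of rows indexed by the state subscript, so that row i corresponds to state G_i. The helper names `generator_matrix` and `initial_distribution` are illustrative only.

```python
from typing import List


def generator_matrix(n: int, lam: float, mu: float) -> List[List[float]]:
    """Generator matrix Q over the states G_0, ..., G_n (Equation 5)."""
    size = n + 1
    q = [[0.0] * size for _ in range(size)]
    for i in range(size):
        if i > 0:
            q[i][i - 1] = lam      # compromise: G_i -> G_{i-1}
        if i < n:
            q[i][i + 1] = mu       # recovery:   G_i -> G_{i+1}
        q[i][i] = -sum(q[i])       # diagonal makes each row sum to zero
    return q


def initial_distribution(n: int) -> List[float]:
    """pi(0): the system starts in G_n with no compromised virtual machine (Equation 6)."""
    return [0.0] * n + [1.0]


if __name__ == "__main__":
    for row in generator_matrix(3, 1.0, 4.0):
        print(row)
    print(initial_distribution(3))
```

Every row of a generator matrix sums to zero, which is a convenient sanity check for any hand-built Q.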

With the CTMC model, both the steady state and the transient states can be calculated. The steady state of a system is the state in which the state probabilities no longer change after the system has run for a long period of time. It may not exist at all for a specific CTMC. Fortunately, most real systems do have steady states. Once an n by n generator matrix Q is given, the steady-state probability vector π is determined by Equation 7.

$$\pi Q = 0, \qquad \sum_{1 \le i \le n} \pi_i = 1. \qquad (7)$$
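For the chain of Figures 5 and 6, Equation 7 can be solved by hand: the balance equations reduce to π_{i+1} λ = π_i µ, so the probabilities form a geometric sequence. The sketch below relies on that birth-death shortcut rather than a general linear solver, and then checks the result against πQ = 0. All function names are illustrative.

```python
from typing import List


def steady_state(n: int, lam: float, mu: float) -> List[float]:
    """Steady-state probabilities of G_0, ..., G_n via pi_{i+1} = (mu / lam) * pi_i."""
    weights = [(mu / lam) ** i for i in range(n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def steady_survivability(n: int, lam: float, mu: float, tolerated: int) -> float:
    """Probability mass of the normal states G_{n-tolerated}, ..., G_n."""
    return sum(steady_state(n, lam, mu)[n - tolerated:])


def balance_residual(pi: List[float], lam: float, mu: float) -> float:
    """Largest absolute entry of pi * Q for the tridiagonal Q of Equation 5."""
    n = len(pi) - 1
    worst = 0.0
    for j in range(n + 1):
        inflow = (pi[j - 1] * mu if j > 0 else 0.0) + (pi[j + 1] * lam if j < n else 0.0)
        outflow = pi[j] * ((lam if j > 0 else 0.0) + (mu if j < n else 0.0))
        worst = max(worst, abs(inflow - outflow))
    return worst


if __name__ == "__main__":
    n, lam, mu = 10, 1.0, 4.0
    print("LBSA/ICSA:", steady_survivability(n, lam, mu, 0))
    print("BFTSA:    ", steady_survivability(n, lam, mu, (n - 1) // 3))
```

For a chain without this birth-death structure, a general linear solve of Equation 7 would be needed instead.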

The comparison of steady states of different architectures is shown in Figure 7. In the figure, the x-axis shows the recovery rate µ, and the compromise rate λ has a fixed value of 1. As µ increases from 1 to 10, all architectures demonstrate increasing survivability. However, BFTSA shows much better survivability and is very sensitive to increases in µ. The survivability of BFTSA reaches values close to 1 much earlier than that of LBSA and ICSA; it is very close to 1 once µ is greater than 2.

The impact of the attack rate λ is shown in Figure 8. In the figure, we keep the recovery rate µ at a fixed value of 5. When we increase the attack rate λ from 1 to 20, the survivability of all architectures decreases. The BFTSA architecture responds better to increasing attack rates.

Figure 5: The state transition diagram of LBSA and ICSA. [State diagram with the states partitioned into Normal and Compromised.]

Figure 6: The state transition diagram of BFTSA. [State diagram with the states partitioned into Normal and Compromised.]

Figure 7: Comparison of steady state survivability while λ = 1 and n = 10. [Plot: survivability versus the recovery rate, for LBSA and ICSA, and for BFTSA.]

Figure 8: Comparison of steady state survivability while µ = 5 and n = 10. [Plot: survivability versus the attack rate (up to 20), for LBSA and ICSA, and for BFTSA.]

The CTMC model can also describe the transient behaviors of a system. A transient state of a system is the state of the system at a specific moment. The evaluation of transient states shows how quickly a system goes to its steady state and how much time is spent in each state. A system may satisfy us with its steady state but disappoint us with its transient behaviors, e.g., by taking too long to enter the steady state. Transient behaviors also tell us what may happen if a system suffers a short period of high attack rate.

Given a generator matrix Q and an initial state probability vector π(0), the transient state probability π(t) at time t is determined by Equation 8.

$$\frac{d\pi(t)}{dt} = \pi(t) Q \qquad (8)$$
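One way to evaluate Equation 8 numerically is uniformization, a standard method for CTMC transient analysis. It is used here only as an illustration, since the paper does not state which solver produced its figures. The sketch assumes the tridiagonal Q of Equation 5, and the function names are illustrative.

```python
import math
from typing import List


def generator_matrix(n: int, lam: float, mu: float) -> List[List[float]]:
    """Generator matrix over G_0, ..., G_n (compromise rate lam, recovery rate mu)."""
    q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        if i > 0:
            q[i][i - 1] = lam
        if i < n:
            q[i][i + 1] = mu
        q[i][i] = -sum(q[i])
    return q


def transient(pi0: List[float], q: List[List[float]], t: float,
              tol: float = 1e-12, max_terms: int = 10_000) -> List[float]:
    """pi(t) = sum_k Poisson(k; rate*t) * pi0 * P^k with P = I + Q / rate."""
    size = len(pi0)
    rate = max(-q[i][i] for i in range(size))          # uniformization rate
    p = [[(1.0 if i == j else 0.0) + q[i][j] / rate for j in range(size)]
         for i in range(size)]
    weight = math.exp(-rate * t)                        # Poisson weight for k = 0
    vec = list(pi0)
    result = [weight * v for v in vec]
    covered = weight                                    # Poisson mass summed so far
    for k in range(1, max_terms):
        if 1.0 - covered <= tol:
            break
        vec = [sum(vec[i] * p[i][j] for i in range(size)) for j in range(size)]
        weight *= rate * t / k
        result = [r + weight * v for r, v in zip(result, vec)]
        covered += weight
    return result


if __name__ == "__main__":
    q = generator_matrix(10, 1.0, 4.0)
    pi0 = [0.0] * 10 + [1.0]
    for t in (0.0, 1.0, 2.0, 4.0, 8.0):
        print(t, transient(pi0, q, t)[-1])   # probability of the normal state G_n
```

The last entry of the returned vector is the transient survivability of LBSA and ICSA. Summing the last ⌊(n−1)/3⌋ + 1 entries gives the corresponding value for BFTSA.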

In Figure 9, we show the transient survivability of LBSA and ICSA. In the figure, when time increases from 0 to 8, the survivability decreases quickly to around 0.75. In other words, the system enters the steady state very quickly.

Figure 9: The transient behaviors of LBSA and ICSA while λ = 1, µ = 4, and n = 10. [Plot: survivability versus time (0 to 8).]

The transient behavior of BFTSA is shown in Figure 10. According to the figure, while the steady state survivability is higher than that of LBSA and ICSA, the system enters the steady state more slowly.

Figure 10: The transient behaviors of BFTSA while λ = 1, µ = 4, and n = 10. The y-axis is survivability. [Plot: survivability versus time.]

Given the state transition matrix, when needed, the cumulative time l(t) spent in each state up to time t is given by Equation 9.

$$\frac{d\,l(t)}{dt} = l(t) Q + \pi(0) \qquad (9)$$

An example of the cumulative time spent in the “normal” state of the LBSA architecture is shown in Figure 11.

Figure 11: The accumulative time spent on the normal state of LBSA while λ = 1, µ = 5, and n = 10. [Plot: accumulative time at the normal states versus time (0 to 4).]
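Equation 9 is a linear ordinary differential equation with l(0) = 0, so it can be integrated with any standard scheme. The sketch below uses a fixed-step fourth-order Runge-Kutta method as an illustrative choice, not the solver used for Figure 11. The function names and the default step count are assumptions.

```python
from typing import List


def generator_matrix(n: int, lam: float, mu: float) -> List[List[float]]:
    """Generator matrix over G_0, ..., G_n (compromise rate lam, recovery rate mu)."""
    q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        if i > 0:
            q[i][i - 1] = lam
        if i < n:
            q[i][i + 1] = mu
        q[i][i] = -sum(q[i])
    return q


def cumulative_time(pi0: List[float], q: List[List[float]], t: float,
                    steps: int = 1000) -> List[float]:
    """Integrate dl/dt = l Q + pi(0) from l(0) = 0 up to time t with RK4."""
    size = len(pi0)

    def deriv(l: List[float]) -> List[float]:
        return [sum(l[i] * q[i][j] for i in range(size)) + pi0[j] for j in range(size)]

    h = t / steps
    l = [0.0] * size
    for _ in range(steps):
        k1 = deriv(l)
        k2 = deriv([x + 0.5 * h * k for x, k in zip(l, k1)])
        k3 = deriv([x + 0.5 * h * k for x, k in zip(l, k2)])
        k4 = deriv([x + h * k for x, k in zip(l, k3)])
        l = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(l, k1, k2, k3, k4)]
    return l


if __name__ == "__main__":
    q = generator_matrix(10, 1.0, 5.0)
    pi0 = [0.0] * 10 + [1.0]
    l = cumulative_time(pi0, q, 4.0)
    print("time in the normal state G_n up to t = 4:", l[-1])
```

Because the system is always in exactly one state, the entries of l(t) add up to t, which gives a simple consistency check on the integration.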

6 EVALUATING THE COSTS

In this section, we discuss the costs of each architec-

ture considering the processing costs, memory costs,

and communication costs.

Table 1: Evaluation metrics comparison of three architectures.

Metrics                      LBSA     ICSA     BFTSA
Processing costs             n        1        n
Memory costs                 n        1        n
Communication costs          O(1)     O(1)     $O(n^2)$
Intrusion tolerance          0        0        $\lfloor (n-1)/3 \rfloor$
Fail-safe fault tolerance    n−1      0        $\lfloor (n-1)/3 \rfloor$

Both LBSA and BFTSA need diversified replication. When we replicate the services in virtual machines, the processing and memory costs of replication will be n times those of ICSA.

When we consider the communication costs, for each request, ICSA does not duplicate and forward the request. Similarly, while the load balancer in LBSA forwards the request to a lightly loaded virtual machine, it does not duplicate the request to other virtual machines either. However, BFTSA requires a more complex communication protocol. The BFT protocol (Castro and Liskov, 1999) goes through three phases, and in each phase, one or all nodes in the cluster need to broadcast to the others. The communication complexity of BFT is $O(n^2)$ (Castro and Liskov, 1999).
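The quadratic growth can be made concrete by counting messages. The sketch below is a simplified count for one request in the normal case of the three-phase protocol. It assumes one primary and n − 1 backups, and it ignores client messages, batching, checkpoints, and view changes. The function name is hypothetical.

```python
def bft_messages_per_request(n: int) -> int:
    """Replica-to-replica messages for one request in the three phases."""
    if n < 1:
        raise ValueError("need at least one replica")
    pre_prepare = n - 1             # the primary sends to each backup
    prepare = (n - 1) * (n - 1)     # each backup sends to every other replica
    commit = n * (n - 1)            # every replica sends to every other replica
    return pre_prepare + prepare + commit


if __name__ == "__main__":
    for n in (4, 7, 10, 13):
        print(n, bft_messages_per_request(n))
```

Under these assumptions the total is 2n(n − 1) messages per request, compared with a constant number of forwards per request in LBSA and ICSA.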

We summarize the costs and the capability of in-

trusion tolerance or fault tolerance in Table 1. In the

table, intrusion tolerance indicates how many com-

promised nodes can be tolerated by a speciﬁc archi-

tecture. The fail-safe fault tolerance indicates how

many fail-safe nodes can be tolerated, where fail-safe

nodes do not demonstrate arbitrary behaviors as a

compromised node does under attacks.

7 CONCLUSIONS

In this paper, we compared the survivability and costs

of three virtual machine based architectures. Our

studies show that even with the same COTS software,

a different architecture can have signiﬁcant impact

on the survivability of the whole system. According

to our analytical results, a replicated architecture with the BFT protocol and diversification is much better than simple replication and isolation with regard to survivability. However, the cost of the BFT protocol is high.

The analytical methods described in this paper, such

as the static analysis and the dynamic CTMC based

analysis, can be used to analyze the survivability of

other architectures. The results of this paper can also

be used as guidelines in architecture design when sur-

vivability is crucial to the system.


ACKNOWLEDGEMENTS

We thank Dr. Yan Yang for discussions of some of the contents of the paper. We also thank the anonymous reviewers for their comments, which helped us greatly improve the quality of the paper. Meng Yu was supported by NSF grant CNS-0905153. Peng Liu was

supported by NSF CNS-0905131, AFOSR FA9550-

07-1-0527 (MURI), and ARO MURI: Computer-

aided Human Centric Cyber Situation Awareness.

REFERENCES

Alvisi, L., Malkhi, D., Pierce, E., and Reiter, M. K. (2001).

Fault detection for byzantine quorum systems. IEEE

Transactions on Parallel and Distributed Systems,

12(9):996–1007.

Bernstein, P. A., Hadzilacos, V., and Goodman, N. (1987).

Concurrency Control and Recovery in Database Sys-

tems. Addison-Wesley, Reading, MA.

Castro, M. (2001). Practical Byzantine Fault Tolerance.

PhD thesis, Department of Electrical Engineering and

Computer Science, Massachusetts Institute of Tech-

nology. Also as Technical Report MIT/LCS/TR-817.

Castro, M. and Liskov, B. (1999). Practical byzantine fault

tolerance. In The Third Symposium on Operating Sys-

tems Design and Implementation (OSDI ’99), pages

173–186, New Orleans, USA.

Chun, B.-G., Maniatis, P., and Shenker, S. (2008). Diverse

replication for single-machine byzantine-fault toler-

ance. In ATC’08: USENIX 2008 Annual Technical

Conference on Annual Technical Conference, pages

287–292, Berkeley, CA, USA. USENIX Association.

Gokhale, S. S., Vandal, P. J., and Lu, J. (2006). Performance and reliability analysis of web server software

architectures. In PRDC ’06: Proceedings of the 12th

Paciﬁc Rim International Symposium on Dependable

Computing, pages 351–358, Washington, DC, USA.

IEEE Computer Society.

Jajodia, S. and Mutchler, D. (1990). Dynamic voting algo-

rithms for maintaining the consistency of a replicated

database. ACM Trans. Database Syst., 15(2):230–280.

Kotla, R., Alvisi, L., Dahlin, M., Clement, A., and Wong,

E. (2007). Zyzzyva: speculative byzantine fault toler-

ance. SIGOPS Oper. Syst. Rev., 41(6):45–58.

Malkhi, D. and Reiter, M. (1998). Byzantine quorum sys-

tem. Distributed Computing, 11(4):203–213.

Marsan, M. A. (1990). Stochastic Petri nets: an elementary

introduction, pages 1–29. Springer-Verlag New York,

Inc., New York, NY, USA.

Mauw, S. and Oostdijk, M. (2005). Foundations of at-

tack trees. In International Conference on Information

Security and Cryptology ICISC 2005. LNCS 3935,

pages 186–198. Springer.

Nicol, D. M., Sanders, W. H., and Trivedi, K. S. (2004).

Model-based evaluation: From dependability to secu-

rity. IEEE Transactions on Dependable and Secure

Computing, 1(1):48–65.

Padilla, G., Gao, T., Yen, I.-L., Bastani, F., and de Oca,

C. M. (2008). An early reliability assessment model

for data-ﬂow software architectures. Mexican Inter-

national Conference on Computer Science, 0:9–19.

Sahner, R. A., Trivedi, K. S., and Puliaﬁto, A. (1996a). Per-

formance and Reliability Analysis of Computer Sys-

tems. Kluwer Academic Publishers, Norwell, Mas-

sachusetts, USA.

Sahner, R. A., Trivedi, K. S., and Puliaﬁto, A. (1996b).

Performance and reliability analysis of computer sys-

tems: an example-based approach using the SHARPE

software package. Kluwer Academic Publishers, Nor-

well, MA, USA.

Sanders, W. H. and Meyer, J. F. (2001). Stochastic activity networks: Formal definitions and concepts.

Sawilla, R. E. and Ou, X. (2008). Identifying critical at-

tack assets in dependency attack graphs. In ESORICS

’08: Proceedings of the 13th European Symposium on

Research in Computer Security, pages 18–34, Berlin,

Heidelberg. Springer-Verlag.

Schneider, F. B. (1990). Implementing fault tolerant ser-

vices using the state machine approach: A tutorial.

ACM Computing Surveys, 22(4).

Seguin, J., Sergeant, G., and Wilms, P. (1979). A major-

ity consensus algorithm for the consistency of dupli-

cated and distributed information. In IEEE Interna-

tional Conference on Distributed Computing Systems,

pages 617–624, New York.

Tijms, H. C. (1994). Stochastic Models. Wiley series in

probability and mathematical statistics. John Wiley & Sons, New York, NY, USA.
