Power-saving Design in Server Farms for Multi-tier Applications under
Response Time Constraint
Shengquan Wang^1, Waqaas Munawar^2, Xue Liu^3 and Jian-Jia Chen^2
^1 Department of Computer and Information Science, Univ. of Michigan-Dearborn, Dearborn, U.S.A.
^2 Department of Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
^3 School of Computer Science, McGill University, Montreal, Canada
Keywords: Power Saving, Server Farm, Multi-tier, Response Time, Service Level Agreement (SLA), M/G/1/PS, Mean Value Analysis (MVA).
Abstract: Server farms consume an ever-increasing amount of power, and power saving has become a prominent design issue in server farms. This paper presents a power-saving design for server farms under a response time constraint. In particular, we target multi-tier applications, which are typical of today's web. We propose an efficient power-saving design strategy called PowerTier. This strategy exploits two major techniques: Dynamic Power Management (DPM) to activate/deactivate servers, and Dynamic Voltage Scaling (DVS) to adjust the processor speed of each activated server. In addition, PowerTier considers two different application models: the open-queueing model and the closed-queueing model, for session-less and session-based web applications respectively. With PowerTier, we are able to choose the number of activated servers at each tier and the processor speed for each server so as to minimize the overall power consumption in server farms while meeting a given mean response time guarantee for multi-tier applications. Our comprehensive simulation confirms the effectiveness and efficiency of PowerTier.
1 INTRODUCTION
Power has become one of the most dominant operating costs in server systems. By 2011, data centers in the U.S. are expected to consume around 100 billion kWh per year (U.S. Environmental Protection Agency (EPA), 2007), at an annual power cost of around 7.4 billion US$. Moreover, given the large number of servers in use today, the worldwide expenditure on enterprise power and cooling of these servers is estimated to be in excess of 30 billion US$ (Raghavendra et al., 2008). At the same time, guaranteeing the performance-oriented Service Level Agreement (SLA) signed with the clients is critical to clients' satisfaction in online business. A common SLA is defined as a constraint on (the mean value of) the response time; a delayed response to clients has negative effects on online business, including client frustration and revenue loss. To meet the SLA under an increasing service demand from clients, server farms have become common practice in industry. A server farm could be composed of a cluster of tens to thousands of servers to provide large computing capability. However, the power consumption in server farms is tremendous.
Recently, increasing attention has been paid to reducing the power consumption while maintaining a given SLA. In (Rusu et al., 2006), a timing-aware power management scheme was proposed that combines a cluster-wide server on/off scheme and local power management techniques in heterogeneous clusters. A new threshold-based approach was presented in (Wang and Lu, 2008) for efficient power management of heterogeneous soft real-time clusters, making three important design decisions on the ordered server list, the server activation thresholds, and the workload distribution. In (Guerra et al., 2008), a queueing theoretical technique was proposed to balance energy consumption and adequate application response times in heterogeneous CPU-intensive server clusters. In (Bohrer et al., 2002; Sharma et al., 2003), low-power opportunities for web servers have been utilized to reduce the energy consumption by applying Dynamic Voltage Scaling (DVS) with minimal impact on the server performance. A queueing model was used in (Gandhi et al., 2009) to predict the optimal power allocation in a variety of scenarios with DVS and Dynamic Power Management (DPM)
Wang S., Munawar W., Liu X. and Chen J..
Power-saving Design in Server Farms for Multi-tier Applications under Response Time Constraint.
DOI: 10.5220/0004357201370148
In Proceedings of the 2nd International Conference on Smart Grids and Green IT Systems (SMARTGREENS-2013), pages 137-148
ISBN: 978-989-8565-55-6
Copyright © 2013 SCITEPRESS (Science and Technology Publications, Lda.)
by activation/deactivation of servers in both closed-
queueing and open-queueing models. In (Wierman
et al., 2009), an optimal speed scaling was proposed
to balance the mean energy consumption and mean
response time under Processor Sharing (PS) schedul-
ing.
All of these works focused on simple single-tier applications. However, modern web applications usually use a multi-tier architecture (Kamra et al.,
2004; Liu et al., 2006; Liu et al., 2008; Urgaonkar
et al., 2005; Pacifici et al., 2005; Diao et al., 2006;
Liu et al., 2005; Wang et al., 2010). Each tier pro-
vides a certain specific functionality to applications.
A client request will pass through a series of tiers
to attain a complete service. For instance, a typ-
ical e-commerce application consists of three tiers:
web server tier, application server tier, and database
server tier. The multi-tier architecture follows lay-
ered queueing models (Rolia and Sevcik, 1995) and
typically has a cross-tier dependency. The service at a
tier is normally blocked while waiting for the service
from its succeeding tier. Such cross-tier dependency
makes the response time analysis challenging in com-
parison with the single-tier architecture. In (Liu et al.,
2005), an analytical model was proposed for 3-tiered
web service architecture. The concurrency limit was
addressed in (Urgaonkar et al., 2005) for multi-tier
applications. In (Pacifici et al., 2005), an architecture
and underlying model of a performance management
system was presented for multi-tier web applications
on server clusters. In (Diao et al., 2006), a hybrid performance model for differentiated services was presented for multi-tier applications with cross-tier interaction. Among these works, only (Diao et al., 2006) considered the cross-tier dependency; the rest applied a tandem-queue-like structure and ignored the cross-tier dependency. In (Wang et al., 2010), an over-simplified M/M/1 model is used to perform the queueing analysis in multi-tier architecture.
In this paper, we aim to conduct a comprehen-
sive study on power saving in server farms for multi-
tier applications requiring that a given SLA should
be met. We adopt two techniques as used in server
farms for power-aware design: the DVS technique
with variable speeds and the DPM technique with ac-
tivation/deactivation of servers. There are two spe-
cific questions that we need to address in the system
design in order to achieve this goal: (i) How many
servers should be activated at each tier? (ii) What is
the best processor speed (corresponding to the volt-
age/frequency to be used) for each server? For a
single-tier architecture with homogeneous servers, it is shown in (Gandhi et al., 2009) that the optimal strategy is to run all activated servers at the same speed. However, this does not hold in the multi-tier architecture due to the cross-tier dependency. We present an efficient power-saving design strategy called PowerTier and study how to choose
the number of activated servers at each tier and the
processor speed for each server to minimize the over-
all power consumption in server farms while meeting
a given mean response time guarantee for multi-tier
applications. We consider both open-queueing and
closed-queueing models for applications.
The rest of this paper is organized as follows: Section 2 describes the system model. Section 3 presents a detailed power consumption and response time analysis, which is the basis for our power-saving design. Our power-saving design scheme PowerTier is described in Section 4. Section 5 presents a detailed performance evaluation of PowerTier over a variety of platforms. We conclude the paper in Section 6.
2 SYSTEM MODEL
In this section, we define the system model, including
the power consumption model for servers, the multi-
tier architecture, and the client application model.
2.1 Power Consumption Model
We assume that all servers are equipped with the
DVS and DPM techniques for the power manage-
ment. When the server is deactivated by the DPM
technique, its power consumption is negligible. So,
here we focus on the power consumption when the
server is activated.
With the DVS technique, we can choose a processor speed for a server (with a corresponding choice of the supply voltage). We define $r$ as the ratio of the processor speed of the server to its maximum speed. The speed ratio $r$ is normally bounded below by a lower bound $r_l$, so that $r_l \le r \le 1$. When the server is activated, it is either (i) in the idle mode at the lowest speed ratio $r_l$ without executing any job; or (ii) in the running mode executing jobs with a processor speed ratio $r$. The power consumption in our study is the system-level power, including the power consumed by the processor and all other components within the server such as memory and I/O devices. The power consumption depends on the mode that the server is in (idle or running) and on the processor speed in use as well. In this paper, we adopt the power consumption model in (Gandhi et al., 2009). A server has the following power modes:
Idle Power Mode. In the idle mode, the server consumes the static power $P_I$;
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
138
Running Power Mode. In the running mode, the power consumption $P_R(r)$ of the server at a speed ratio $r$ is
$$P_R(r) = \alpha [r - r_l]^{\gamma} + P_I, \qquad (1)$$
where $\gamma \ge 1$. The cubic rule is widely suggested in the literature for the processor power-to-speed relationship in the running mode, i.e., $\gamma = 3$. However, in server farms with DVS or for some applications, the linear rule could be applied (Gandhi et al., 2009).
The different power modes provide the space for
system designers to design efficient power-saving
strategies.
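As a minimal sketch of the power model in (1), the function below evaluates the running-mode power for a given speed ratio. The function name and argument names are illustrative choices, not part of the original model; the example parameter values are the Type-A and Type-C profiles listed later in Table 2.

```python
def running_power(r, alpha, r_l, gamma, p_idle):
    """Running-mode power P_R(r) = alpha * (r - r_l)**gamma + P_I, as in Eq. (1)."""
    if not r_l <= r <= 1.0:
        raise ValueError("speed ratio must satisfy r_l <= r <= 1")
    return alpha * (r - r_l) ** gamma + p_idle


# Linear rule (gamma = 1) and cubic rule (gamma = 3) at full speed.
print(running_power(1.0, alpha=100.0, r_l=0.4, gamma=1.0, p_idle=180.0))
print(running_power(1.0, alpha=455.0, r_l=0.4, gamma=3.0, p_idle=150.0))
```

At the lowest speed ratio $r = r_l$ the expression reduces to the idle power $P_I$, which matches the idle power mode described above.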
2.2 Multi-tier Architecture
We consider a system with $M$ tiers, each of which consists of a server farm. We assume that all tiers run on homogeneous servers. Tier $m$ has $v_m$ activated servers and each runs at a speed ratio $r_m$. The homogeneous server assumption was used in previous similar studies (Gandhi et al., 2009).^1 We assume processor sharing (PS) scheduling at each server, since it approximates well the scheduling algorithms used by most commodity operating systems such as Linux.
Figure 1: A three-tier architecture with 3 servers at Tier 1, 7 servers at Tier 2, and 6 servers at Tier 3.
Figure 1 illustrates a three-tier architecture. A client starts a request at the outer tier, denoted as Tier 1, and the request then goes to an inner Tier $m$ ($m \ge 2$) one by one if necessary. When a request arrives at Tier $m$, it triggers one or more requests at its succeeding Tier $m+1$. After some processing at Tier $m$, it either returns to Tier $m-1$ or proceeds to Tier $m+1$. The exceptions are the last Tier $M$, where all requests return to Tier $M-1$, and the first Tier 1, where returning to the preceding queue means request completion. This model can handle multiple visits to a tier including sequential and parallel accesses (Diao et al., 2006). We denote $\kappa_{m+1}$ as the average request visit ratio at Tier $m+1$ by a request at Tier $m$. If we define $\lambda_m$ as the request arrival rate at Tier $m$,^2 then we have
$$\kappa_{m+1} = \frac{\lambda_{m+1}}{\lambda_m}. \qquad (2)$$
^1 We can extend it to the heterogeneous case by taking into consideration the different characteristics of servers.
Figure 2: Cross-tier dependency.
The system supports many clients. At each tier, a
dispatcher (or load balancer) as shown in Figure 2 will
distribute all incoming requests from clients to one of
the servers in the server farm of this tier. We assume
that they are evenly distributed to the servers at each
tier. The multi-tier architecture shows a cross-tier de-
pendency (Diao et al., 2006; Rolia and Sevcik, 1995).
It could be illustrated with a nested structure in Figure 2, where Tier $m$ includes the succeeding Tier $m+1$ for $m < M$. A request can only be completed at each
tier after it has received service from the succeeding
tier (if it needs service from the succeeding tier). We
assume that the waiting for the outcome from the suc-
ceeding tier is non-blocking, i.e., the waiting process
will not block other processes in the same server from
using the resources such as CPU during its waiting.
2.3 Client Application Model
Applications for clients could be session-less or session-based. Each session-less application issues one request during its life, while a session-based application usually issues multiple requests during its lifetime, with think times in between, and normally lasts for a while. The former can be modeled as an open-queueing system and the latter as a closed-queueing system (Jain, 1991). In an open-queueing system, applications start and end, even though we could assume a fixed average arrival rate; in a closed-queueing system, applications stay and the number of sessions remains the same. Figure 3 shows the open/closed-queueing models at Tier 1. For the closed-queueing model, we define $N_1$ delay servers, each of which corresponds to the think time for each session at Tier 1 (Liu et al., 2005).

Figure 3: Open-queueing and closed-queueing models at Tier 1.

^2 In the closed-queueing model introduced later, $\lambda_1$ is defined as throughput since the number of sessions is fixed.
In the multi-tier architecture, the system could be decoupled into subsystems by tiers. In either the open-queueing or the closed-queueing model, at Tier $m$ ($m \ge 2$), none, one, or more requests could be generated with ratio $\kappa_m$ after the preceding tier. In the study of server performance, the M/G/1/PS server model has been shown by different research studies to model open-queueing servers well (Kamra et al., 2004; Wang and Lu, 2008; Gandhi et al., 2009; Wierman et al., 2009; Liu et al., 2008; Liu et al., 2006; Heo et al., 2007). Mean Value Analysis (MVA) is widely used for response time analysis in closed-queueing servers (Lazowska et al., 1984; Jain, 1991; Reiser and Lavenberg, 1980).
3 POWER CONSUMPTION AND RESPONSE TIME ANALYSIS
Recall that our objective is to minimize the power consumption under the mean response time constraint. First we need to conduct the power consumption and response time analysis. We start with a single server and then extend the analysis to a server farm in a multi-tier architecture.
3.1 Single Server
Power Consumption. We consider a server that switches between two power modes: running and idle. We define $\pi_R$ and $\pi_I$ as the probabilities that the server is in the running and idle mode respectively, where $\pi_R + \pi_I = 1$. Then the mean power consumption can be written as:
$$E[P] = P_R(r)\pi_R + P_I\pi_I = \alpha[r - r_l]^{\gamma}\pi_R + P_I, \qquad (3)$$
where $P_R(r)$ is defined in (1). According to (3), in order to calculate the mean power, we need to obtain the value of $\pi_R$:
For an open-queueing M/G/1/PS server, if requests arrive at rate $\lambda$ and have a general service time distribution with a given mean value $E[S]$, then by traditional queueing theory (Kleinrock, 1976) we have
$$\pi_R = \lambda E[S]. \qquad (4)$$
For a closed-queueing server, we could use the same formula in (4) to calculate $\pi_R$. However, $\lambda$ in (4) should be the throughput instead.^3 Assume that there is a fixed number $N$ of sessions, and that jobs have a mean response time $E[R]$ with a mean think time $E[Z]$ in between. Then by (Jain, 1991), the throughput $\lambda$ can be obtained as
$$\lambda = \frac{N}{E[R] + E[Z]}, \qquad (5)$$
where $E[R]$ is to be determined.
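The relations (3) to (5) can be sketched as three small helpers. This is only an illustration under the stated model; the function names are hypothetical, and the mean response time passed to `closed_throughput` is assumed to be known already (in the analysis it still has to be determined).

```python
def utilization(lam, mean_service):
    """Probability of the running mode, pi_R = lambda * E[S], as in Eq. (4)."""
    pi_run = lam * mean_service
    if not 0.0 <= pi_run < 1.0:
        raise ValueError("a stable server needs lambda * E[S] < 1")
    return pi_run


def closed_throughput(n_sessions, mean_response, mean_think):
    """Throughput lambda = N / (E[R] + E[Z]) of a closed system, as in Eq. (5)."""
    return n_sessions / (mean_response + mean_think)


def mean_power(pi_run, r, alpha, r_l, gamma, p_idle):
    """Mean power E[P] = alpha * (r - r_l)**gamma * pi_R + P_I, as in Eq. (3)."""
    return alpha * (r - r_l) ** gamma * pi_run + p_idle


# Illustrative numbers: 50 requests/s with a 10 ms mean service time.
pi_run = utilization(50.0, 0.01)
print(mean_power(pi_run, r=1.0, alpha=100.0, r_l=0.4, gamma=1.0, p_idle=180.0))
```

For a closed-queueing server the same `mean_power` call applies once the throughput from `closed_throughput` is used as the arrival rate.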
Response Time Analysis. The response time is defined as the time spent by a job waiting in the queue and executing on the processor.
For an open-queueing M/G/1/PS server, by traditional queueing theory (Kleinrock, 1976), the mean response time is
$$E[R] = \frac{E[S]}{1 - \lambda E[S]}. \qquad (6)$$
For a closed-queueing server, we use MVA to obtain the mean response time (Jain, 1991). We define $D(N)$ as the mean delay for $N$ sessions. $D$ can be calculated recursively in terms of $N$. If we denote $Q(N)$ as the mean queue length with $N$ sessions, then with MVA we have
$$D(N) = [Q(N-1) + 1]E[S], \qquad (7)$$
$$Q(N) = \frac{N}{E[R] + E[Z]} D(N), \qquad (8)$$
where $Q(0) = 0$. To reduce the computational complexity, we could use the well-accepted Schweitzer's approximation (Jain, 1991), approximating $Q(N-1) \approx \frac{N-1}{N} Q(N)$ to avoid the recursive computation. Then $D(N)$ will be the positive solution to the following equation:
$$D(N) = \left[\frac{[N-1]D(N)}{E[R] + E[Z]} + 1\right]E[S], \qquad (9)$$
where the response time $E[R] = D(N)$ for a single tier.
The analysis of a single server regarding power consumption and response time will be the basis for server farms in a multi-tier architecture.
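The single-server formulas (6) to (9) can be sketched as follows. The code is an illustration, not the authors' implementation: the function names are hypothetical, and the closed form in `mva_schweitzer` comes from substituting $E[R] = D(N)$ into (9), which rearranges to the quadratic $D^2 + (E[Z] - N E[S])D - E[S]E[Z] = 0$.

```python
import math


def open_mean_response(lam, mean_service):
    """Mean response time of an M/G/1/PS server, as in Eq. (6)."""
    rho = lam * mean_service
    if not 0.0 <= rho < 1.0:
        raise ValueError("a stable server needs lambda * E[S] < 1")
    return mean_service / (1.0 - rho)


def mva_exact(n_sessions, mean_service, mean_think):
    """Exact MVA recursion of Eqs. (7)-(8) for one server with think time."""
    if n_sessions < 1:
        raise ValueError("at least one session is required")
    queue = 0.0  # Q(0) = 0
    delay = 0.0
    for n in range(1, n_sessions + 1):
        delay = (queue + 1.0) * mean_service      # Eq. (7)
        queue = n / (delay + mean_think) * delay  # Eq. (8) with E[R] = D(n)
    return delay


def mva_schweitzer(n_sessions, mean_service, mean_think):
    """Positive root of Eq. (9) with E[R] = D(N) (Schweitzer's approximation)."""
    b = mean_think - n_sessions * mean_service
    return (-b + math.sqrt(b * b + 4.0 * mean_service * mean_think)) / 2.0


# Illustrative comparison: 20 sessions, 10 ms service time, 35 ms think time.
print(mva_exact(20, 0.01, 0.035), mva_schweitzer(20, 0.01, 0.035))
```

With a single session both closed-queueing variants return the bare service time, since no other job competes for the processor.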
^3 The throughput is equivalent to the arrival rate in the open-queueing model, and so we use the same notation $\lambda$.
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
140
3.2 Server Farm in Multi-tier Architecture
In the following, we consider the multi-tier architecture with each tier having multiple servers. We add a subscript $m$ to the terms introduced above for any server at Tier $m$.
We assume that the average service demand is fixed in the system, which is the arrival rate $\lambda_1$ and the number of sessions $N_1$ at Tier 1 for open/closed-queueing models respectively. By (5), the throughput at Tier 1 in the closed-queueing model can be written as
$$\lambda_1 = \frac{N_1}{E[R_1] + E[Z]}, \qquad (10)$$
where $E[R_1]$ is to be determined.
At Tier $m$ ($m = 1, 2, \ldots, M$), given the visit ratio $\kappa_m$ and the number $v_m$ of servers, the request arrival rate can be obtained as
$$\lambda_m = \lambda_1 \Big[\prod_{i=1}^{m} \kappa_i\Big]. \qquad (11)$$
Recall that Tier $m$ has $v_m$ homogeneous servers. Since the arrivals are evenly distributed to the servers at each tier, the arrival rate at any server at Tier $m$ is $\frac{\lambda_m}{v_m}$.
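A short sketch of (11) and of the even split across servers follows. The function names are hypothetical; the example uses the peak arrival rate and visit ratios quoted in Section 5.2, while the server counts are made up purely for illustration.

```python
def tier_arrival_rates(lam1, kappa):
    """Arrival rate at each tier, lambda_m = lambda_1 * prod_{i<=m} kappa_i (Eq. (11))."""
    rates, prod = [], 1.0
    for k in kappa:
        prod *= k
        rates.append(lam1 * prod)
    return rates


def per_server_rates(rates, servers):
    """Arrival rate lambda_m / v_m seen by one server at each tier."""
    return [lam / v for lam, v in zip(rates, servers)]


rates = tier_arrival_rates(800.0, [1.0, 0.998, 1.603])
print(rates)
print(per_server_rates(rates, [2, 16, 60]))  # hypothetical server counts
```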
Power Consumption. In order to calculate the mean power for servers at each tier, we need to find $\pi_{R,m}$, i.e., the probability that a server at Tier $m$ is in the running mode. We assume that the server is in the idle power mode when it is waiting for the outcome from the succeeding tier. For either an open-queueing or closed-queueing server at Tier $m$, we have
$$\pi_{R,m} = \frac{\lambda_m}{v_m} E[S_m], \qquad (12)$$
where $\lambda_m$ is defined in (11). The value of $\lambda_1$ in (11) is given for an open-queueing server and defined by (10) for a closed-queueing server. Then, applying (12) to (3), we can obtain the mean power consumption.
Response Time Analysis. In the multi-tier architecture, the response time analysis is more complex due to the cross-tier dependency. The response time at each tier also includes the waiting time for the outcome from the succeeding tier, which is $\kappa_{m+1}E[R_{m+1}]$ at Tier $m$ (Diao et al., 2006). Recall that the waiting for the outcome from the succeeding tier is non-blocking. Then, applying this observation to (6) and (7), we have the following results:
For an open-queueing M/G/1/PS server at Tier $m$, the response time is
$$E[R_m] = \frac{E[S_m]}{1 - \frac{\lambda_m}{v_m}E[S_m]} + \kappa_{m+1}E[R_{m+1}], \qquad (13)$$
where $\lambda_m$ is defined in (11).
For a closed-queueing server at Tier $m$, we denote $D_m$ as the mean delay experienced by any server at Tier $m$. The hit ratio for any session at any server at Tier $m$ is $\big[\prod_{j=1}^{m}\kappa_j\big]\frac{v_m}{v_1}$ (Urgaonkar et al., 2005). Then with the approximated MVA in (9), the response time for any request at Tier $m$ is
$$E[R_m] = D_m + \kappa_{m+1}E[R_{m+1}], \qquad (14)$$
which satisfies
$$D_m = \left[\frac{[N_1 - 1]\big[\prod_{j=1}^{m}\kappa_j\big]\frac{v_m}{v_1}D_m}{E[R_1] + E[Z]} + 1\right]E[S_m]. \qquad (15)$$
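The open-queueing recursion in (13) can be evaluated from the last tier back to Tier 1. The sketch below is illustrative: the function name is hypothetical, the response time beyond the last tier is taken as zero, and the numbers in the example are made up rather than measured.

```python
def open_tier_response_times(lam1, kappa, mean_service, servers):
    """Return [E[R_1], ..., E[R_M]] using the backward recursion of Eq. (13)."""
    tiers = len(kappa)
    lam, prod = [], 1.0
    for k in kappa:
        prod *= k
        lam.append(lam1 * prod)                      # Eq. (11)
    resp = [0.0] * (tiers + 1)                       # resp[tiers] stands for E[R_{M+1}] = 0
    for m in range(tiers - 1, -1, -1):
        rho = lam[m] / servers[m] * mean_service[m]  # per-server utilization, Eq. (12)
        if rho >= 1.0:
            raise ValueError("tier %d is unstable" % (m + 1))
        kappa_next = kappa[m + 1] if m + 1 < tiers else 0.0
        resp[m] = mean_service[m] / (1.0 - rho) + kappa_next * resp[m + 1]
    return resp[:tiers]


# Made-up example: 10 requests/s, three tiers, one server per tier.
print(open_tier_response_times(10.0, [1.0, 1.0, 2.0], [0.01, 0.02, 0.01], [1, 1, 1]))
```

Each tier adds its own queueing delay to the visit-weighted response time of the tier behind it, which is how the cross-tier dependency enters $E[R_1]$.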
We assume the mean service time for a server at Tier $m$ is $E[S_m] = \frac{1}{\mu_m}$ under the maximum speed. If the server runs at a speed ratio $r_m$ in the running mode, we have $E[S_m] = \frac{1}{\mu_m r_m}$. Applying this to the above results, we have the complete analysis of power consumption and response time for server farms in multi-tier architecture. It is summarized in the following theorem:
Theorem 1. We consider both open/closed-queueing models for server farms in multi-tier architecture. For either an open-queueing or closed-queueing server at Tier $m$, the mean power consumption is
$$E[P_m] = \frac{\lambda_m}{v_m \mu_m}\,\frac{\alpha[r_m - r_l]^{\gamma}}{r_m} + P_I, \qquad (16)$$
where $\lambda_m$ is defined in (11). The mean response time of a job can be obtained as follows:
For an open-queueing M/G/1/PS server at Tier $m$, the mean response time of a job is
$$E[R_m] = \frac{1}{\mu_m r_m - \frac{\lambda_m}{v_m}} + \kappa_{m+1}E[R_{m+1}], \qquad (17)$$
for $m = 1, 2, \ldots, M$ with $R_{M+1} = 0$.
For a closed-queueing server at Tier $m$, the mean response time of a job is
$$E[R_m] = D_m + \kappa_{m+1}E[R_{m+1}], \qquad (18)$$
which satisfies
$$D_m = \left[\frac{[N_1 - 1]\big[\prod_{j=1}^{m}\kappa_j\big]\frac{v_m}{v_1}D_m}{E[R_1] + E[Z]} + 1\right]\frac{1}{\mu_m r_m}. \qquad (19)$$
In the above formulas, $\lambda_1$ and $N_1$ are given for open/closed-queueing server respectively.
We notice that in the closed-queueing model, the $D_m$ depend on each other through (18) and (19). Such inter-dependency among the $D_m$ yields an implicit formula for the response time analysis in the closed-queueing model. In such a case, we could use the classical fixed point theorem to solve it.
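One way to solve the implicit system numerically is sketched below for the special case of one server per tier ($v_m = 1$ for every $m$), so that the server-count factor in the hit ratio equals 1. This is an assumption made to keep the example small, and the solver is a plain bisection on $E[R_1]$ rather than a particular fixed-point iteration; the function name and the example numbers are hypothetical. Here `mean_service[m]` plays the role of $\frac{1}{\mu_m r_m}$.

```python
def closed_response_times(n1, mean_think, kappa, mean_service):
    """Solve Eqs. (18)-(19) with one server per tier; return (E[R_1], [D_1..D_M])."""
    tiers = len(kappa)
    cum, prod = [], 1.0
    for k in kappa:
        prod *= k
        cum.append(prod)  # prod_{j<=m} kappa_j
    # a[m] multiplies D_m / (E[R_1] + E[Z]) in Eq. (19).
    a = [(n1 - 1) * cum[m] * mean_service[m] for m in range(tiers)]
    # Unrolling Eq. (18): E[R_1] = sum_m w[m] * D_m with w[m] = prod_{j=2..m} kappa_j.
    w = [cum[m] / kappa[0] for m in range(tiers)]

    def delays(r1):
        # Eq. (19) is linear in D_m once E[R_1] is fixed.
        return [mean_service[m] / (1.0 - a[m] / (r1 + mean_think))
                for m in range(tiers)]

    def gap(r1):
        # Positive when the implied E[R_1] exceeds the guess r1.
        return sum(wm * dm for wm, dm in zip(w, delays(r1))) - r1

    lo = max(max(a) - mean_think, 0.0)  # the gap is positive just above this point
    hi = lo + 1.0
    while gap(hi) > 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi, delays(hi)


# Made-up example: 50 sessions, 35 ms think time, three tiers.
print(closed_response_times(50, 0.035, [1.0, 1.0, 2.0], [0.01, 0.02, 0.01]))
```

Bisection is used because the implied $E[R_1]$ decreases as the guess grows, so the gap changes sign exactly once on the search interval.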
Power-savingDesigninServerFarmsforMulti-tierApplicationsunderResponseTimeConstraint
141
4 PowerTier: AN EFFICIENT POWER-SAVING DESIGN
In this section, we study our power-saving design strategy PowerTier for server farms in multi-tier architecture. The power-saving design can be divided into two phases:
Planning or upgrading: In this phase, we determine the number of servers (denoted as $\hat{v}_m$) needed for the peak service demand, which is defined as the maximum arrival rate and the maximum number of sessions at Tier 1 for open/closed-queueing models respectively (denoted as $\hat{\lambda}_1$ and $\hat{N}_1$ respectively).
Runtime: In this phase, the number of available servers is fixed, as obtained in the planning or upgrading phase. For any level of service demand, we decide how many servers should be activated/deactivated and the processor speed for each server.
In each phase, we assume that the visit ratios $\kappa_m$ and the service rates $\mu_m$ were learned through the monitored history, and so they are known in the design. The major difference between these two phases is that we can switch servers among different tiers in the planning or upgrading phase, but not practically in the runtime phase.
In our design, we aim to minimize the mean power consumption under a given mean response time constraint, where we choose a mean response time threshold $\hat{R}$. For given service demand $\lambda_1$ and $N_1$ (in the open/closed-queueing model respectively), we need to determine the number of activated servers at each tier and the processor speed of each activated server, i.e., the values of $v_m$ and $r_m$ for $m = 1, 2, \ldots, M$. The optimization problem can be formulated as follows:
$$\min_{\{v_m, r_m : m = 1, 2, \ldots, M\}} E[P] = \sum_{m=1}^{M} v_m E[P_m] \qquad (20a)$$
$$\text{subject to}\quad E[R] = E[R_1] \le \hat{R}, \qquad (20b)$$
$$\max\Big\{r_l, \frac{\lambda_m}{v_m \mu_m}\Big\} \le r_m \le 1, \qquad (20c)$$
$$1 \le v_m \le \hat{v}_m. \qquad (20d)$$
Inequality (20c) is based on the bound of $r_m$ ($r_l \le r_m \le 1$) and the stability condition of a server ($r_m > \frac{\lambda_m}{v_m \mu_m}$). Inequality (20d) is the server availability constraint. Notice that we will only have the lower bound in the planning or upgrading phase.^4
^4 In the planning or upgrading phase, if the overall number of available servers is limited due to budget, we need to add this additional constraint too.
In order to solve the above optimization with non-linear constraints, we first treat the $v_m$ as continuous values to deal with the integer programming. Since all $E[P_m]$ and $E[R_m]$ are convex functions in terms of $r_m$ and $v_m$, the optimization problem can be solved with the Lagrangian method. We define the Lagrange equation $L$ with the Lagrange multipliers $\phi$ and $\chi_{k,m}$ ($k = 1, 2, 3$ and $m = 0, 1, \ldots, M$) as
$$L = \sum_{m=1}^{M} v_m E[P_m] + \phi E[R_1] + \sum_{m=1}^{M}\Big[\chi_{1,m}[r_m - r_l][r_m - 1] + \chi_{2,m}[v_m - 1][v_m - \hat{v}_m] - \chi_{3,m} r_m v_m \mu_m\Big], \qquad (21)$$
where all multipliers are non-negative.
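We set $\frac{\partial L}{\partial r_m} = 0$ and $\frac{\partial L}{\partial v_m} = 0$ for $m = 1, 2, \ldots, M$. Together with the equalities in (13) and the following equalities
$$\phi\,[E[R_1] - \hat{R}] = 0, \qquad (22)$$
$$\chi_{1,m}[r_m - r_l][r_m - 1] = 0, \qquad (23)$$
$$\chi_{2,m}[v_m - 1][v_m - \hat{v}_m] = 0, \qquad (24)$$
$$\chi_{3,m}[\lambda_m - r_m v_m \mu_m] = 0, \qquad (25)$$
we can solve for the values of the variables.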
The key in the above equations is to solve $\frac{\partial L}{\partial r_m} = 0$ and $\frac{\partial L}{\partial v_m} = 0$ for $m = 1, 2, \ldots, M$. Given the power consumption and response time analysis summarized in the previous section for both open/closed-queueing models, we can obtain the formula.
We denote $r_m^*$ and $v_m^*$ as the obtained optimal $r_m$ and $v_m$ respectively. Since the number of servers must be an integer, we could choose the rounded-up values of $v_m^*$ as the final output. We then apply the rounded-up $v_m^*$ to the above optimization approach with fixed server allocation and obtain the final solution.
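For very small instances, problem (20) in the open-queueing model can also be checked by exhaustive search. The sketch below is not the Lagrangian approach described above: it simply enumerates integer server counts and a user-supplied list of discrete speed ratios (assumed to lie in $[r_l, 1]$), evaluates (16), (17) and (20a), and keeps the cheapest feasible design. All function names and example numbers are hypothetical.

```python
from itertools import product


def open_farm_metrics(lam1, kappa, mu, v, r, alpha, r_l, gamma, p_idle):
    """Return (total mean power, E[R_1]) in the open model, or None if unstable."""
    tiers = len(kappa)
    lam, prod_k = [], 1.0
    for k in kappa:
        prod_k *= k
        lam.append(lam1 * prod_k)                   # Eq. (11)
    power, resp = 0.0, 0.0                          # resp starts as E[R_{M+1}] = 0
    for m in range(tiers - 1, -1, -1):
        spare = mu[m] * r[m] - lam[m] / v[m]        # must be positive for stability
        if spare <= 0.0:
            return None
        kappa_next = kappa[m + 1] if m + 1 < tiers else 0.0
        resp = 1.0 / spare + kappa_next * resp      # Eq. (17)
        per_server = (lam[m] / (v[m] * mu[m])
                      * alpha * (r[m] - r_l) ** gamma / r[m] + p_idle)  # Eq. (16)
        power += v[m] * per_server                  # Eq. (20a)
    return power, resp


def brute_force_design(lam1, kappa, mu, v_max, speeds, r_hat,
                       alpha, r_l, gamma, p_idle):
    """Cheapest (power, v, r) meeting E[R_1] <= r_hat, or None if none exists."""
    best = None
    tiers = len(kappa)
    for v in product(*(range(1, n + 1) for n in v_max)):   # Eq. (20d)
        for r in product(speeds, repeat=tiers):            # speeds within [r_l, 1]
            metrics = open_farm_metrics(lam1, kappa, mu, v, r,
                                        alpha, r_l, gamma, p_idle)
            if metrics is None or metrics[1] > r_hat:      # Eq. (20b)
                continue
            if best is None or metrics[0] < best[0]:
                best = (metrics[0], v, r)
    return best


# Made-up single-tier instance with a linear power rule.
speeds = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
print(brute_force_design(10.0, [1.0], [20.0], [2], speeds, 0.2,
                         alpha=100.0, r_l=0.4, gamma=1.0, p_idle=180.0))
```

The search grows exponentially with the number of tiers, so it is only meant as a reference against which a faster method can be compared on toy inputs.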
We summarize the main result in the following theorem:
Theorem 2. Given the $\kappa_m$, the $\mu_m$, $\lambda_1$ (in the open-queueing model) and $N_1$ (in the closed-queueing model), with the PowerTier design we could find the (heuristically) optimal values $r_m^*$ and $v_m^*$ ($m = 1, 2, \ldots, M$) such that the power consumption can be minimized as $E[P^*]$ while the mean response time $E[R_1]$ is below the mean response time threshold $\hat{R}$.
With the PowerTier design, we determine the server allocation at each tier in the planning or upgrading phase for the peak service demand. For the running phase, for different service demands and visit ratios, we also determine the number of activated/deactivated servers at each tier and the processor speed for each activated server. With the PowerTier design, we are able to minimize the mean power consumption under the given mean response time constraint.
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
142
5 PERFORMANCE EVALUATION
In this section, we first verify the queueing analysis in multi-tier architecture in both open/closed-queueing models as shown in Section 3. Then we present the performance evaluation of our proposed PowerTier design.
5.1 Verifying Queueing Analysis in Multi-tier Architecture
We employed the RUBiS (Cecchet et al., 2002) system to generate and analyze the traffic for measuring the timing parameters. The RUBiS system offers an eBay-like web service with an associated client to generate the test traffic. It is a three-tier hierarchical system with the possibility of having more than one server per tier. The first tier consists of the Apache load balancer, the second includes the JBoss application server, and the third consists of the MySQL database server. We allocated one physical machine per tier. For the purpose of evaluation, we modified two aspects of RUBiS: (i) we altered the test traffic generator to record a complete trace of the requests and wrote an additional tool to replay the saved trace. This helps negate the effects of any probabilistic fluctuations arising from differences in the traces. (ii) We instrumented the RUBiS server-side code to measure the inter-tier request ratios and per-tier response times. Overall, this measurement setup ensured accurate emulation of an actual typical multi-tier web service.
We obtained the mean service time at the maximum speed at each tier as $(\frac{1}{\mu_1}, \frac{1}{\mu_2}, \frac{1}{\mu_3}) = (1.2, 6.86, 5.43)$ ms and the visit ratios as $(\kappa_1, \kappa_2, \kappa_3) = (1, 1, 1.24074)$, and adopted the think time of 0.035 sec in the closed-queueing model, which is used in (Liu et al., 2005). The available frequencies for each server are: the Apache server, 3.0 GHz and 2.8 GHz; the JBoss server, 3.1, 3.0, ..., 1.6 GHz; the MySQL server, 2.13 GHz, 1.87 GHz, and 1.60 GHz. We consider two configurations of frequencies at the tiers: (3.0, 3.1, 2.13) GHz and (2.8, 2.5, 1.6) GHz.
We ran over 5,000 requests and fixed the arrival rate at $\lambda_1 = 8.862$ and $\lambda_1 = 17.723$ per second for the open-queueing model. We measured the mean response time for each request at each tier and compared it with the modeled one. The results are shown in Table 1. In all cases, the maximum error of the measured response time relative to the modeled one is 4%, and most errors are considerably smaller. This shows that the modeling is quite accurate.
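Table 1: Comparison of Measured/Modeled Response Time in Open-Queueing Model.

(a) $\lambda_1 = 8.862$ and $(r_1, r_2, r_3) = (1, 1, 1)$

|          | $E[R_1]$ | $E[R_2]$ | $E[R_3]$ |
|----------|----------|----------|----------|
| Measured | 16.50 ms | 14.72 ms | 5.83 ms  |
| Modeled  | 15.93 ms | 14.55 ms | 5.83 ms  |
| Error    | 3.6%     | 1.2%     | 0.0%     |

(b) $\lambda_1 = 17.723$ and $(r_1, r_2, r_3) = (1, 1, 1)$

|          | $E[R_1]$ | $E[R_2]$ | $E[R_3]$ |
|----------|----------|----------|----------|
| Measured | 18.27 ms | 16.49 ms | 6.71 ms  |
| Modeled  | 17.72 ms | 16.14 ms | 6.71 ms  |
| Error    | 3.1%     | 2.2%     | 0.0%     |

(c) $\lambda_1 = 8.862$ and $(r_1, r_2, r_3) = (0.90, 0.81, 0.75)$

|          | $E[R_1]$ | $E[R_2]$ | $E[R_3]$ |
|----------|----------|----------|----------|
| Measured | 18.27 ms | 16.49 ms | 6.71 ms  |
| Modeled  | 17.71 ms | 16.37 ms | 6.71 ms  |
| Error    | 3.2%     | 0.8%     | 0.0%     |

(d) $\lambda_1 = 17.723$ and $(r_1, r_2, r_3) = (0.90, 0.81, 0.75)$

|          | $E[R_1]$ | $E[R_2]$ | $E[R_3]$ |
|----------|----------|----------|----------|
| Measured | 21.38 ms | 19.63 ms | 8.24 ms  |
| Modeled  | 20.86 ms | 18.89 ms | 8.24 ms  |
| Error    | 2.5%     | 4.0%     | 0.0%     |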
5.2 Evaluating PowerTier
Servers. We use three types of servers to evaluate the performance of PowerTier: the one used as the JBoss server in the previous experiment, and the other two from (Gandhi et al., 2009). The one in the previous experiment is an Intel i3 dual-core processor based server with 6 GB of RAM. The server is equipped with DVFS capability, with the option to switch among 16 frequencies between 1.6 GHz and 3.1 GHz. We enable one core of each server. We measure the power consumption of the server from the previous experiment. Based on the measured power, we use Least Squares Fitting to obtain the modeled power, as shown in Figure 4. The power profiles of all three types of servers are listed in Table 2, where Type-B is the one in the previous experiment. We consider the same three-tier architecture used in the previous experiment. We adopt from (Liu et al., 2005) the mean service time at the maximum speed at each tier, $(\frac{1}{\mu_1}, \frac{1}{\mu_2}, \frac{1}{\mu_3}) = (1.2, 15.1, 36.7)$ ms, the visit ratios $(\kappa_1, \kappa_2, \kappa_3) = (1, 0.998, 1.603)$, and the think time of 0.035 sec in the closed-queueing model. We scale the job service time for all three types of servers so that the mean service time at the maximum speed at each tier follows the above specification.
Table 2: Server profiles.

|               | $\gamma$ | $r_l$  | $\alpha$ (Watt) | $P_I$ (Watt) |
|---------------|----------|--------|-----------------|--------------|
| Type-A server | 1        | 0.4    | 100             | 180          |
| Type-B server | 1.2137   | 0.5161 | 12.3977         | 36.7040      |
| Type-C server | 3        | 0.4    | 455             | 150          |

Figure 4: Measured and modeled power (power consumption in Watt versus speed ratio $r$).

Baseline Design Strategies. We consider two baseline design strategies: (i) EvenDist: All servers are evenly assigned to the 3 tiers, so that all tiers have almost the same number of servers, with at most a 1-server difference. (ii) PropDist: All servers are proportionally assigned to the 3 tiers according to the absolute utilization of the tier, i.e., $\frac{\lambda_m}{\mu_m}$. In each baseline, all servers are always activated at the maximum processor speed, i.e., $r_m = 1$. An optimization approach similar to that of the previous section is used to find the minimum number of servers required for each baseline. For PowerTier, we use the result in Section 4 to determine $v_m$ at each tier and $r_m$ for each server.
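The two baseline allocations can be sketched as follows for a given total number of servers. The details that the text leaves open are assumptions here: which tiers receive the leftover servers in EvenDist, and the largest-remainder rounding used in PropDist. The function names and example numbers are hypothetical.

```python
def even_dist(total_servers, tiers=3):
    """EvenDist: split servers so that tier counts differ by at most one."""
    base, extra = divmod(total_servers, tiers)
    # Assumption: leftover servers go to the first tiers.
    return [base + (1 if m < extra else 0) for m in range(tiers)]


def prop_dist(total_servers, lam, mu):
    """PropDist: split servers in proportion to lambda_m / mu_m."""
    loads = [l / u for l, u in zip(lam, mu)]
    total_load = sum(loads)
    shares = [total_servers * x / total_load for x in loads]
    alloc = [int(s) for s in shares]  # round down first
    leftover = total_servers - sum(alloc)
    # Assumption: hand out the remaining servers by largest fractional part.
    order = sorted(range(len(shares)), key=lambda m: shares[m] - alloc[m], reverse=True)
    for m in order[:leftover]:
        alloc[m] += 1
    return alloc


print(even_dist(10))
print(prop_dist(10, lam=[10.0, 20.0, 30.0], mu=[10.0, 10.0, 10.0]))
```

In both baselines every allocated server then runs at $r_m = 1$, so only the server counts remain to be chosen.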
Evaluation in the Planning or Upgrading Phase.
First, we determine the optimal number of servers allocated
to each tier for the peak service demand in
the planning or upgrading phase. We consider the following
peak service demand: in the open-queueing
model, the maximum arrival rate $\hat{\lambda}_1$ at Tier 1 is 800 per
second; in the closed-queueing model, the maximum
number of sessions $\hat{N}_1$ at Tier 1 is 200. The response
time constraint is fixed as $\hat{R} = 200/\mu_1 = 0.24$ sec. Table 3
shows the resulting server allocation and the corresponding
speed assignment for each design strategy.
We observe that EvenDist needs significantly
more servers than the others in all cases (e.g., 195 versus 89
for Type-A servers in the open-queueing model). The
server assignments at each tier for PowerTier and
PropDist are very similar in all cases. In other
words, the optimal design under peak demand is close to
a proportional server allocation across tiers.
PowerTier may need a few more servers than PropDist
(as shown for Type-C servers), since with DVS enabled
PowerTier can use more servers at a lower speed
to reduce power consumption, while PropDist always
uses the highest speed.
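The open-queueing entries of Table 3 can be sanity-checked with a short sketch. It assumes the standard M/G/1/PS mean response time, uniform load balancing within a tier, $\lambda_m = \kappa_m \lambda_1$, and a service time that scales as $1/r_m$ with the speed ratio; the paper's own model in the earlier sections may differ in detail, and the function names are hypothetical.

```python
# Workload parameters quoted in the text (Liu et al., 2005).
KAPPA = (1.0, 0.998, 1.603)            # visit ratios
SERVICE_S = (0.0012, 0.0151, 0.0367)   # mean service times at full speed, in s


def mean_response_time(lam1, servers, speeds=(1.0, 1.0, 1.0)):
    """Mean end-to-end response time (s) under the ASSUMED M/G/1/PS model."""
    total = 0.0
    for kappa, s, v, r in zip(KAPPA, SERVICE_S, servers, speeds):
        service = s / r                      # assumed: service time scales as 1/r
        rho = lam1 * kappa * service / v     # per-server utilization at this tier
        if rho >= 1.0:
            return float("inf")              # tier is overloaded
        total += kappa * service / (1.0 - rho)
    return total


def min_even_servers(lam1, r_hat):
    """Smallest per-tier server count v such that (v, v, v) meets r_hat."""
    if r_hat <= mean_response_time(0.0, (1, 1, 1)):
        raise ValueError("constraint is not above the zero-load service time")
    v = 1
    while mean_response_time(lam1, (v, v, v)) > r_hat:
        v += 1
    return v


R_HAT = 0.24  # 200 / mu_1 = 200 * 1.2 ms
print(min_even_servers(800, R_HAT))                                     # 65
print(mean_response_time(800, (2, 18, 69)) <= R_HAT)                    # True
print(mean_response_time(800, (2, 18, 69), (0.7, 0.98, 1.0)) <= R_HAT)  # True
```

Under these assumptions the search returns 65 servers per tier at $\hat{\lambda}_1 = 800$, which agrees with the EvenDist row of Table 3(a), and the (2, 18, 69) allocation stays within 0.24 sec both at full speed and with the Type-A PowerTier speeds. Because the speeds in Table 3 are rounded to two decimals, not every PowerTier row need pass such a check exactly.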
Evaluation in the Running Phase. Second, we
compare the performance of PowerTier with the baseline
design strategies in the running phase in the following
two scenarios:
5.2.1 Fixed Mean Response Time Constraint
In this experiment, we fix the mean response time
constraint as $\hat{R} = 200/\mu_1 = 0.24$ second. We conduct the
evaluation in both the open- and closed-queueing models.
In the open-queueing model, we vary $\lambda_1$ from 0 to
800 per second for each type of server.
The results are shown in Figure 5. In the closed-queueing
model, we vary $N_1$ from 0 to 200 for each
type of server. The results are shown in Figure 6.
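For the closed-queueing case, the mean response time for a given number of sessions can be sketched with exact Mean Value Analysis (Reiser and Lavenberg, 1980). The sketch below is not the authors' implementation. It assumes that each tier consists of $v_m$ identical processor-sharing servers with uniform load balancing, that the think time acts as a pure delay, and that service time scales as $1/r_m$; the function name is hypothetical.

```python
# Workload parameters quoted in the text (Liu et al., 2005).
KAPPA = (1.0, 0.998, 1.603)            # visit ratios
SERVICE_S = (0.0012, 0.0151, 0.0367)   # mean service times at full speed, in s
THINK_S = 0.035                        # think time, in s


def mva(n_sessions, servers, speeds=(1.0, 1.0, 1.0), think=THINK_S):
    """Return (mean response time in s, throughput in requests/s)."""
    # Total service demand per request at each tier (assumed 1/r scaling).
    demands = [k * s / r for k, s, r in zip(KAPPA, SERVICE_S, speeds)]
    queue = [0.0] * len(demands)       # mean number of requests at each tier
    resp, throughput = 0.0, 0.0
    for n in range(1, n_sessions + 1):
        # Arrival theorem: an arriving request sees the queue of the network
        # with n - 1 sessions; by symmetry each server holds queue / v of it.
        tier_resp = [d * (1.0 + q / v)
                     for d, q, v in zip(demands, queue, servers)]
        resp = sum(tier_resp)
        throughput = n / (think + resp)               # response time law
        queue = [throughput * t for t in tier_resp]   # Little's law per tier
    return resp, throughput


resp, x = mva(200, (59, 59, 59))
print(f"R = {resp:.3f} s, X = {x:.1f} requests/s")
```

Sweeping `n_sessions` from 1 to 200 with such a routine, and searching over `servers` and `speeds`, is one way the curves in a study like Figure 6 could be generated under these assumptions.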
The subfigures in the first, second, and third rows
show the optimal power consumption for all design
strategies, and the corresponding optimal $r_m$ and $v_m$
for PowerTier, respectively. The columns
correspond to the three server types. The
sawtooth shape of the processor speed curves is due to the
re-optimization that accounts for the integer
value of $v_m$ in PowerTier.
Both the open- and closed-queueing models reveal
similar phenomena. In all cases PowerTier
outperforms the others. The power consumption
under both PowerTier and EvenDist appears to grow
linearly with the traffic arrival rate (in the open-queueing
model) or the number of sessions (in the
closed-queueing model), but the slope for
EvenDist is larger. For PropDist, when the traffic is
light, it consumes a significant amount of power
because the proportional distribution must keep
at least one server at each tier. When the traffic
is heavy, PropDist approaches the optimal design
PowerTier.
5.2.2 Fixed Service Demand
In this experiment, we fix the service demand and
vary the response time constraint. We again conduct the
evaluation in both the open- and closed-queueing models. We
fix the service demand intensity at 25% of
the peak one, and vary $\hat{R}$ from $100/\mu_1 = 0.12$ second
to $400/\mu_1 = 0.48$ second.
In the open-queueing model, we choose the request
arrival rate as $\lambda_1 = 200$ per second. Figure 7
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
144
Table 3: Server allocation and processor speed assignment in the planning or upgrading phase.

(a) Open-Queueing Model
Strategy    Quantity               Type-A Server     Type-B Server        Type-C Server
PowerTier   $(v_1, v_2, v_3)$      (2, 18, 69)       (2, 18, 69)          (2, 19, 73)
            $(r_1, r_2, r_3)$      (0.7, 0.98, 1)    (0.74, 0.98, 0.99)   (0.84, 0.95, 0.95)
EvenDist    $(v_1, v_2, v_3)$      (65, 65, 65)      (65, 65, 65)         (65, 65, 65)
            $(r_1, r_2, r_3)$      (1, 1, 1)         (1, 1, 1)            (1, 1, 1)
PropDist    $(v_1, v_2, v_3)$      (2, 18, 69)       (2, 18, 69)          (2, 18, 69)
            $(r_1, r_2, r_3)$      (1, 1, 1)         (1, 1, 1)            (1, 1, 1)

(b) Closed-Queueing Model
Strategy    Quantity               Type-A Server     Type-B Server        Type-C Server
PowerTier   $(v_1, v_2, v_3)$      (2, 16, 63)       (2, 16, 63)          (2, 17, 67)
            $(r_1, r_2, r_3)$      (0.63, 1, 0.99)   (0.69, 1, 0.99)      (0.81, 0.95, 0.95)
EvenDist    $(v_1, v_2, v_3)$      (59, 59, 59)      (59, 59, 59)         (59, 59, 59)
            $(r_1, r_2, r_3)$      (1, 1, 1)         (1, 1, 1)            (1, 1, 1)
PropDist    $(v_1, v_2, v_3)$      (2, 16, 63)       (2, 16, 63)          (2, 16, 63)
            $(r_1, r_2, r_3)$      (1, 1, 1)         (1, 1, 1)            (1, 1, 1)
Figure 5: Comparison in the open-queueing model for $\hat{R} = 0.24$ second. Panels: (a) Type-A Servers, (b) Type-B Servers, (c) Type-C Servers. Each panel plots, versus $\lambda_1$: $E[P]$ (kW) for EvenDist, PropDist, and PowerTier; the optimal speeds $r_1^*, r_2^*, r_3^*$; and the optimal server counts $v_1^*, v_2^*, v_3^*$.
Power-savingDesigninServerFarmsforMulti-tierApplicationsunderResponseTimeConstraint
145
Figure 6: Comparison in the closed-queueing model for $\hat{R} = 0.24$ second. Panels: (a) Type-A Servers, (b) Type-B Servers, (c) Type-C Servers. Each panel plots, versus $N_1$: $E[P]$ (kW) for EvenDist, PropDist, and PowerTier; the optimal speeds $r_1^*, r_2^*, r_3^*$; and the optimal server counts $v_1^*, v_2^*, v_3^*$.
Figure 7: Comparison in the open-queueing model for $\lambda_1 = 200$ per second. Panels: (a) Type-A Servers, (b) Type-B Servers, (c) Type-C Servers. Each panel plots $E[P]$ (kW) versus $\hat{R}$ (sec) for EvenDist, PropDist, and PowerTier.
shows the power consumption comparison. In the
closed-queueing model, we choose the number of sessions
as $N_1 = 50$. The results are
shown in Figure 8.
In all design strategies, as $\hat{R}$ increases, the power
consumption decreases. When $\hat{R}$ is very small, i.e.,
the timing requirement is more stringent, more
servers must be activated, which consumes more
power. PowerTier always outperforms the others. For a
larger response time threshold $\hat{R}$, EvenDist outperforms
PropDist. For a smaller response time threshold $\hat{R}$,
PropDist outperforms EvenDist.
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
146
Figure 8: Comparison in the closed-queueing model for $N_1 = 50$. Panels: (a) Type-A Servers, (b) Type-B Servers, (c) Type-C Servers. Each panel plots $E[P]$ (kW) versus $\hat{R}$ (sec) for EvenDist, PropDist, and PowerTier.
6 CONCLUSIONS AND FUTURE WORK
In this paper, we have explored power-saving design
for server farms running multi-tier web applications
under a given SLA. We proposed an efficient
power-saving design strategy, PowerTier, which
jointly considers the DVS and DPM power-saving
techniques. Specifically, we have considered two application
models: the open-queueing model and the
closed-queueing model, for session-less and session-based
web applications respectively. With PowerTier,
we are able to optimally determine the number of
servers needed at each tier for the peak service demand
in the planning or upgrading phase. In the running phase,
for different service demands
and request visit ratios, we are able to optimally determine
the number of activated servers at
each tier and the processor speed for each activated
server. The simulation results showed that PowerTier
efficiently reduces the power consumption of
server farms while meeting the response time constraint
for multi-tier applications. Our simulation has
also shown that the optimal server allocations under
the open-queueing and closed-queueing models are
quite different due to the different application behaviors.
This paper focused on homogeneous servers at
each tier. One direction for future work is to extend the
current work to heterogeneous servers by taking into
consideration the different characteristics of servers in
the power consumption and response time analysis. However,
the optimal design is much more complex and
challenging in this case. This paper also targets a
stable workload. In future work, we would like to
exploit a control mechanism to activate and deactivate
servers dynamically. The control period is assumed to
be sufficiently long to pay off the energy and timing
overhead of the activation and deactivation of servers.
ACKNOWLEDGEMENTS
This work is sponsored in part by NSF CAREER
Grant No. CNS-0746906, Baden Wuerttemberg
MWK Juniorprofessoren-Programms, NSERC Discovery
Grant 341823, FQRNT grant 2010-NC-131844,
CFI Leaders Opportunity Fund 23090, and
National Science Foundation Awards 1116606 and
1117664.
REFERENCES
Bohrer, P., Elnozahy, E., Keller, T., Kistler, M., Lefurgy, C., McDowell, C., and Rajamony, R. (2002). The case for power management in web servers. Power Aware Computing, pages 261–289.
Cecchet, E., Marguerite, J., and Zwaenepoel, W. (2002). Performance and scalability of EJB applications. ACM Sigplan Notices, 37:246–261.
Diao, Y., Hellerstein, J., Parekh, S., Shaikh, H., Surendra, M., and Tantawi, A. (2006). Modeling differentiated services of multi-tier web applications. In IEEE International Symposium on Modeling, Analysis, and Simulation.
Gandhi, A., Harchol-Balter, M., Das, R., and Lefurgy, C. (2009). Optimal power allocation in server farms. In ACM SIGMETRICS.
Guerra, R., Leite, J., and Fohler, G. (2008). Attaining soft real-time constraint and energy-efficiency in web servers. In ACM symposium on Applied computing.
Heo, J., Henriksson, D., Liu, X., and Abdelzaher, T. (2007). Integrating adaptive components: An emerging challenge in performance-adaptive systems and a server farm case-study. In IEEE Real-Time Systems Symposium.
Jain, R. (1991). The art of computer systems performance analysis: techniques for experimental design, measurement, simulation, and modeling. John Wiley & Sons Inc.
Kamra, A., Misra, V., and Nahum, E. (2004). Yaksha: a self-tuning controller for managing the performance of 3-tiered web sites. In IEEE International Workshop on Quality of Service.
Kleinrock, L. (1976). Queueing Systems Volume II: Computer applications. Wiley Interscience.
Lazowska, E., Zahorjan, J., Graham, G., and Sevcik, K. (1984). Quantitative system performance: computer system analysis using queueing network models. Prentice-Hall, Inc. Upper Saddle River, NJ, USA.
Liu, X., Heo, J., and Sha, L. (2005). Modeling 3-tiered web applications. In IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, 2005, pages 307–310.
Liu, X., Heo, J., Sha, L., and Zhu, X. (2006). Adaptive control of multi-tiered web applications using queueing predictor. In IEEE/IFIP Network Operations and Management Symposium.
Liu, X., Heo, J., Sha, L., and Zhu, X. (2008). Queueing-model-based adaptive control of multi-tiered web applications. IEEE Transactions on Network and Service Management, 5(3):157–167.
Pacifici, G., Segmuller, W., Spreitzer, M., Steinder, M., Tantawi, A., and Youssef, A. (2005). Managing the response time for multi-tiered web applications. Technical Report RC 23651, IBM.
Raghavendra, R., Ranganathan, P., Talwar, V., Wang, Z., and Zhu, X. (2008). No “power” struggles: coordinated multi-level power management for the data center. In International conference on Architectural support for programming languages and operating systems.
Reiser, M. and Lavenberg, S. S. (1980). Mean-value analysis of closed multichain queuing networks. J. ACM, 27(2):313–322.
Rolia, J. and Sevcik, K. (1995). The method of layers. IEEE Transactions on Software Engineering, 21(8):689–700.
Rusu, C., Ferreira, A., Scordino, C., and Watson, A. (2006). Energy-efficient real-time heterogeneous server clusters. In IEEE Real-Time and Embedded Technology and Applications Symposium.
Sharma, V., Thomas, A., Abdelzaher, T. F., Skadron, K., and Lu, Z. (2003). Power-aware QoS management in web servers. In IEEE Real-Time Systems Symposium.
Urgaonkar, B., Pacifici, G., Shenoy, P., Spreitzer, M., and Tantawi, A. (2005). An analytical model for multi-tier internet services and its applications. ACM SIGMETRICS Performance Evaluation Review, 33(1):302.
U.S. Environmental Protection Agency (EPA) (2007). Report to congress on server and data center energy efficiency, public law 109-431.
Wang, L. and Lu, Y. (2008). Efficient power management of heterogeneous soft real-time clusters. In IEEE Real-Time Systems Symposium.
Wang, P., Qi, Y., Liu, X., Chen, Y., and Zhong, X. (2010). Power management in heterogeneous multi-tier web clusters. In International Conference on. IEEE.
Wierman, A., Andrew, L. L. H., and Tang, A. (2009). Power-aware speed scaling in processor sharing systems. In IEEE INFOCOM.
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
148