Quantifying Energy Usage in Data Centers through Instruction-count
Overhead
K. F. D. Rietveld and H. A. G. Wijshoff
LIACS, Leiden University, Niels Bohrweg 1, 2333CA, Leiden, The Netherlands
Keywords:
Energy Saving, Software Overhead, Instruction Reduction, Web Applications.
Abstract:
Energy usage of data centers is rising quickly and the electricity cost can no longer be neglected. Most
efforts to curb the increase in energy usage concentrate on improving hardware efficiency, by improving the
hardware itself or by turning to server virtualization. Yet, no serious effort is made to reduce electricity usage
by targeting the software running in data centers. To be able to effectively target software, a quantification
of software overhead is necessary. In this paper, we present a quantification of the sources of overhead in
applications that are nowadays ubiquitous in data centers: web applications. Experiments with three web
applications show that up to 90% of the instructions executed to generate web pages are non-essential, in
other words overhead, and can be eliminated. Elimination of these non-essential instructions results in an
approximately linear decrease in page generation time as well as significantly reduced energy usage. To
get the rising energy cost of data centers under control, it is essential to be able to quantify the sources of
this cost. In this paper we present an approach to quantify wasted energy based on a quantification of the
non-essential instructions that are executed.
1 INTRODUCTION
Applications that have seen a steady rise in ubiquity
in the last 10 years are web applications. Many data
centers host web applications that provide much
of the World-Wide Web's content. Web applications
are often built from several readily available compo-
nents to speed up development, such as web devel-
opment frameworks and database management sys-
tems (DBMSs). This modularity allows for rapid pro-
totyping, development and deployment. The World-
Wide Web has heavily benefited from this modular
approach.
However, it is well known that modularity does
not come for free. The use of rapid development
frameworks and other re-usable modules comes at the
cost of reduced performance. In a time where energy
consumption of data centers is becoming a greater and
greater concern, the call to break down these layers in
order to regain efficiency will only grow louder. To be
able to effectively target efforts to reduce energy con-
sumption of web applications, the sources of overhead
must be quantified. An instruction-level quantifica-
tion of overhead in web applications does not exist to
our knowledge. Therefore, in this paper, we present
an initial study to quantify the overhead induced by
this modular and layered approach. We argue that this
overhead is significant.
In the last decade, the increasing energy consump-
tion of data centers has already drawn the attention
of governments. The US Environment Protection
Agency (EPA) reported on the energy efficiency of
data centers in a 2007 report (U.S. Environmental
Protection Agency, 2007). Their study says energy
consumption of US data centers had doubled in the
period from late 2000 to 2006. Based on this trend,
they projected another doubling in electricity use for
the period from 2006 to 2011. This would mean a
quadrupling in electricity use by data centers in about
10 years' time.
Contrary to the EPA prediction, a 2011 report by
Jonathan G. Koomey (2011) claims that electricity use
by US data centers increased by 36% instead of doubling
from 2005 to 2010. This is significantly lower than
predicted by the EPA report. Koomey attributes this
to a lower server installed base than predicted earlier,
caused by the 2008 financial crisis and further im-
provements in server virtualization. Nevertheless, the
energy consumption is still increasing and while the
increase was only 36% in the US, according to the
Koomey report the increase amounted to 56% world-
wide.
The server installed base is an important metric,
because each installed server adds to the amount
of electricity used, not only due to the energy used
by the server itself but also because cooling capacity
has to be increased. The EPA report makes many rec-
ommendations to reduce electricity use in data cen-
ters. Many of these recommendations seek so-
lutions in hardware. Better power management can
make servers more energy efficient. The server in-
stalled base can be reduced by making better use of
virtualization to consolidate servers. Recommenda-
tions are also made to develop tools and techniques
to make software more efficient by making better use
of parallelization and to avoid excess code. However,
these recommendations are not as concrete as those
for hardware improvements.
An interesting observation in the EPA report is
that those responsible for selecting and purchasing
computer equipment are not the same as those respon-
sible for power and cooling infrastructure. The latter
typically pay the electricity bills. This leads to a split
incentive, because those who could buy more energy
efficient hardware have little incentive to do so. We
believe a similar split incentive exists with software
developers. Software developers are not in charge of
obtaining the necessary computing equipment in data
centers and also are not aware of the electricity bills.
Therefore, software developers have very little incen-
tive to further optimize the software to reduce energy
consumption. The fact that software optimization is a
laborious and costly task does not help.
Many statistics have been collected on data center
cost and performance. For example Computer Eco-
nomics, Inc. (2006) describes several metrics that are
used in data center management to optimize perfor-
mance and drive costs down. Many of these metrics
concentrate around MIPS or the number of servers
and processors. Mainframe size is expressed in MIPS,
and cost can be expressed in spending (on hardware,
software, personnel, etc.) per MIPS. Statistics on
Intel-based UNIX and Windows operating environ-
ments are expressed in number of servers and pro-
cessors. In such statistics a distinction is made be-
tween installed MIPS versus used MIPS. These num-
bers should not be too far apart, because server ca-
pacity staying idle is neither cost nor energy efficient.
However, as soon as a server is no longer idle, the
work performed is counted as used MIPS. For such
used MIPS, it is not investigated whether they
did useful work or were mainly overhead. In other
words, no distinction is made between essential MIPS
and non-essential MIPS, with non-essential MIPS
accruing from the use of rapid development
frameworks and software modularity.
In this paper, we address the quantification of en-
ergy usage through non-essential instruction (MIPS)
count overhead. We investigate three existing and
representative web applications and show that in this
manner accurate measurements of wasted energy us-
age can be obtained. We believe this will result in
an incentive for data centers to prioritize the need for
software overhead reduction instead of only improv-
ing hardware energy efficiency.
In Section 2 we give an overview of how non-
essential MIPS are determined in web applications.
Section 3 describes the experimental setup and the
web applications used. The sources of overhead are
quantified in Section 4. Section 5 links the results of
the experiments to the expected reduction in energy
consumption. Section 6 discusses work related to this
paper. Finally, Section 7 lists our conclusions.
2 DETERMINATION OF
NON-ESSENTIAL MIPS
We have defined a number of categories of over-
head, or non-essential MIPS, in web applications that
make use of the PHP language and a MySQL DBMS,
which we will quantify. The instruction count of
non-essential MIPS for each category is obtained by
counting instructions in the original program code
and in the program code with the overhead source
removed. The difference in instruction count is the
number of instructions for the corresponding over-
head source.
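As an illustration of this differencing methodology, the sketch below (ours, not part of the original measurement scripts) shows how the per-category instruction count and the overhead ratio used later in Section 5 can be derived from two INST_RETIRED readings; the function names are hypothetical.

<?php
// Hypothetical helpers illustrating the differencing methodology described above.

// Instructions attributable to one overhead category: the count measured for the
// original program minus the count measured with that overhead source removed.
function overhead_instructions($count_original, $count_without_source) {
    return $count_original - $count_without_source;
}

// Ratio R of non-essential to essential instructions (used in Section 5), where
// $count_essential is the count with all identified overhead sources removed.
function overhead_ratio($count_original, $count_essential) {
    return ($count_original - $count_essential) / $count_essential;
}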
The first category of overhead we will consider is
overhead caused by development frameworks for web
applications. Many applications are written with the
use of a framework to dramatically shorten develop-
ment time and increase code re-use. One of the appli-
cations we survey in this paper has been developed us-
ing the CakePHP framework. CakePHP simplifies
development of web applications by performing
most of the work of serving web requests and by
abstracting away low-level data access through a DBMS.
Frameworks like this affect performance by, for ex-
ample, performing unnecessary iterations over and
copies of result sets, or even by executing queries
whose results are not used at all.
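For illustration, the fragment below contrasts the kind of data access a CakePHP-based application performs with an equivalent hand-written query. The model name, conditions and table layout are hypothetical and only sketch the pattern; they are not taken from the discus code base.

<?php
// Framework-style access (CakePHP 1.x find()): the SQL statement is generated
// at runtime on every request, and the result set is reformatted into nested
// framework arrays before the application can use it.
$students = $this->Student->find('all', array(
    'conditions' => array('Student.year' => 2012),
    'limit'      => 20,
));

// Equivalent static query written by hand: no runtime query generation and no
// intermediate reformatting. This runtime generation is the overhead labelled
// "CakePHP gen. overhead" in Section 4.
$res = mysql_query("SELECT * FROM students WHERE year = 2012 LIMIT 20");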
The second category of overhead is the PHP lan-
guage itself. Due to PHP's nature as a scripting language,
there is start-up overhead from parsing and inter-
preting the source files, and the potential to thor-
oughly optimize the code in the way compiled
languages allow is lost.
As a third category, we consider the current mod-
ular design of DBMSs, which means that a sin-
gle DBMS instance can easily be used for a variety
of applications. A downside of this modularity is that
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
190
when an application has a request to retrieve data, this
request has to go out of process. Depending on the ar-
chitecture of the website, the request is either served
by a DBMS running on the same server as the web
server executing the PHP code or the request is sent
to a remote DBMS host. The overhead incurred
can include context-switching overhead, network and
protocol overhead, and data-copying overhead.
The fourth category of overhead is caused by
DBMS APIs implemented in shared libraries. The li-
brary sends SQL queries to the DBMS and retrieves
the results. As a result of this architecture, details of
the data accesses are shielded off, and overhead is
introduced by iterating result sets at least twice: once
when the DBMS builds up the result set and sends it
to the application program, and once in the application
program itself when the results are iterated.
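The conventional pattern looks roughly as follows (PHP 5.3 mysql_* API, as used by the surveyed applications; database, table and column names are illustrative): the DBMS first materialises and transmits the result set, after which the application iterates over it once more.

<?php
// Conventional shared-library access: the query is shipped to the DBMS process,
// which builds the result set and sends it back (first pass over the rows).
$link = mysql_connect('localhost', 'user', 'password');
mysql_select_db('rubbos', $link);
$res = mysql_query("SELECT title, date FROM stories ORDER BY date DESC LIMIT 10", $link);

// The application then iterates the returned result set a second time.
while ($row = mysql_fetch_assoc($res)) {
    echo $row['title'], "\n";
}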
3 EXPERIMENTAL SETUP
In this paper we quantify three web applications: dis-
cus, RUBBoS (ObjectWeb Consortium, n.d.-a) and
RUBiS (ObjectWeb Consortium, n.d.-b). Although it can be
questioned whether these three applications are repre-
sentative of typical web applications running in data
centers, we are convinced that at least these bench-
marks can be used to create an initial quantification
of the overhead. In our experiments, we have focused
on read-only workloads.
discus
discus is an in-house developed student administra-
tion system. The system is based on the CakePHP de-
velopment framework, version 1.2.0. To quantify
the overhead for full page generations, all forms of
caching (e.g. query or page caching) in CakePHP
have been disabled. We focus on two particular
page loads: the students index and the individual stu-
dent view. All experiments have been carried out
with three different data sets: small (25,000 tuples),
medium (0.5 million tuples) and large (10 million tu-
ples).
RUBBoS
The RUBBoS benchmark was developed by a col-
laboration between Rice University and INRIA and
models a typical bulletin board system or news web-
site with the possibility to post comments (ObjectWeb
Consortium, n.d.-a). We have used the PHP-version of
RUBBoS. To quantify the overhead in different typi-
cal pages served by RUBBoS, we looked at the page
generation times of the following pages: StoriesOfT-
heDay, BrowseStoriesByCategory, ViewStory, View-
Comment. We believe these pages are typical for the
workload that characterizes a news story website. As
data set we have used the data set that is made avail-
able at the RUBBoS website.
RUBiS
The RUBiS benchmark (ObjectWeb Consortium,
n.d.-b) models a simple auction website and is sim-
ilar to the RUBBoS benchmark as it has been devel-
oped by the same collaboration. For our tests with
RUBiS we have used the same methodology as with
RUBBoS. The PHP-version of RUBiS was used and
the following pages were picked: ViewBidHistory,
ViewItem, ViewUserInfo and SearchItemsByCategory.
The data set that is available from the RUBiS website
has been used as data set in our experiments.
The experiments have been carried out on an Intel
Core 2 Quad CPU (Q9450) clocked at 2.66 GHz with
4 GB of RAM. The software installation consists
of Ubuntu 10.04.3 LTS (64-bit), which comes with
Apache 2.2.14, PHP 5.3.2 and MySQL 5.1.41. No notable
changes were made to the configuration
files for Apache and MySQL, except that in MySQL
we disabled query caching to be able to consis-
tently quantify the cost of executing the necessary
queries. In the experiments we obtained two met-
rics. Firstly, we obtained the page generation time,
which we define as the time from the moment the
first useful line of the PHP code starts execution
until the moment the last line of code has executed. Sec-
ondly, we acquired the number of instructions executed
by the processor to generate the page by reading out
the INST_RETIRED hardware performance counter of
the CPU.
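As a rough sketch of how the first metric can be obtained, the timing can be bracketed in the page script itself as below (our illustration); the instruction counts were read from the hardware performance counter outside of PHP and are not shown here.

<?php
// Page generation time: from the first useful line of the PHP code until the
// last line of code has executed, as defined above.
$t_start = microtime(true);          // first useful line of the page script

// ... query execution, page assembly and output happen here ...

$t_page = microtime(true) - $t_start;
error_log(sprintf("page generated in %.6f s", $t_page));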
4 QUANTIFICATION
In this section, we will quantify the overhead for each
of the categories described in Section 2. To quantify
the overhead induced by the PHP programming lan-
guage we have compared the performance of the three
code bases to the code bases compiled to native ex-
ecutables using the HipHop for PHP project (HipHop
for PHP Project, n.d.). This project is developed
by Facebook and is a compiler that translates PHP
source code to C++ source code, which is linked
against a HipHop runtime that contains implementa-
tions of PHP built-in functions and data types. Both
the Apache HTTP server and the PHP module are re-
placed by the HipHop-generated executable.

Figure 1: Page generation time in seconds for different pages and data set sizes from discus. Each category (e.g. "Index 20/page") contains up to 10 different page loads over which the average over 5 runs is taken.

Figure 2: Page generation time in microseconds for different pages from the RUBBoS benchmark. Times displayed are averages of 10 runs on a warmed-up server.

In Figures 1, 2 and 3 the difference in page generation time
between Apache and the HipHop-compiled executa-
bles is shown as "PHP overhead". Figures 4, 5 and 6
depict the number of instructions executed to gener-
ate the page. For the discus code base we observe that
roughly 1/3 to 2/3 of the page generation time can
be attributed to PHP overhead. A similar overhead is
found in the figures depicting the number of instruc-
tions.
The RUBBoS and RUBiS benchmarks do not
show an improvement in page generation time when
the HipHop-version is tested. However, the instruc-
tion count does noticeably decrease. Compared to
discus, the RUBBoS and RUBiS source code is quite
straightforward and it is possible that aggressive com-
piler optimizations do not have much effect. Although
fewer instructions are retired, it is plausible that instead
more time is spent waiting for (network) I/O.
To quantify the overhead of development frame-
works, we have focused on one of the main sources
of overhead in the CakePHP framework, which is the
data access interface or in particular the automatic
generation of SQL queries.

Figure 3: Page generation time in microseconds for different pages from the RUBiS benchmark. Times displayed are averages of 10 runs on a warmed-up server.

The difference between the code bases with and without
automatic query generation is displayed in Figures 1 and 4 as "CakePHP
gen. overhead”. For most cases, the overhead of au-
tomatic query generation ranges from 1/12 to 1/5 of
the total page generation time. This is significant if
one considers that the total page generation time is in
the order of seconds and that there is no technical
obstacle to making the queries static after the develop-
ment of the application. CakePHP employs caching to
get around this overhead (we disabled this caching to
expose the overhead). However, caching does not help
if similar queries are often executed with different pa-
rameters. We have not done similar experiments for
RUBBoS and RUBiS, because these code bases use
static queries instead of a rapid development frame-
work.
To quantify the overhead of the common DBMS
architecture, we made the application code compute
the results of the SQL queries instead of sending
these queries to a separate DBMS process. This was done
by modifying the HipHop-compiled discus, RUBBoS
and RUBiS code bases, replacing calls to the MySQL
API with code that performs the requested query.
Figure 4: Instruction count in units of 10^8 instructions for different pages and data set sizes from discus.

Figure 5: Instruction count in units of 10^5 instructions for different pages from the RUBBoS benchmark.

For example, calls to the function mysql_query were
replaced with code that performs the re-
quested query and fills an array with the result tu-
ples. Algorithmically speaking, the query is performed
in exactly the same manner as it would have been
performed by MySQL. Using the Embedded MySQL
Server Library (MySQL Project, n.d.), it is also pos-
sible to perform SQL queries within the client process
without contacting a remote DBMS. This approach
does not address the DBMS API overhead as de-
scribed in Section 2, because the generic MySQL API
function calls remain in use, shielding off the data ac-
cess from the application code. Furthermore, our ob-
jective is to obtain the minimum number of instructions
required for processing queries, to emphasize the cost
of using a generic, modular DBMS.
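The following fragment sketches what such a replacement can look like for a simple selection query; the table layout and variable names are hypothetical, but the structure mirrors the rewrite described above: the query is computed inside the application process and an array is filled with the result tuples.

<?php
// Before the rewrite (shipped to the MySQL server process):
//   $res = mysql_query("SELECT * FROM comments WHERE story_id = " . (int)$id);

// After the rewrite: the query is evaluated in-process, algorithmically in the
// same way MySQL would evaluate it (here: a sequential scan with a filter).
$result = array();
foreach ($comments_table as $row) {      // scan the stored rows
    if ($row['story_id'] == $id) {       // WHERE story_id = $id
        $result[] = $row;                // fill the array with result tuples
    }
}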
The time results are displayed in Figures 1, 2 and
3. Compared to the original execution time of the
Apache-version of the discus application, the MySQL
overhead accounts for roughly 8% to 40% of the ex-
ecution time. If we compare the MySQL overhead to
the execution time of the original HipHop-compiled
version however, the MySQL overhead accounts for
roughly 20% to 60% of the execution time.

Figure 6: Instruction count in units of 10^5 instructions for different pages from the RUBiS benchmark.

These larger numbers are also reflected in the RUBBoS and
RUBiS results, where the MySQL overhead is esti-
mated to be around 72% to 90% of the page genera-
tion time. Put differently, elimination of the MySQL
server improved the page generation times for the
RUBBoS and RUBiS applications by about a factor of
10. We again see similarly sized reductions in instruc-
tion count in Figures 4, 5 and 6.
Once the SQL queries have been expanded to code
inside the application code, several sources of over-
head of DBMS APIs become apparent. It is now pos-
sible to more tightly integrate the application code
with the query computation code. In the majority
of the cases the loops that iterate over the result set
can be merged into the loops that perform the actual
queries and thus build the result sets, saving an itera-
tion of the result set.
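A minimal sketch of this loop merging, continuing the earlier in-process example (hypothetical table and column names): the per-row application work is folded into the selection loop, so no intermediate result array is built and re-iterated.

<?php
// Merged query and processing loop: the application's per-row work happens
// directly inside the loop that performs the selection, saving one full
// iteration over (and one copy of) the result set.
$html = '';
foreach ($comments_table as $row) {
    if ($row['story_id'] == $id) {                                   // selection
        $html .= '<p>' . htmlspecialchars($row['text']) . '</p>';    // processing
    }
}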
In Figure 1 we observe that it is possible to achieve
at least a factor 2 speedup in most cases compared to
the HipHop-version of discus. For the Index 100/page
with the large data set case, a factor 10 speedup is
obtained. Although the overhead of the DBMS API
does not appear that significant compared to the to-
tal execution time, it does prove to be of significance
when put in the perspective of the execution time of
QuantifyingEnergyUsageinDataCentersthroughInstruction-countOverhead
193
the application when the usage of MySQL has been
removed. The same reasoning holds true when the
results of the RUBBoS and RUBiS applications are
analyzed, displayed in Figures 2 and 3 respectively.
When compared to the total execution time of the
derivative of the application with the MySQL over-
head removed, removing the SQL API overhead re-
sults in speedups close to a factor of 2 in half of the
cases.
5 ENERGY CONSUMPTION
From the results collected we can deduce that in gen-
eral the time spent per instruction stays in the same or-
der of magnitude both when non-essential (overhead)
instructions are removed as well as when the prob-
lem size is increased. Overall, there is an approx-
imately linear relation between the page generation
time and the amount of instructions executed to gen-
erate a page.
For the applications that have been surveyed, re-
moval of non-essential instructions has an immediate
approximately linear impact on performance.
Table 1: Displayed is the ratio of non-essential instructions executed for each essential instruction, for the discus experiments.

                  Small    Medium   Large
20/page           14.92    23.26    41.67
40/page           22.72    30.30    45.45
100/page          43.48    38.46    47.62
Indiv. View        4.69     6.17    18.52
Table 2: Displayed is the ratio of non-essential instructions executed for each essential instruction, for the RUBBoS and RUBiS experiments.

                              Ratio
RUBBoS, BrowseStByCat.         1.56
RUBBoS, ViewComments.         14.29
RUBBoS, ViewStory.            21.28
RUBBoS, StoriesOfTheDay.      25.64
RUBiS, SearchItByCat.          4.4
RUBiS, ViewBidHistory         11.83
RUBiS, ViewItem-qty-0         17.54
RUBiS, ViewItem-qty-5         80.64
RUBiS, ViewUserInfo           32.26
Tables 1 and 2 show the number of non-essential
instructions that are executed for each essential in-
struction for the discus and RUBBoS/RUBiS exper-
iments respectively. The overhead varies greatly from
page to page. In the majority of cases however, the
overhead is significant: more than 20 non-essential in-
structions are executed for each essential instruction,
i.e., an overhead of a factor of 20.
In the discus experiments we observe a trend that
as the data set size increases, the overhead increases
as well. Put differently, the overhead expands when
the same code is run on a larger data set. It is very well
possible that this increase in overhead is caused by the
data reformatting done in the CakePHP framework,
when the result set received from the DBMS is for-
matted into an array that can be used for further pro-
cessing by CakePHP framework objects. The fact that
the overhead of the MySQL and SQL API categories
increases linearly with the data size and thus stays
constant per row also indicates that the cause of this
increase in overhead is to be sought in the CakePHP
framework. From this result it can be expected that
in this case the reduction in non-essential instructions
will be larger as the data set size increases.
Table 3: The Instruction Time Delay (ITD) is shown, which
is the ratio of the average time per essential instruction
against the average time per non-essential instruction. The
latter is the weighted average of the time per instruction of
the different overhead categories.
ITD
RUBBoS, BrowseStByCat. 0.42
RUBBoS, ViewComments. 0.84
RUBBoS, ViewStory. 0.90
RUBBoS, StoriesOfTheDay. 1.55
RUBiS, SearchItByCat. 0.91
RUBiS, ViewBidHistory 0.91
RUBiS, ViewItem-qty-0 1.16
RUBiS, ViewItem-qty-5 3.63
RUBiS, ViewUserInfo 1.43
discus 20/page, Medium 1.42
discus 40/page, Medium 1.30
discus 100/page, Medium 1.15
discus Indiv. View, Medium 1.03
We showed a significant reduction, by a factor of 10,
in the number of instructions to be executed. However, if we look at
the ratio of time per essential instruction versus the
average time per non-essential instruction (instruc-
tion time delay) for the web applications we have
surveyed, depicted in Table 3, we observe that for
most cases the average time per essential instruction
is larger than the average time per non-essential in-
struction. A possible explanation for this is that the
majority of essential instructions are expected to be
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
194
Table 4: The Instruction Time Delay (ITD) is shown, similar to Table 3, but for discus experiments with varying data set sizes.

                  Small    Medium   Large
20/page           1.24     1.42     1.50
40/page           1.18     1.30     1.45
100/page          0.55     1.15     1.46
Indiv. View       0.91     1.03     1.25
carrying out memory access or disk I/O. As a result,
when estimating energy reduction this has to be taken
into account, see below. In case of RUBBoS and RU-
BiS, this trend is not always visible. Possibly, this
is because the time spent by the essential instructions
is quite small, due to small transfers of data, so that
it does not outweigh the time spent by the non-
essential instructions.
Table 4 displays the instruction time delay for dis-
cus experiments with varying data set sizes. For all
cases, the time delay increases as the data set size in-
creases. In other words, as non-essential instructions are elimi-
nated, the essential instructions responsible for fetch-
ing the data from memory will take more time to ex-
ecute. The memory wall becomes more exposed as
overhead is removed. Recall from Table 1 that the
number of overhead instructions increases with the
data set size; however, the instruction time delay in-
creases as well, and this counteracts the increase in
overhead.
When we consider the majority of the essential in-
structions to be carrying out memory access or disk
I/O, we have to consider the cost of such data accesses
in our conservative estimates of impact on energy us-
age. We use the component peak powers and ac-
tual peak power, which is 60% of the advertised peak
power, of a typical server described by Fan, Weber,
and Barroso (2007). In the very worst case, essential
instructions are the most expensive in energy usage
and non-essential instructions the cheapest; the removal of over-
head would then only remove the cheap instructions.
If we take the typical machine’s actual peak power
usage of just the CPU and memory for the essential
instructions and the idle power usage (estimated at
45% of actual peak power) for the CPU and mem-
ory for the non-essential instructions, we obtain a ra-
tio of approximately 69.6W : 31.3W or a factor 2.22.
Let dP be this factor 2.22, dT be the average time
per essential instruction versus the average time per
non-essential instruction (or time delay), reported in
Table 4, and R be the ratio of non-essential versus es-
sential instructions, reported in Table 1. We can then
use the formula
(1 - (dT × dP) / (R + 1)) × 100%
to get a worst-case estimate of the energy sav-
ing. Completing this formula for dP = 2.22,dT =
1.45,R = 45.45 gives an estimated energy saving of
93.1%. A little more realistically, we can estimate the en-
ergy saving considering the entire machine by taking
the idle power usage of the entire machine for non-
essential instructions and a peak energy usage of 90%
of actual peak power usage of the entire machine for
the essential instructions. This results in a dP of 2.0.
When we complete the formula with this dP, we get
a slightly higher energy saving estimate of 93.8%.
Our expected energy saving is based on the obser-
vation that essential instructions are more expensive
than non-essential instructions, because essential in-
structions are more frequently instructions that access
data in memory and disk I/O. Benchmarks with typ-
ical servers show that power usage of such servers is
typically between 60% and 80% of actual peak power
(Fan et al., 2007). If we consider essential instructions
to use 80% of actual peak power and non-essential in-
structions to use 60%, we obtain a dP of 1.33. With
the same parameters for dT and R, we estimate an
energy saving of 95.8%.
We can compare the obtained numbers with the
estimated energy saving if we do not make a distinc-
tion in energy used per instruction for essential and
non-essential instructions. This is reflected in the fol-
lowing formula:
(1 - dT / (R + 1)) × 100%
Completing for dT = 1.45, R = 45.45 results in an es-
timated energy reduction of 96.9%. This is higher
than the expected saving, because we do not consider
essential instructions to be more expensive.
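To make the arithmetic explicit, the short sketch below (ours) evaluates the estimate for the discus 40/page, large data set case; passing dP = 1 reproduces the formula that makes no distinction in energy per instruction.

<?php
// Estimated energy saving: (1 - (dT * dP) / (R + 1)) * 100%.
// dP: relative energy per essential vs. non-essential instruction,
// dT: instruction time delay (Table 4), R: overhead ratio (Table 1).
function estimated_energy_saving($dP, $dT, $R) {
    return (1.0 - ($dT * $dP) / ($R + 1.0)) * 100.0;
}

// discus 40/page, large data set: dT = 1.45, R = 45.45.
printf("%.1f%%\n", estimated_energy_saving(2.22, 1.45, 45.45)); // worst case, ~93.1%
printf("%.1f%%\n", estimated_energy_saving(2.00, 1.45, 45.45)); // whole machine, ~93.8%
printf("%.1f%%\n", estimated_energy_saving(1.33, 1.45, 45.45)); // expected, ~95.8%
printf("%.1f%%\n", estimated_energy_saving(1.00, 1.45, 45.45)); // no distinction, ~96.9%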
Table 5 lists the estimated energy savings for the
various discus experiments. We note that the energy
savings increase along with an increase in data set
size. Even though the instruction time delay is in-
creasing with the size of the data sets, we still observe
an increase in energy saving. We conclude that the
increase in overhead instructions more than coun-
teracts the increase in instruction time delay. This is due to the
significant size of the overhead ratios. Consider for
example discus Index 20/page, where the number of
QuantifyingEnergyUsageinDataCentersthroughInstruction-countOverhead
195
Table 5: Estimated Energy Saving using the expected dP = 1.33, dT obtained from Table 4 and R from Table 1.

                  Small    Medium   Large
20/page           89.6%    92.2%    95.3%
40/page           93.4%    94.5%    95.8%
100/page          98.4%    96.1%    96.0%
Indiv. View       78.7%    80.9%    91.5%
Table 6: Estimated Energy Saving using the expected dP =
1.33, dT obtained from Table 3. Values for R not shown
due to space constraints.
Estimated
Energy Saving
RUBBoS, BrowseStByCat. 78.2%
RUBBoS, ViewComments. 95.0%
RUBBoS, ViewStory. 94.6%
RUBBoS, StoriesOfTheDay. 92.3%
RUBiS, SearchItByCat. 77.6%
RUBiS, ViewBidHistory 90.6%
RUBiS, ViewItem-qty-0 91.7%
RUBiS, ViewItem-qty-5 94.1%
RUBiS, ViewUserInfo 94.3%
overhead instructions doubles for each expansion in
data set size (Table 1), while the instruction time delay only
increases by a factor of 1.05 to 1.15 (Table 4). Results
obtained using the same formulas for the RUBBoS
and RUBiS experiments are shown in Table 6.
In conclusion, the estimated lower bound on en-
ergy saving we have found in the experiments per-
formed with discus amounts to 71% for the Individual
View experiment with the small data set. This result
correlates well with the page generation times dis-
played in Figure 1, where approximately 3/4 of the
page generation time is eliminated when the overhead
is removed. Similar results are found in the results of
the RUBBoS and RUBiS benchmarks, with a lower
bound of 77.6%. We believe this is significant, es-
pecially when considering that the CPU and memory
consume almost half of the energy used by the entire
server.
6 RELATED WORK
Research has been done into the performance char-
acteristics of collaborative Web and Web 2.0 applica-
tions that emerged in the last decade. Stewart, Lev-
enti, and Shen (2008) examined a collaborative Web
application, where most content is generated by the
application’s users, and show that there is a funda-
mental difference between collaborative applications
and real-world benchmarks. Nagpurkar et al. (2008)
compare Web 2.0 workloads to traditional workloads.
Due to the use of JavaScript and AJAX at the receiv-
ing end, many more small requests for data are made.
The research shows that Web 2.0 applications have
more “data-centric behavior” that results in higher
HTTP request rates and more data cache misses. The
quantification we presented in this paper does not at-
tempt to characterize workloads; instead, it presents a
low-level quantification of where time is spent in pro-
gram codes executing web applications.
Altman, Arnold, Fink, and Mitchell (2010) in-
troduced a tool, WAIT, to look for bottlenecks in
enterprise-class, multi-tier deployments of Web appli-
cations. Their tool does not look for hot spots in an
application’s profile, but rather analyzes the causes of
idle time. It is argued that idle time in multi-tier sys-
tems indicates program code blocking on an operation
to be completed.
For our quantification we have used the HipHop
for PHP project (HipHop for PHP Project, n.d.) to
translate PHP source code to native executables. Sim-
ilarly, Phalanger compiles PHP source code to the Mi-
crosoft Intermediate Language (MSIL), which is the
bytecode used by the .NET platform. The effect of
PHP performance on web applications has thus been
noted in the past and we believe the existence of these
projects is an indication that PHP overhead is a valid
concern for large PHP code bases.
We have argued that development frameworks for
web applications bring about overhead by, for exam-
ple, the generality of such frameworks and the ab-
stract data access interface. Xu, Arnold, Mitchell,
Rountev, and Sevitsky (2009) argue that large-scale
Java applications using layers of third-party frame-
works suffer from excessive inefficiencies that can no
longer be optimized by (JIT) compilers. A main cause
of such inefficiencies is the creation of and copying
of data between many temporary objects necessary
to perform simple method calls. One reason why
this problem is not easily targeted is the absence of
clear hotspots. Xu et al. (2009) introduce a technique
called “copy profiling” that can generate copy graphs
during program execution to expose areas of common
causes of what the authors refer to as “bloat”.
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
196
One of the goals of the DBMS overhead elimina-
tion is to move both the query loop as well as the re-
sult set processing code into the same address space,
so that they can be optimized together as we have
done to estimate the DBMS API overhead. Some
approaches to optimize both the database codes and
the applications codes do already exist, for example,
the work on holistic transformations for web applica-
tions proposed by Manjhi, Garrod, Maggs, Mowry,
and Tomasic (2009) and Garrod, Manjhi, Maggs,
Mowry, and Tomasic (2008). These papers argue that
tracking the relationship between application data and
database data is a tool that might yield advancements.
Note that by eliminating the DBMS overhead we do
exploit this relationship, but we do so by eliminating
the relationship through the integration of application and DBMS
codes rather than by tracking it. A similar
approach for holistic transformations for database ap-
plications written in Java is described by Chavan, Gu-
ravannavar, Ramachandra, and Sudarshan (2011).
7 CONCLUSIONS
In this paper we described and quantified several
sources of overhead in three web applications. This
quantification indicates that there is a tremendous po-
tential for optimization of web applications. Of the
total number of instructions executed to generate the
web pages in the investigated applications, close to
90% of the instructions can be eliminated; these are
non-essential instructions. Removal of non-essential
instructions has an approximately linear relationship
with the decrease in page generation time. This re-
sults in faster response times as well as significantly
reduced energy usage. We have seen a lower bound
on energy savings of approximately 70% for all ex-
periments performed in this study.
Considering the simplicity of the RUBBoS and
RUBiS code bases, we believe that the estimates
shown for these applications are a lower bound on
the performance that can be gained. Still, the results
are impressive. The more complex discus applica-
tion shows that performance increases between one
and two orders of magnitude are a possibility, solely
by reducing the number of non-essential instructions
that are executed.
For many web applications intensive optimization
of the code is not performed. At first glance, it
appears more cost effective to deploy another set of
servers than to have expensive engineers optimize the
code. The frequent use of development frameworks
contributes to this. Note that we have explicitly de-
fined our objective as optimizing this kind of applica-
tion, which is in widespread use, instead of focusing on
the already heavily optimized code bases of, for exam-
ple, Google and Facebook.
In the introduction we described that energy usage
of data centers is increasing quickly and is becoming
a larger problem on a month-to-month basis. Asso-
ciated with rising energy usage are rising electricity
bills for the operation of data centers. This is caused
not only by the increase in computing power, but also
by the increase in cooling capacity.
At this moment, no serious attention is paid to de-
creasing electricity usage by turning to software op-
timization. Instead, the focus is on making computer
hardware more energy efficient and to decrease the
server installed base by virtualization. We argue that
focusing solely on hardware issues is not enough to
keep the costs of data center operations under control
in the near future. Therefore, it is important that as
many non-essential MIPS as possible are eliminated
from the software code bases running in data centers.
To accomplish this, a quantification of where the non-
essential MIPS are located is a prerequisite. This pa-
per provides one such quantification.
REFERENCES
Altman, E., Arnold, M., Fink, S., & Mitchell, N. (2010).
Performance analysis of idle programs. In Proceed-
ings of the ACM international conference on object ori-
ented programming systems languages and applica-
tions (pp. 739–753). New York, NY, USA: ACM.
Computer Economics, Inc. (2006, April). Best Prac-
tices and Benchmarks in the Data Center. http://
www.computereconomics.com/article.cfm?id=1116.
HipHop for PHP Project. (n.d.). HipHop for PHP. Re-
trieved July 2012, from https://github.com/facebook/
hiphop-php/wiki/
MySQL Project. (n.d.). libmysqld, the Embed-
ded MySQL Server Library. Retrieved July
2012, from http://dev.mysql.com/doc/refman/5.6/en/
libmysqld.html
ObjectWeb Consortium. (n.d.-a). JMOB - RUBBoS Bench-
mark. Retrieved July 2012, from http://jmob.ow2.org/
rubbos.html
ObjectWeb Consortium. (n.d.-b). RUBiS - Home Page. Re-
trieved July 2012, from http://rubis.ow2.org/
Chavan, M., Guravannavar, R., Ramachandra, K., & Sudar-
shan, S. (2011). DBridge: A program rewrite tool
for set-oriented query execution. Data Engineering,
International Conference on, 0, 1284-1287.
Fan, X., Weber, W.-D., & Barroso, L. A. (2007, June).
Power provisioning for a warehouse-sized computer.
SIGARCH Comput. Archit. News, 35, 13–23.
Garrod, C., Manjhi, A., Maggs, B. M., Mowry, T. C., &
Tomasic, A. (2008, October). Holistic Application
Analysis for Update-Independence. In Proceedings
QuantifyingEnergyUsageinDataCentersthroughInstruction-countOverhead
197
of the Second IEEE Workshop on Hot Topics in Web
Systems and Technologies (HotWeb).
Jonathan G. Koomey. (2011, Aug). Growth In Data Center
Electricity Use 2005 To 2010.
Manjhi, A., Garrod, C., Maggs, B. M., Mowry, T. C., &
Tomasic, A. (2009). Holistic Query Transformations
for Dynamic Web Applications. In Proceedings of
the 2009 IEEE International Conference on Data En-
gineering (pp. 1175–1178). Washington, DC, USA:
IEEE Computer Society.
Nagpurkar, P., Horn, W., Gopalakrishnan, U., Dubey, N.,
Jann, J., & Pattnaik, P. (2008). Workload characteri-
zation of selected JEE-based Web 2.0 applications. In
IISWC (p. 109-118).
Stewart, C., Leventi, M., & Shen, K. (2008). Empirical
examination of a collaborative web application. In
IISWC (p. 90-96).
U.S. Environmental Protection Agency. (2007, Aug). Re-
port to Congress on Server and Data Center Energy
Efficiency.
Xu, G., Arnold, M., Mitchell, N., Rountev, A., & Sevitsky,
G. (2009). Go with the flow: profiling copies to find
runtime bloat. In Proceedings of the 2009 ACM SIG-
PLAN conference on Programming language design
and implementation (pp. 419–430). New York, NY,
USA: ACM.
SMARTGREENS2013-2ndInternationalConferenceonSmartGridsandGreenITSystems
198