DYNAMIC RESOURCE PROVISIONING FOR SELF-ADAPTIVE HETEROGENEOUS WORKLOADS IN SMP HOSTING PLATFORMS

Ramon Nou, Ferran Julià, Jordi Guitart and Jordi Torres

Barcelona Supercomputing Center (BSC), Technical University of Catalonia (UPC), Barcelona, Spain

Keywords:

Autonomic Computing, resource provisioning, heterogeneous workloads.

Abstract:

We introduce a novel approach that allows heterogeneous applications to run together on a shared hosting platform, dynamically sharing the platform’s resources. The proposed approach has been validated by a proof-of-concept prototype that uses a global processor manager to distribute the platform’s processors among two (or more) heterogeneous applications, in our case a Tomcat application server and the Globus grid middleware. Our evaluation demonstrates the benefit of including bidirectional communication between applications and the OS for efficiently managing the resources and preventing the degradation of an application’s performance, especially when the hosting platform is fully overloaded. For the sake of simplicity, we have modified the applications so that they communicate with the resource manager, although other techniques can be applied to avoid these modifications. Running different applications on a shared platform and being able to assign priorities among them provides important benefits.

1 INTRODUCTION

The consolidation of distributed and grid computing has been accompanied by the appearance of new computing models oriented to these environments. One of them is the utility computing model, in which applications run on hosting platforms that rent their resources to them. Application owners pay for platform resources and, in return, the application is provided with guarantees of resource availability and quality of service (QoS), which can be expressed in the form of a service level agreement (SLA). The hosting platform is responsible for providing sufficient resources to each application to meet its workload, or at least to satisfy the agreed QoS. These hosting platforms must be able to provide resources to a heterogeneous set of applications, ranging from web applications (e.g. an application server attending a transactional workload) to traditional scientific computations in the grid community.

The traditional approach used by hosting platforms to provision resources for heterogeneous applications is to dedicate a separate set of cluster nodes to each application (dedicated model) (Appleby et al., 2001). In this model, resource allocation is performed with the granularity of a full cluster node, and the provisioning technique must determine how many nodes to allocate to each application. However, economic reasons of space, power, cooling and cost can encourage the use of the shared model (Chandra et al., 2003a), in which node resources can be shared among multiple applications and the provisioning technique needs to determine how to partition resources on each node among competing applications.

We introduce a novel approach to allow heterogeneous applications to run together on a shared hosting platform, dynamically sharing the platform’s resources (we focus on CPUs in this proof-of-concept prototype) while maintaining good performance. This paper extends the work performed in (Guitart et al., 2006). That paper proposed a global strategy for preventing the overloading of web applications and efficiently utilizing a platform’s resources in a shared hosting platform running homogeneous web applications. The proposed strategy exploits the benefits of dynamically reallocating resources among hosted applications based on the variations in their workloads. These benefits have been described in recent studies (Appleby et al., 2001; Chandra et al., 2003a; Chandra et al., 2003b). The goal is to meet the applications’ requirements on demand and adapt to their changing resource needs. This requires close collaboration between dynamic resource provisioning and admission control mechanisms.

The paper is structured as follows: Section 2 presents our prototype and how it works. Section 3 describes the experimental environment used in our evaluation. Section 4 evaluates the results we obtained. Section 5 reviews related work in the area of dynamic resource provisioning for variable workloads. Finally, Section 6 presents our conclusions and future work.

Nou R., Julià F., Guitart J. and Torres J. (2007).
DYNAMIC RESOURCE PROVISIONING FOR SELF-ADAPTIVE HETEROGENEOUS WORKLOADS IN SMP HOSTING PLATFORMS.
In Proceedings of the Second International Conference on e-Business, pages 39-44
DOI: 10.5220/0002110900390044
SciTePress

2 RESOURCE PROVISIONING STRATEGY

In this section, we summarize the basic guidelines of our resource provisioning strategy. Our proposal is based on a global processor manager, called eDragon CPU Manager (ECM), responsible for periodically distributing (at a configurable interval) the available processors among the different applications running on a hosting platform. Further details can be found in (Guitart et al., 2006). We manage only processors because they are the most limited resource in the scenario we are studying.

The ECM cooperates with the applications, using bidirectional communication, to manage the processors efficiently and prevent the applications from becoming overloaded. On one side, the applications periodically request from the ECM the number of processors needed to handle their incoming load while avoiding degradation in the QoS. We define the number of processors requested by an application i as $R_i$. On the other side, the applications can ask the ECM at any time for their current processor assignments. We define the number of processors allocated to application i as $A_i$. With this information, the applications can adapt their behavior to the allocated processors, thus avoiding the degradation of their QoS. Figure 1 shows a diagram describing our resource provisioning strategy.
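The two directions of this exchange can be illustrated with a small in-memory model. This is only a sketch under our own assumptions: the class `CpuManagerModel` and the methods `request_cpus`, `allocated_cpus` and `distribute` are hypothetical names introduced here, not the ECM interface, and the placeholder distribution (grant in arrival order) is not the policy used by the ECM.

```python
from dataclasses import dataclass, field


@dataclass
class CpuManagerModel:
    """In-memory stand-in for a processor manager (hypothetical interface)."""
    total_cpus: int
    requests: dict = field(default_factory=dict)     # application -> R_i
    allocations: dict = field(default_factory=dict)  # application -> A_i

    def request_cpus(self, app: str, needed: int) -> None:
        """Application -> manager: record R_i, the processors the application asks for."""
        if needed < 0:
            raise ValueError("a request cannot be negative")
        self.requests[app] = needed

    def allocated_cpus(self, app: str) -> int:
        """Manager -> application: report A_i (0 until a distribution has run)."""
        return self.allocations.get(app, 0)

    def distribute(self) -> None:
        """Placeholder policy: grant requests in arrival order until CPUs run out."""
        free = self.total_cpus
        for app, needed in self.requests.items():
            granted = min(needed, free)
            self.allocations[app] = granted
            free -= granted


ecm = CpuManagerModel(total_cpus=4)
ecm.request_cpus("tomcat", 3)   # R_tomcat = 3
ecm.request_cpus("globus", 3)   # R_globus = 3
ecm.distribute()
print(ecm.allocated_cpus("tomcat"), ecm.allocated_cpus("globus"))
```

The point of the sketch is the separation of the two calls: an application states its need independently of when it later asks for its assignment, so it can adapt whenever the allocation differs from the request.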

2.1 eDragon CPU Manager

The eDragon CPU Manager (ECM) is responsible for the distribution of processors among applications in the hosting platform. The ECM is implemented as a user-level process that wakes up periodically at a fixed time quantum, defined as $k_{ECM}$, examines the current requests of the applications and distributes processors according to a scheduling policy. With this configuration, direct modification of the native kernel is not required to show the usefulness of the proposed environment.

Figure 1: Prototype structure with ECM and two applications.

Traditionally, resource allocation policies have considered conventional performance metrics such as response time, throughput and availability. However, the metrics of utmost importance to the management of an e-commerce site are revenue and profits, and they should be incorporated when designing policies (Cherkasova and Phaal, 2002). For this reason, the ECM can implement policies that consider conventional performance metrics as well as e-business indicators. Policies like the ones discussed in Section 5 can also be incorporated into the ECM.

Our sample policy includes priority classes. The priority class indicates a customer domain’s priority in relation to other customer domains. It is expected that high priority customers will receive preferential service with respect to low priority customers. In our policy, at every sampling interval $k_{ECM}$, each application i receives a number of processors ($A_i(k_{ECM})$) that is proportional to its request ($R_i(k_{ECM})$), weighted by the application’s priority class ($P_i$) and the number of processors in the hosting platform ($NCpus$), and inversely proportional to the total workload of the system ($\sum_j P_j \ast R_j(k_{ECM})$), expressed as the sum of the requests of all applications in the system.

The scheduling policy should also allow us to achieve the highest resource utilization in the hosting platform. Our proposal to accomplish this with the ECM is based on sharing processors among the applications under certain conditions (minimizing the impact on performance isolation).
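The proportional rule described above can be sketched as follows. This is a minimal illustration, assuming that the total workload is the priority-weighted sum of requests; the function name `proportional_shares` and the sample priorities are ours. The sketch returns fractional shares and deliberately leaves out how shares are turned into whole processors and how a processor is shared, since those details are not given in this section.

```python
def proportional_shares(requests, priorities, n_cpus):
    """Fractional shares A_i = P_i * R_i * n_cpus / sum_j(P_j * R_j).

    requests   -- application -> R_i (processors requested)
    priorities -- application -> P_i (priority class weight, assumed numeric)
    n_cpus     -- processors in the hosting platform
    """
    weighted = {app: priorities[app] * req for app, req in requests.items()}
    total = sum(weighted.values())
    if total == 0:
        # Nobody is asking for processors, so nothing is assigned.
        return {app: 0.0 for app in requests}
    return {app: w * n_cpus / total for app, w in weighted.items()}


# Two overloaded applications on a 4-way platform; the priorities are illustrative.
equal = proportional_shares({"tomcat": 3, "globus": 3}, {"tomcat": 1, "globus": 1}, 4)
skewed = proportional_shares({"tomcat": 3, "globus": 3}, {"tomcat": 2, "globus": 1}, 4)
print(equal)
print(skewed)
```

With equal priorities and equal requests the platform is split evenly; doubling one application's priority doubles its share relative to the other while the shares still add up to the number of processors.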

ICE-B 2007 - International Conference on e-Business

The ECM decides not only how many processors to assign to each application, but also which processors to assign. To accomplish this, the ECM configures the CPU affinity mask of each application (using the Linux sched_setaffinity function) so that the processor allocations of the different applications do not overlap (except when a processor is shared), thus minimizing the performance interference among applications.
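As an illustration of non-overlapping affinity masks, the sketch below assigns contiguous CPU ids to each application and applies them through Python's `os.sched_setaffinity`, which wraps the Linux call of the same name. The helper names, the contiguous numbering and the one-process-per-application mapping are assumptions made for the example; the shared-processor case is not covered, and a multithreaded server may need the mask applied to each of its threads.

```python
import os


def disjoint_cpu_sets(allocations, n_cpus):
    """Map each application to a contiguous, non-overlapping set of CPU ids."""
    if sum(allocations.values()) > n_cpus:
        raise ValueError("allocations exceed the number of CPUs")
    cpu_sets, next_cpu = {}, 0
    for app, count in allocations.items():
        cpu_sets[app] = set(range(next_cpu, next_cpu + count))
        next_cpu += count
    return cpu_sets


def apply_affinity(pids, cpu_sets):
    """Pin each application's process to its CPU set (Linux only)."""
    for app, cpus in cpu_sets.items():
        if cpus:  # an empty mask is rejected by the kernel, so skip it
            os.sched_setaffinity(pids[app], cpus)


masks = disjoint_cpu_sets({"tomcat": 3, "globus": 1}, n_cpus=4)
print(masks)
```

Calling `apply_affinity({"tomcat": tomcat_pid, "globus": globus_pid}, masks)` with real process ids would then restrict each process to its own processors; it is not invoked here because it changes the scheduling of live processes.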

3 EXPERIMENTAL ENVIRONMENT

We run Tomcat v5.0.19 (Amza et al., 2002) and Globus GT 4.0.1 (Sotomayor and Childers, 2005) on the same node. Tomcat is an open-source servlet container developed under the Apache license. Its primary goal is to serve as a reference implementation of the Sun Servlet and JSP specifications, and also to be a production-quality servlet container.

The client workload for the experiments was generated using a workload generator and web performance measurement tool called Httperf (Crovella et al., 1999), with the RUBiS (Rice University Bidding System) (Coarfa et al., 2002) benchmark servlets as the application. The Tomcat instance receives a variable input load throughout the run, shown in the top subfigure of Figure 2, which displays the number of new clients per second that hit the server as a function of time. The input load distribution has been chosen to represent the different combinations of processor requirements that arise when running together with the Grid workload on the hosting platform.

The Globus server is a de facto standard for Grid middleware. We did not modify its parameters (number of ServiceThreads or number of Runqueues) in the standard tests (with ECM they can be modified dynamically). The Globus workload generator sends jobs that overload or stress the management code for a job. We assume that the job itself will be executed on another node or cluster. From the Globus workload generator we generate and submit jobs with an increasing throughput and try to execute them on the nodes of the cluster (simulated, so the only CPU requirements are those needed to prepare the job). We then measure the output throughput. This gives a wide range of situations, which are summarized in Figure 2. In our case, we selected different submission levels for Globus and several different arrival rates for Tomcat to cover a wide range of configurations and situations that show the benefits of our approach. The hosting platform is a 4-way Intel XEON 1.4 GHz with 2 GB RAM.

For the purpose of this paper, we present a managed middleware prototype that allows the execution of heterogeneous applications on the same hosting platform. We show how communication between the applications and the OS layer can provide great improvements in terms of performance. For this proof of concept we consider web and grid applications. For simplicity, the prototype uses Tomcat for the web workload and the Globus platform for the grid workload; both are well-known and widely used platforms. In this prototype we modify the applications so that they communicate with the ECM, but other mechanisms could be used to avoid these modifications (e.g. a proxy). Further details of the architecture and of the modifications made to Tomcat and Globus can be found in (Guitart et al., 2006) for Tomcat and in (Nou et al., 2007a; Nou et al., 2007b; IBM-Corporation, 2004) for the Globus Toolkit.

4 EVALUATION

Our evaluation shows the benefits of our proposal for managing resources efficiently and preventing server overload on a 4-way multiprocessor Linux hosting platform. The CPU requirements to prepare a Globus job are ten times those of processing a Tomcat request with an SSL handshake, or 100 times those of a Tomcat request without an SSL handshake.
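These ratios can be written compactly. The symbols $C_{G}$, $C_{SSL}$ and $C_{plain}$ (CPU cost of preparing a Globus job, of a Tomcat request with an SSL handshake, and of a Tomcat request without one) are notation introduced here for illustration only.

```latex
C_{G} = 10\, C_{SSL} = 100\, C_{plain}
\quad\Longrightarrow\quad
C_{SSL} = 10\, C_{plain}
```

In other words, the stated figures imply that a request with an SSL handshake costs roughly ten times as much CPU as one without it.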

4.1 Standard Tomcat and Globus

In Figure 2, the grid-style lines show how the system evolves (bottom two plots) when we submit different workloads (top plot) to Tomcat and Globus. We have divided the test so as to check different CPU requirement scenarios in the two middleware. Jobs are submitted to Globus with the increasing workload explained in Section 3, while Tomcat is subjected to several load levels, ranging from overloaded to non-overloaded arrival rates. As the top plot shows, the test can be divided into two parts: one with a low load and another in which the system is under a heavy load. If we take a closer look at Tomcat, we can see that when the server is overloaded the reply rate falls (e.g. around 4000 seconds in the grid-style line). The Tomcat plot shows another zone with similar behaviour: at the 1500 seconds point the throughput obtained from Tomcat is very high, whereas in the 3000 seconds zone the throughput obtained is lower despite a higher input workload. Switching to the Globus plot, there are many zones where no jobs are finished (2800-3600 or 4000-5000). Globus is receiving a very high workload but, as we will show in the next subsection, we can increase its performance using ECM.

The server needs to share the resources between the two applications, but such applications do not know in which environment and with what resources they are being executed. They compete for a limited set of resources and obtain worse performance than if they were placed on two machines, each with half of the resources. Exchanging information about the resources that the application consumes and that the system has available for it is needed in order to avoid these situations.

Figure 2: From top to bottom: workload of Tomcat and Globus; replies/sec of standard Tomcat compared with the replies/sec of Tomcat using ECM; throughput of standard Globus compared with throughput of Globus using ECM. The two applications are running at once using ECM facilities.

4.2 Adaptive Middleware with ECM

If we repeat the previous test with ECM (without priorities), we obtain the solid-style results in Figure 2. The system as a whole works better. Most importantly, we no longer see the low levels of replies/sec on Tomcat and the low number of finished jobs on Globus that we obtained in the previous subsection.

In some zones Tomcat works better than before: whenever ECM receives a request for processing power from Tomcat, ECM tells Tomcat how many CPUs it has been assigned. Tomcat is then able to overcome these situations and start its admission control to stabilize itself at the desired level. This is the case in the zones near 1000 seconds and in the ranges 2200-3500 and 4000-5500, where the system without ECM obtains lower performance than with ECM. In these zones, because Tomcat (and Globus) know how many resources they have available, they can adapt their behaviour to the new scenario. We can also notice this on the Globus side, where we obtain more throughput than before. Looking at these situations in detail, in the 1000 seconds zone Globus works at the same level as before, but Tomcat, knowing how many resources it has available, can adapt its load to the new scenario. The reverse situation happens in the 2500 seconds zone, where Globus improves its performance. When the two applications are overloaded, communication with the OS to learn the available resources can improve both applications, as we can see after the 4000 seconds zone.

On the other hand, using ECM we can modify the fairness of the resource assignment through priorities that depend on, e.g., the revenue of the application. When the applications are in the same priority category inside ECM, they share resources without any preference between applications. In an entry node with a secure connection scenario, like the one we are testing, it is crucial to provide more fairness to the several middleware that share the resources on the node in order to get better results as a whole.
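The adaptation step discussed in this subsection, in which an application tunes its admission control to the processors it has been told it holds, can be illustrated with a deliberately simple rule. This is a hypothetical sketch, not the admission control implemented in the modified Tomcat: the function names and the capacity figure `sessions_per_cpu` are assumptions chosen only to make the example concrete.

```python
def admission_limit(allocated_cpus: int, sessions_per_cpu: int) -> int:
    """Concurrent sessions the application accepts with its current allocation."""
    return max(0, allocated_cpus) * sessions_per_cpu


def admit(active_sessions: int, allocated_cpus: int, sessions_per_cpu: int = 50) -> bool:
    """Accept a new session only while the application stays under its limit."""
    return active_sessions < admission_limit(allocated_cpus, sessions_per_cpu)


# With 2 CPUs and an assumed capacity of 50 sessions per CPU the limit is 100.
print(admit(99, allocated_cpus=2))   # still below the limit
print(admit(100, allocated_cpus=2))  # limit reached, new session refused
```

The design point is that the limit follows the allocation reported by the manager rather than the size of the machine, so the application refuses excess load instead of degrading when it holds fewer processors than it requested.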

5 RELATED WORK

Recent studies (Andrzejak et al., 2002; Chandra et al., 2003b; Chandra and Shenoy, 2003) have reported the considerable benefit of dynamically adjusting resource allocations to handle variable workloads. This premise has motivated the proposal of several techniques to dynamically provision resources to applications in on-demand hosting platforms. Depending on the mechanism used to decide the resource allocations, these proposals can be classified into: control theoretic approaches with a feedback element (Abdelzaher et al., 2002), open-loop approaches based on queuing models to achieve resource guarantees (Chandra et al., 2003a; Doyle et al., 2003; Liu et al., 2001) and observation-based approaches that use runtime measurements to compute the relationship between resources and a QoS goal (Pradhan et al., 2002). Control theory solutions require training the system at different operating points to determine the control parameters for a given workload. Queuing models are useful for steady state analysis but do not handle transients accurately. Observation-based approaches are best suited for handling varying workloads and non-linear behaviors.

Resource management in a single machine has been covered in (Banga et al., 1999), where the authors propose resource containers as an operating system abstraction to embody a resource. In (Liu et al., 2005) the authors propose the design of online feedback control algorithms to dynamically adjust entitlement values for a resource container on a server shared by multiple applications. The problem of provisioning resources in cluster architectures has been addressed in (Appleby et al., 2001; Ranjan et al., 2002) by allocating entire machines (dedicated model) and in (Chandra et al., 2003a; Pradhan et al., 2002; Urgaonkar and Shenoy, 2004) by sharing node resources among multiple applications (shared model).

Cataclysm (Urgaonkar and Shenoy, 2004) performs overload control by bringing together admission control, adaptive service degradation and dynamic provisioning of platform resources, demonstrating that the most effective way to handle overloading must consider a combination of techniques. In this respect, that work is similar to our proposal. There are also approaches (Menascé, 2005) that use virtualized environments and analytical methods to adjust the resources allocated to the virtualized systems. R-Opus (Cherkasova and Rolia, 2006) works at a different layer and time scale. In our approach we focus on a single server machine that is shared by different applications and that operates at a short time scale. Moreover, giving more processing power to an application such as Tomcat will not directly produce better performance: the application needs to know how many resources it has available.

6 CONCLUSIONS

In this paper we have presented a proof-of-concept prototype demonstrating that bidirectional communication between applications and the OS allows heterogeneous applications (running in overloaded conditions) to run together on a shared hosting platform while maintaining their performance. Using a shared hosting platform reduces important costs such as space and power.

Our approach is based on implementing a global resource manager, responsible for periodically distributing the available processors among the applications following a given policy. The resource manager can be configured to implement different policies, and to consider traditional indicators (e.g. response time) as well as e-business indicators (e.g. customer’s priority). In our proposal, the resource manager and the applications cooperate to manage the resources, in a manner totally transparent to the user, using bidirectional communication. On one side, the applications request from the resource manager the number of processors needed to handle their incoming load without QoS degradation. On the other side, the applications can ask the resource manager at any time for their processor assignments. With this information, applications can adapt their behavior to the allocated processors.

Our evaluation demonstrates the benefit of our approach for managing resources efficiently and for preventing degradation of an application’s performance on shared hosting platforms. Although our implementation targets Tomcat and Globus, the proposed strategy can be applied to any other platform or application. Further improvements can be made to this proof-of-concept work, such as more fine-grained CPU assignments or a fairer Globus self-management objective. Our future work considers the use of virtualization technologies.

ACKNOWLEDGEMENTS

This work is supported by the Ministry of Science and Technology of Spain and the European Union under contract TIN2004-07739-C02-01 and Commission of the European Communities under IST contract 034286 (SORMA). Thanks to Mario Macias for his help.

REFERENCES

Abdelzaher, T., Shin, K., and Bhatti, N. (2002). Performance guarantees for web server end-systems: A control-theoretical approach. IEEE TPDS, 13(1):80–96.

Amza, C., Cecchet, E., Chanda, A., Cox, A., Elnikety, S., Gil, R., Marguerite, J., Rajamani, K., and Zwaenepoel, W. (2002). Specification and implementation of dynamic web site benchmarks. WWC-5, Austin, Texas, USA.

Andrzejak, A., Arlitt, M., and Rolia, J. (2002). Bounding the resource savings of utility computing models. HPL-2002-339, HP Labs.

Appleby, K., Fakhouri, S., Fong, L., Goldszmidt, G., Krishnakumar, S., Pazel, D., Pershing, J., and Rochwerger, B. (2001). Oceano: SLA-based management of a computing utility. IM 2001, Seattle, Washington, USA, pages 855–868.

Banga, G., Druschel, P., and Mogul, J. C. (1999). Resource containers: A new facility for resource management in server systems. OSDI’99, New Orleans, Louisiana, USA, pages 45–58.

Chandra, A., Gong, W., and Shenoy, P. (2003a). Dynamic resource allocation for shared data centers using online measurements. IWQoS 2003, Berkeley, California, USA, pages 381–400.

Chandra, A., Goyal, P., and Shenoy, P. (2003b). Quantifying the benefits of resource multiplexing in on-demand data centers. Self-Manage 2003, San Diego, California, USA.

Chandra, A. and Shenoy, P. (2003). Effectiveness of dynamic resource allocation for handling internet flash crowds. TR03-37, Department of Computer Science, University of Massachusetts, USA.

Cherkasova, L. and Phaal, P. (2002). Session-based admission control: A mechanism for peak load management of commercial web sites. IEEE Transactions on Computers, 51(6):669–685.

Cherkasova, L. and Rolia, J. (2006). R-Opus: A composite framework for application performability and QoS in shared resource pools. In DSN’06, pages 526–535, Washington, DC, USA.

Coarfa, C., Druschel, P., and Wallach, D. (2002). Performance analysis of TLS web servers. NDSS’02, San Diego, California, USA.

Crovella, M., Frangioso, R., and Harchol-Balter, M. (1999). Connection scheduling in web servers. USITS’99, Boulder, Colorado, USA.

Doyle, R., Chase, J., Asad, O., Jin, W., and Vahdat, A. (2003). Model-based resource provisioning in a web service utility. USITS’03, Seattle, Washington, USA.

Guitart, J., Carrera, D., Beltran, V., Torres, J., and Ayguadé, E. (2006). Preventing secure web applications overload through dynamic resource provisioning and admission control. UPC-DAC-RR-2006-37.

IBM-Corporation (2004). An architectural blueprint for autonomic computing. http://www.ibm.com/autonomic.

Liu, X., Zhu, X., Singhal, S., and Arlitt, M. (2005). Adaptive entitlement control to resource containers on shared servers. IM 2005, Nice, France.

Liu, Z., Squillante, M., and Wolf, J. (2001). On maximizing service-level-agreement profits. EC 2001, Tampa, Florida, USA, pages 213–223.

Menascé, D. A. (2005). Virtualization: Concepts, applications, and performance modeling. Int. CMG Conference, Orlando, Florida, USA, pages 407–414.

Nou, R., Julià, F., and Torres, J. (2007a). The need for self-managed access nodes in grid environments. EASe 2007, Tucson, Arizona, USA.

Nou, R., Julià, F., and Torres, J. (2007b). Should the grid middleware look to self-managing capabilities? ISADS 2007, Sedona, Arizona, USA.

Pradhan, P., Tewari, R., Sahu, S., Chandra, A., and Shenoy, P. (2002). An observation-based approach towards self-managing web servers. IWQoS 2002, Miami Beach, Florida, USA, pages 13–22.

Ranjan, S., Rolia, J., Fu, H., and Knightly, E. (2002). QoS-driven server migration for internet data centers. IWQoS 2002, Miami Beach, Florida, USA, pages 3–12.

Sotomayor, B. and Childers, L. (2005). Globus Toolkit 4: Programming Java Services. Morgan Kaufmann.

Urgaonkar, B. and Shenoy, P. (2004). Cataclysm: Handling extreme overloads in internet services. TR03-40, Department of Computer Science, University of Massachusetts, USA.
