COMMUNITY CLUSTER OR COMMUNITY CLOUD?

Utilizing our Own Bare-metal

Xin Fan, Yusuke Wada and Shigeru Kusakabe

Graduate School of Information Science and Electrical Engineering, Kyushu University

744, Motooka, Nishi-ku, Fukuoka city, 819-0395, Japan

Keywords:

Cloud computing, Private cloud, Community cluster, Hadoop, MPI.

Abstract:

The increasing availability of cloud computing technologies gives us an option we did not have before: using a private cloud as well as a public cloud. In this paper, we report our ongoing work on examining the effectiveness of private cloud computing in an academic setting. Many researchers have examined the relative computational performance of commercially available public cloud computing offerings using HPC application benchmarks. As one of the driving forces in using cloud technologies is cost effectiveness, some researchers have compared public cloud offerings with their own HPC environment, a community cluster, from the viewpoint of cost-performance. Part of their conclusions indicates that their community cluster may be favorable for typical community members. Because a private (or community) cloud rests on grounds similar to those of a community cluster, we expect it to be promising in academic settings. Academic community members may also be interested in using their resources in a configuration with fewer constraints than public cloud offerings impose, while still benefiting from cloud technologies. In this paper, we discuss a situation in which we manage a number of bare-metal machines and must decide whether to configure the computing resource as a cluster of bare-metal nodes or as a cluster of virtual machines built with cloud computing technologies. According to our preliminary evaluation results, while we can easily reinstall and change the software framework on clusters in our private cloud, we must be prepared for unexpectedly severe performance degradation.

1 INTRODUCTION

Cloud computing has emerged as a new paradigm for using computing resources. There is no single agreed definition of cloud computing so far, but most definitions share common characteristics (Armbrust et al., 2009):

1. The illusion of infinite computing resources available on demand, thereby eliminating the need for cloud computing users to plan far ahead for provisioning;

2. The elimination of an up-front commitment by cloud users, thereby allowing organizations to start small and increase hardware resources only when there is an increase in their needs; and

3. The ability to pay for use of computing resources on a short-term basis as needed and release them as unneeded, thereby rewarding conservation by letting machines and storage go when they are no longer useful.

The on-demand, pay-as-you-go style seems to offer a flexible and cost-effective way to use computing resources.

From the viewpoint of academic computing, many researchers have examined the relative computational performance of commercially available public cloud computing offerings using a number of standard benchmarks and HPC applications. Most studies used Amazon EC2 as the representative of commercially available cloud offerings (Jackson et al., 2010), although other options, such as private clouds, also exist.

Since one of the driving forces in using cloud technologies is cost-performance, some researchers have also compared public cloud offerings with their own HPC environment, a community cluster, from the viewpoint of cost-performance (Carlyle et al., 2010). A community cluster is a system obtained by a faculty group and centrally operated by an institution, maintained for the benefit of the many research groups that own the nodes in the cluster. Community cluster users gain peace of mind from the cluster's operation by professional IT staff; low overhead from centralized power, cooling, and data center space; and cost effectiveness from the combined purchasing power of all cluster owners and strategic sourcing of the cluster hardware. From the institutional perspective, community clusters are a cost-effective way for faculty to obtain HPC resources. In the case study, researchers at Purdue University tried to measure the per-node-hour cost of a cloud offering and of their traditional HPC environment, their community cluster, for scientific computing. Part of their conclusions indicates that their community cluster may be favorable for typical community members. The community cluster in the Purdue case study is configured for scientific computing. We consider it better to flexibly accommodate emerging computing frameworks such as Hadoop (Hadoop, n.d.) in order to broaden and enhance the advantageous aspects of community clusters.

Fan X., Wada Y. and Kusakabe S.

COMMUNITY CLUSTER OR COMMUNITY CLOUD? - Utilizing our Own Bare-metal.

DOI: 10.5220/0003450501270130

In Proceedings of the 1st International Conference on Cloud Computing and Services Science (CLOSER-2011), pages 127-130

ISBN: 978-989-8425-52-2

© 2011 SCITEPRESS (Science and Technology Publications, Lda.)

Cloud computing technologies offer new styles of computing for various activities that use computing resources, including academic activities. The growing availability of cloud computing technologies gives us an option we did not have before: using a private cloud as well as public cloud offerings. According to Armbrust et al. (2009), cloud computing is the sum of SaaS and utility computing, but does not normally include private clouds, a term that refers to internal data centers of a business or other organization that are not made available to the public. From the viewpoint of economies of scale, larger cloud systems are more advantageous than smaller ones. While a private cloud seems less promising than a public one from this viewpoint, various other factors enter into the decision. Because a private (or community) cloud rests on grounds similar to those of a community cluster, we expect that it can be promising in academic settings.

In this paper, we discuss a situation in which we manage a number of bare-metal machines and must choose whether to configure the computing resource as a cluster of bare-metal nodes or as a cluster of virtual machines built with cloud computing technologies. One of the driving forces other than cost effectiveness in using cloud technologies is their flexibility. Using cloud computing technologies, we can prepare different kinds of computational environments, deploy a specific environment of our choice over virtual machines, and release the resources after a predefined period according to the reservation schedule.

In this paper, we introduce our ongoing work on examining the practical effectiveness of private cloud computing in an academic setting. The rest of this paper is organized as follows. Section 2 outlines our private cloud. Section 3 presents our preliminary evaluation results. Section 4 gives concluding remarks.

Figure 1: Overview of our private Cloud.

2 OUTLINE OF OUR PRIVATE CLOUD

In our study, we use a small version of IBM BlueCloud as our private cloud computing platform. Figure 1 shows the outline of our cloud. The following are the main features of the cloud:

• Virtualization. In our cloud platform, we can dynamically add server machines to, and remove them from, the resource pool, provided that the bare-metal machines have an x86 architecture and are able to run Xen. To add a new server to the resource pool, we connect the bare-metal server to the private network of the cloud. The Xen host OS, Domain 0 (Dom0), is then automatically installed through the network boot mechanism. We can then deploy virtual machines on the host OS machines.

• Provisioning. When a user requests a computing platform from the cloud portal web page, he or she can select the virtual OS image (Domain U (DomU) of Xen in our platform) and applications from a menu, and can specify the virtual machine configuration, such as the number of virtual CPUs (VCPUs) and the amount of memory and storage, within the capacity of the cloud resources. In our cloud, the number of VCPUs is limited to the number of physical CPUs in order to guarantee a minimum level of performance for DomU. When the request is admitted, the requested computing platform is prepared automatically.
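The admission rule described under Provisioning can be illustrated with a minimal sketch. The class names, field names, and the choice to check a single request against a single host are assumptions made for illustration; the actual admission logic of the cloud platform is not described in this paper.

```python
from dataclasses import dataclass


@dataclass
class HostCapacity:
    """Free capacity of one bare-metal host (hypothetical fields)."""
    physical_cpus: int
    memory_mb: int
    storage_gb: int


@dataclass
class VmRequest:
    """Resources requested for one virtual machine (hypothetical fields)."""
    vcpus: int
    memory_mb: int
    storage_gb: int


def admit(request: VmRequest, host: HostCapacity) -> bool:
    """Admit a request only if it fits within the host's capacity.

    The VCPU check mirrors the rule in the text: the number of VCPUs
    must not exceed the number of physical CPUs.
    """
    if request.vcpus < 1:
        return False
    return (
        request.vcpus <= host.physical_cpus
        and request.memory_mb <= host.memory_mb
        and request.storage_gb <= host.storage_gb
    )


host = HostCapacity(physical_cpus=4, memory_mb=8192, storage_gb=100)
print(admit(VmRequest(vcpus=4, memory_mb=4096, storage_gb=50), host))
print(admit(VmRequest(vcpus=5, memory_mb=4096, storage_gb=50), host))
```

Under these assumptions, the first request fits and the second is rejected because it asks for more VCPUs than the host has physical CPUs.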

In addition to cloning virtual machines from the same machine image, our cloud supports, for example, automatic setup of a Hadoop programming environment in fully distributed mode when provisioning computing resources. We usually need the following steps to set up a Hadoop environment on a cluster:

1. Installing a base machine image into nodes

2. Installing Java


3. Mapping the IP address and hostname of each node

4. Permitting password-less login from the master machine to all the slave machines

5. Configuring Hadoop on the master machine

6. Copying the configured Hadoop environment to all slave machines from the master machine
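As an illustration of step 3 above, the following sketch builds hosts-file style lines that map IP addresses to hostnames. The addresses, hostnames, and the function name are hypothetical; the sketch only produces text and does not modify any system file.

```python
def hosts_entries(nodes):
    """Return hosts-file style lines for (ip, hostname) pairs.

    Duplicate addresses or hostnames are rejected, because an ambiguous
    mapping would make the cluster nodes resolve each other inconsistently.
    """
    seen_ips, seen_names = set(), set()
    lines = []
    for ip, name in nodes:
        if ip in seen_ips or name in seen_names:
            raise ValueError(f"duplicate entry: {ip} {name}")
        seen_ips.add(ip)
        seen_names.add(name)
        lines.append(f"{ip}\t{name}")
    return lines


# Hypothetical addresses and hostnames for a three-node cluster.
nodes = [
    ("192.168.0.10", "master"),
    ("192.168.0.11", "slave1"),
    ("192.168.0.12", "slave2"),
]
print("\n".join(hosts_entries(nodes)))
```

The same list of lines would have to be present on every node, which is one reason this step is tedious to perform by hand on a larger cluster.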

We now explain the corresponding steps for setting up a Hadoop environment on our private cloud. First, if we need to increase the machine resources of our cloud, we make the new bare-metal machines network-bootable and connect them to the local network of our cloud. The machines are then automatically incorporated into our cloud. Next, we have to prepare the desired machine image. Then, we request a Hadoop environment through the portal, and the subsequent steps are carried out automatically. We need an extra script as part of the preparation if we want to apply a specific configuration in the postscript phase, such as a master/slave configuration for the Hadoop environment.

Thus, by adopting private cloud computing, we can use labor-reducing mechanisms that are not available in a community cluster.

3 PRELIMINARY EVALUATION

To evaluate the effectiveness of our private cloud, we prepared two types of platforms: one is a cluster of eight bare-metal servers, representing a community cluster, and the other is a cluster of eight virtual machines in the private cloud mapped onto eight bare-metal servers. We used Dell PowerEdge M600 blade servers with Intel Xeon L5410 processors as the bare-metal servers. We used two software frameworks: MPI for numerical computation workloads and Hadoop for emerging non-numerical computation workloads.

3.1 MPI

We evaluated a thermal convection solver with MPICH, an implementation of MPI, as a numerical parallel computation workload. The data elements were generated by ADVENTURE sFlow, one of the modules included in the ADVENTURE project (Kanayama et al., 2005). ADVENTURE sFlow uses the Newton method for the nonlinear iteration, and a stabilized finite element method is introduced to compute the problem at each step of the nonlinear iteration. In this experiment, we measured the execution time while varying the number of steps.

Figure 2: Clusters on virtual machines / bare-metals for MPI/Hadoop.

We show the results in Figure 3 and Table 1. As the results show, the performance degradation incurred by virtualization in our cloud for this benchmark is around 20% (between roughly 15% and 25%, depending on the number of steps). Some degradation of this kind is hard to avoid, since virtualization is one of the essential cloud-enabling technologies.

Figure 3: Thermal convection solver execution time.

Table 1: Thermal convection solver execution time (sec).

# steps             10      20      40      60

Bare-Metal       17.33   32.46   54.05   66.52

Virtual Machine  21.62   37.38   62.48   76.20
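The relative overhead implied by Table 1 can be recomputed directly from the measured times. The sketch below uses only the values in the table; the function and variable names are illustrative.

```python
# Execution times in seconds, copied from Table 1.
steps = [10, 20, 40, 60]
bare_metal = [17.33, 32.46, 54.05, 66.52]
virtual_machine = [21.62, 37.38, 62.48, 76.20]


def overhead_percent(bare, virt):
    """Relative slowdown of the virtualized run, in percent of bare-metal time."""
    return [100.0 * (v - b) / b for b, v in zip(bare, virt)]


overheads = overhead_percent(bare_metal, virtual_machine)
for n, pct in zip(steps, overheads):
    print(f"{n:>3} steps: {pct:.1f}% slower on virtual machines")
```

The overhead is largest for the shortest run and lies between about 15% and 25% across the four measurements.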

3.2 Hadoop

We evaluated the TestDFSIO benchmark included in the Hadoop distribution as a workload representing emerging parallel and distributed applications. Table 2 and Figure 4 show the results. The benchmark performed random reads of 1 MB files, with the number of files varied from 10 to 50. As the results show, the read throughput in the virtualized environment of our cloud was consistently degraded to about two-thirds (roughly 60%) of that of the bare-metal environment.

As another experiment, we evaluated the π estimation example included in the Hadoop distribution. We measured the execution time while varying the number of map tasks.
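To make the structure of this workload concrete, the following single-process sketch mimics a π estimation split into map tasks and one reduce step. It is not the Hadoop example itself: the bundled program uses its own sampling scheme and runs tasks on the cluster, whereas this sketch substitutes plain pseudo-random sampling and runs the tasks sequentially. All function names are illustrative.

```python
import random


def map_task(seed, samples):
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside, samples


def reduce_counts(results):
    """Combine per-task counts into a single estimate of pi."""
    inside = sum(r[0] for r in results)
    total = sum(r[1] for r in results)
    return 4.0 * inside / total


def estimate_pi(num_map_tasks, samples_per_task):
    """Run the map tasks one after another, then reduce their results."""
    results = [map_task(seed, samples_per_task) for seed in range(num_map_tasks)]
    return reduce_counts(results)


print(estimate_pi(num_map_tasks=20, samples_per_task=10_000))
```

Each map task does very little computation and returns only two numbers, so in a distributed run the cost of launching tasks and exchanging small messages can dominate the total time.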


Table 2: Throughput of TestDFSIO benchmark (MB/sec) (random read, file size 1 MB, number of files 10 to 50).

# files             10      20      30      40      50

Bare-Metal       32.57   35.09   33.11   34.04   34.18

Virtual Machine  20.37   21.81   20.83   21.42   20.24

Figure 4: Throughput of TestDFSIO benchmark.

As we can see from the results in Table 3 and Figure 5, the performance degradation of the private cloud version was very severe, with execution roughly 8 to 19 times slower than on bare metal, and the situation became worse as the number of map tasks increased. The combination of the behavior of this MapReduce application and the low performance of the network interfaces of the virtual machines is one of the potential bottlenecks. Although we plan to carry out performance debugging to alleviate the problem, this kind of extra work may diminish the labor-reducing benefit of our private cloud.

Table 3: Execution time for π estimator (sec).

# map tasks           20      40      60       80

Bare-Metal         29.41   37.47   46.57    58.56

Virtual Machine   225.40  465.33  868.00  1119.08
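The severity of the degradation in Table 3 can be quantified in two ways: the slowdown factor at each number of map tasks, and the average growth in execution time per additional map task. The sketch below uses only the values in the table; the function names are illustrative.

```python
# Execution times in seconds, copied from Table 3.
map_tasks = [20, 40, 60, 80]
bare_metal = [29.41, 37.47, 46.57, 58.56]
virtual_machine = [225.40, 465.33, 868.00, 1119.08]


def slowdown(bare, virt):
    """How many times slower the virtualized run is at each measurement."""
    return [v / b for b, v in zip(bare, virt)]


def seconds_per_added_task(tasks, times):
    """Average growth in execution time per extra map task, first to last point."""
    return (times[-1] - times[0]) / (tasks[-1] - tasks[0])


factors = slowdown(bare_metal, virtual_machine)
for n, f in zip(map_tasks, factors):
    print(f"{n} map tasks: {f:.1f}x slower on virtual machines")

print(f"bare metal: {seconds_per_added_task(map_tasks, bare_metal):.2f} s per added task")
print(f"virtual:    {seconds_per_added_task(map_tasks, virtual_machine):.2f} s per added task")
```

The slowdown factor grows with the number of map tasks, and each additional map task costs roughly half a second on bare metal but close to fifteen seconds on the virtual machines.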

Figure 5: Execution time for π estimator.

4 CONCLUDING REMARKS

Because cloud computing technologies offer mechanisms that are not available in a community cluster, we expect a private (or community) cloud to be more promising than a community cluster in some academic settings. While we can easily reinstall and change the software framework on the cluster by using the labor-reducing mechanisms of a private cloud, the performance degradation may be more severe than expected. While the right choice depends on the usage pattern, building a cluster of bare-metal machines seems more rewarding when users are performance-oriented. Our future work includes automatic performance tuning applicable to our private cloud.

REFERENCES

Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A., Stoica, I., and Zaharia, M. (2009). Above the clouds: A Berkeley view of cloud computing. Technical report UCB/EECS-2009-28, Reliable Adaptive Distributed Systems Laboratory.

Carlyle, A. G., Harrell, S. L., and Smith, P. M. (2010). Cost-effective HPC: The community or the cloud? In 2nd IEEE International Conference on Cloud Computing Technology and Science, pages 169–176.

Hadoop (n.d.). As of Feb. 1, 2011.

Jackson, K. R., Ramakrishnan, L., Muriki, K., Canon, S., Cholia, S., Shalf, J., Wasserman, H. J., and Wright, N. J. (2010). Performance analysis of high performance computing applications on the Amazon Web Services cloud. In 2nd IEEE International Conference on Cloud Computing Technology and Science.

Kanayama, H., Tagami, D., and Chiba, M. (2005). Stationary incompressible viscous flow analysis by a domain decomposition method. Domain Decomposition Methods in Science and Engineering XVI, pages 611–618.
