2 RELATED WORK
In the course of exploring hybrid systems, we are
working on two applications, which are outlined in
this section and used later in the paper to exemplify
the proposed model. The
first application under investigation implements ac-
celeration of compute-intensive tasks for insurance
product calculations (Grosser et al., 2009). Compute-
intensive tasks are executed by network-attached IBM
POWER blades. Acceleration is achieved through
leveraging parallelism and calculating insurance al-
gorithms by just-in-time compiled code snippets on
the attached blades. Availability requirements are met
by tightly controlling execution from the host's
off-loading facility.
The second application scenario focuses on nu-
meric and data-intensive calculations of seismic mod-
eling problems. Seismic algorithms are parallelized
on various types of accelerators, each having its
advantages and disadvantages (Clapp et al., 2010).
FPGAs can be exploited efficiently by implementing
pipeline-based stream-processing architectures.
Besides FPGAs, we also investigate CPU SIMD
processing, threads, MPI, and OpenCL. Using OpenCL,
we are moreover able to evaluate multiple
architectures, such as different types of CPUs and
GPUs. The focus of this application is to cover a
broad range of accelerators and programming
environments in order to evaluate different
compositions of a hybrid system and their resulting
performance.
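As an illustration of this portability, the following minimal sketch (our own, not part of the cited work; OpenCL 1.x C API, error handling abbreviated) enumerates all platforms and devices visible to a host, which is the starting point for evaluating one code base across different CPUs and GPUs:

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id plat[4];
    cl_uint nplat = 0;
    /* Query up to four installed OpenCL platforms (vendors). */
    clGetPlatformIDs(4, plat, &nplat);
    if (nplat > 4) nplat = 4;

    for (cl_uint p = 0; p < nplat; ++p) {
        cl_device_id dev[8];
        cl_uint ndev = 0;
        /* CL_DEVICE_TYPE_ALL covers CPUs, GPUs, and other accelerators. */
        if (clGetDeviceIDs(plat[p], CL_DEVICE_TYPE_ALL, 8, dev, &ndev) != CL_SUCCESS)
            continue;
        if (ndev > 8) ndev = 8;
        for (cl_uint d = 0; d < ndev; ++d) {
            char name[256];
            clGetDeviceInfo(dev[d], CL_DEVICE_NAME, sizeof(name), name, NULL);
            printf("platform %u, device %u: %s\n", p, d, name);
        }
    }
    return 0;
}

The same kernels can then be compiled at run time for whichever device is selected, which is what allows one OpenCL code base to serve multiple architectures.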
3 PROBLEM STATEMENT
The cloud computing anatomy comprises three layers
that deliver infrastructure, platform, and software as
a service, i.e., IaaS, PaaS, and SaaS, respectively
(Vaquero et al., 2008). To provide infrastructure as a
service, physical deployment methods are used to
implement the established cloud services. These
deployment methods are realized using model-based
approaches, which enable abstraction from the
underlying hardware.
ware. Today, no cloud provisioning models exist for
offering multiple adapted accelerator environments
1
:
complexity is imposed on the administrator to build
workload optimized solutions based on resources of
various types and architectures. A composite system
built from host resources, connectivity, and accelera-
tor resources needs to be provisioned as a single entity
in order to establish hybrid infrastructure. Abstrac-
tion of the physics allows for both accelerator sharing
1
Meanwhile, Amazon EC2 offers Nvidia-based GPU com-
pute racks (http://aws.amazon.com/ec2/hpc-applications/).
and flexibility of the cloud service provider. Isolation
needs to be ensured to create a multi-tenancy capable
landscape.
Table 1 illustrates the diversity of an arbitrary
heterogeneous data center. Any combination of host,
accelerator, and fabric can be created, including the
desired programming platform.
Table 1: Each possible combination imposes challenges in
handling hosts and their interrelationships with multiple
accelerators. In addition, the library used must be adapted
to the underlying system and interconnect.
Host     Accelerator  Fabric      Library
i386     Cell/B.E.    Ethernet    OpenCL
x86_64   GPGPU        InfiniBand  MPI
s390     Cluster      PCIe        DaCS/ALF
POWER    FPGA/DRP     Myrinet     SOAP
Itanium  DataPower    Quadrics    CUDA
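The combination space of Table 1 can be pictured as a tuple that a provisioning request must name in full. The following schematic sketch is our own illustration; all type and field names are hypothetical, not a defined API:

#include <stdio.h>

typedef enum { HOST_I386, HOST_X86_64, HOST_S390, HOST_POWER, HOST_ITANIUM } host_t;
typedef enum { ACC_CELL, ACC_GPGPU, ACC_CLUSTER, ACC_FPGA, ACC_DATAPOWER } accel_t;
typedef enum { FAB_ETHERNET, FAB_INFINIBAND, FAB_PCIE, FAB_MYRINET, FAB_QUADRICS } fabric_t;
typedef enum { LIB_OPENCL, LIB_MPI, LIB_DACS_ALF, LIB_SOAP, LIB_CUDA } library_t;

/* One provisioning request names a complete combination, which must
   be delivered as a single entity. */
typedef struct {
    host_t    host;
    accel_t   accelerator;
    fabric_t  fabric;
    library_t library;
    unsigned  accelerator_count;  /* accelerators attached to the host */
} hybrid_request_t;

int main(void) {
    hybrid_request_t req = { HOST_X86_64, ACC_GPGPU, FAB_PCIE, LIB_OPENCL, 4 };
    printf("host=%d accel=%d fabric=%d lib=%d n=%u\n",
           req.host, req.accelerator, req.fabric, req.library,
           req.accelerator_count);
    return 0;
}

Each field multiplies the number of combinations an administrator would otherwise have to handle by hand, which is the complexity the proposed model is meant to absorb.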
Manufacturers develop all kinds of multi-core and
many-core accelerators, each targeting its own
application domain; these include, among others,
multi-core CPUs, GPGPUs, Cell/B.E., and FPGAs.
Emerging standards for heterogeneous computing, such
as OpenCL, make it possible to use GPGPUs for number
crunching. OpenCL implementations are provided, among
others, for Nvidia GPUs (NVIDIA, 2010), Cell/B.E.,
and POWER (IBM, 2010a). OpenCL enables easy
parallelization over a device's compute units,
exploiting data-parallel computing power. In order to
scale an accelerated application, multiple devices
must be incorporated, so additional management
functionality is needed in the host code to queue
operations efficiently.
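A minimal sketch of such host-side management (our own illustration; OpenCL 1.x C API, error handling abbreviated) creates one context spanning all GPUs of the first platform and one command queue per device, across which the host must distribute work:

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id platform;
    cl_uint ndev = 0;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 0, NULL, &ndev);
    if (ndev == 0) return 1;
    if (ndev > 8) ndev = 8;

    cl_device_id dev[8];
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, ndev, dev, NULL);

    cl_int err;
    /* One context shared by all devices eases buffer management. */
    cl_context ctx = clCreateContext(NULL, ndev, dev, NULL, NULL, &err);

    /* One in-order queue per device; the host code decides how to
       partition kernels and transfers across these queues. */
    cl_command_queue q[8];
    for (cl_uint i = 0; i < ndev; ++i)
        q[i] = clCreateCommandQueue(ctx, dev[i], 0, &err);

    printf("managing %u GPU device(s)\n", ndev);

    for (cl_uint i = 0; i < ndev; ++i)
        clReleaseCommandQueue(q[i]);
    clReleaseContext(ctx);
    return 0;
}

The partitioning logic lives entirely in the host code; OpenCL itself provides no scheduling across devices, which is exactly why the management functionality mentioned above is required.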
As another example, some application domains map
efficiently onto FPGAs, sometimes forming appliances.
Access to these resources must also be shared among
multiple users. Regarding the fabric, InfiniBand may
be preferred over Ethernet in HPC. Traditional
Ethernet networking supports sending messages, while
InfiniBand channel adapters additionally provide RDMA
capabilities. Ethernet adapters, too, may be equipped
with RDMA capabilities to reduce communication
overhead.
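Message passing over such fabrics is typically expressed through a library such as MPI (cf. Table 1). A minimal two-rank sketch (our own illustration; run with two processes, e.g. mpirun -np 2):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = 42;
    if (rank == 0) {
        /* Rank 0 sends one integer to rank 1 over whatever fabric
           the MPI implementation was configured for. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}

The same source runs unchanged over Ethernet or InfiniBand; the fabric choice affects latency and bandwidth, not the programming interface.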
Unlike networking, PCIe is an attachment approach in
which devices are accessible through the
memory-mapped PCI address space. Thus, the use of
PCIe-attached accelerators differs from that of
network-attached ones.
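To make the contrast concrete, the following Linux-specific sketch (an assumption; the text prescribes no operating system, and the PCI device path is hypothetical) maps the first BAR of a PCIe device into the host address space via sysfs, after which device registers are reached by plain loads and stores rather than by messages:

#include <fcntl.h>
#include <stdio.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Hypothetical PCI address; substitute the real device. */
    int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* Map one page of BAR0 into our address space. */
    volatile uint32_t *bar = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
    if (bar == MAP_FAILED) { perror("mmap"); return 1; }

    /* A register read is now an ordinary memory load. */
    printf("register 0: 0x%08x\n", (unsigned)bar[0]);

    munmap((void *)bar, 4096);
    close(fd);
    return 0;
}

A network-attached accelerator, by contrast, is driven through explicit send and receive operations, so a provisioning model must capture the attachment type of each accelerator.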
From an infrastructure point of view, there is a
variety of available hosts, accelerators, fabrics,
programming environments, frameworks, and libraries.
In the following sections, we introduce the model and
discuss how example workloads benefit from a hybrid
infrastructure.