An increasing amount of computing power is now hosted on
cloud platforms such as Amazon Elastic Compute Cloud (EC2),
Google Compute Engine or Microsoft Azure, and more and more
software and services are being hosted in the cloud. Many
independent software vendors have started to offer cloud
services to scale scientific applications for specific industries.
Companies such as Rescale, Ciespace, Ubercloud, Sabalcore,
Univa and Penguin Computing provide cloud services for
weather research, computational fluid dynamics, structural
mechanics, quantitative finance, combustion, electromagnetics
and molecular dynamics (Gentzsch, 2012). They offer access to
VMs with pre-installed software or a web portal where
scientific applications, such as ANSYS, Gromacs, OpenFOAM or
Gerris solvers (Popinet, 2003), can be executed.
While this may be a satisfying solution for some, running
custom or third-party software on the cloud infrastructure
requires much more complicated procedures. A scientific
application that needs to be deployed in the cloud usually
consists of a command-line tool that requires complex
deployment steps such as installation, configuration, and the
setup of system-specific services and network policies, an
effort comparable to administering an on-premises cluster.
Accessing the deployed applications from a local machine is
also tedious and requires significant effort. This partly
explains why this scenario is often too cumbersome for a
domain engineer or a scientist.
A simple mechanism is needed to streamline the whole process
and lower the entry barrier for cloud-based numerical
simulations. Such mechanisms exist for Amazon EC2 or
Eucalyptus, but the second biggest public cloud provider, i.e.
Microsoft Azure, still lacks an easy-to-use framework to
simplify cloud orchestration, which is why we targeted this
cloud provider.
In this paper, we present a unified framework, named
"SimplyHPC", that greatly simplifies the use of a distributed
application on Microsoft Azure. The framework combines
Azure-specific HPC libraries, deployment tools, and machine
and MPI configuration into a common platform. Its
functionality is exposed, among other ways, through PowerShell
cmdlets that ease the submission process while keeping the
desired command-line environment and flexibility. The
framework provides tools to elastically deploy an arbitrary
number of virtual machines, submit the packed application
together with input and configuration files, execute it as a
cloud service using MPI and, once the results are ready,
download them to the local machine and stop the virtual
machines. The results presented here follow up on our recent
study (Miroslaw et al., 2015).
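The deploy, submit, execute, download and stop sequence described
above can be illustrated with a minimal Python sketch. It is not the
SimplyHPC implementation and does not use any Azure API: the class
FakeCloud, the function run_simulation, the file name results.out and
all parameters are hypothetical names introduced only for
illustration. The sketch assumes that the virtual machines should be
released even when an intermediate step fails.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """In-memory stand-in for a cloud provider (hypothetical, not the Azure API)."""
    running_vms: list = field(default_factory=list)
    storage: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def start_vms(self, count):
        self.running_vms = [f"vm-{i}" for i in range(count)]
        self.log.append(f"start {count}")

    def upload(self, name, data):
        self.storage[name] = data
        self.log.append(f"upload {name}")

    def run_mpi(self, package, processes):
        if package not in self.storage:
            raise FileNotFoundError(package)
        # Pretend the MPI job writes a single result file.
        self.storage["results.out"] = f"{package} ran on {processes} processes"
        self.log.append("run")

    def download(self, name):
        self.log.append(f"download {name}")
        return self.storage[name]

    def stop_vms(self):
        self.running_vms = []
        self.log.append("stop")


def run_simulation(cloud, package, payload, vm_count, procs_per_vm=1):
    """Deploy VMs, submit and execute the package, fetch results, release VMs."""
    cloud.start_vms(vm_count)
    try:
        cloud.upload(package, payload)
        cloud.run_mpi(package, processes=vm_count * procs_per_vm)
        return cloud.download("results.out")
    finally:
        # Stop the machines even if a step above raised an exception.
        cloud.stop_vms()


if __name__ == "__main__":
    cloud = FakeCloud()
    print(run_simulation(cloud, "solver.zip", b"input", vm_count=4, procs_per_vm=2))
```

Placing the teardown in a finally block is one possible design
choice: it keeps the number of running machines at zero after every
run, whether the job succeeded or not.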
The paper is organized as follows. Section 3 describes the
main components of the proposed framework as well as the
platform architecture. Section 4 demonstrates the utility of
the tool on two scientific applications, namely PETSc and
HPCG. In addition, we present a scalability study of ANSYS
CFX, a commercial fluid dynamics code for realistic,
industrial simulations. The paper ends with conclusions and
future plans.
2 BACKGROUND
This section examines the typical impediments to the
deployment of HPC applications. It also briefly reviews
related technologies and introduces the platforms and
technologies used in the performance studies in the cloud and
on the on-premises cluster.
Because cloud providers are commercial entities that compete
on the market, it is very difficult to create a single
orchestrator that supports the major cloud platforms. The
cloud infrastructure changes very quickly: new APIs, services
and tools are released frequently, and addressing them in one
consistent piece of software is very difficult. For example,
existing libraries such as Apache jclouds or libcloud do not
support HPC functionality based on Microsoft technologies.
This is the reason why we decided to target the Microsoft
Azure platform with our cloud orchestrator.
Cloud orchestrators, such as the one presented in this paper,
manage the interactions and interconnections among
cloud-based and on-premises units and expose various
automated processes and associated resources such as storage
and network. They have become a desired alternative to
standard deployment procedures because they require a lower
level of expertise and reduce deployment time. Their ability
to scale both vertically and horizontally is also fundamental
to the adoption of such a framework in HPC scenarios.
HPC Pack is a cloud orchestrator developed by Microsoft for
monitoring, executing and running jobs both on-premises and
in the cloud. It exposes functionality that is typical for
cluster management software, such as deployment of clusters
with different configurations, a scheduler that maximizes the
utilization of the cluster based on priorities, policies and
usage patterns, and a batch system for submission of multiple
jobs with different amounts of resources.
The framework also allows for deployment of hy-
IoTBD 2016 - International Conference on Internet of Things and Big Data