Authors:
Paul Marshall (1); Henry Tufo (2); Kate Keahey (3); David La Bissoniere (3) and Matthew Woitaszek (4)
Affiliations:
(1) University of Colorado, United States
(2) University of Colorado at Boulder and National Center for Atmospheric Research, United States
(3) Argonne National Laboratory and University of Chicago, United States
(4) National Center for Atmospheric Research, United States
Keyword(s):
Infrastructure-as-a-Service, Cloud Computing, Elastic Computing, Recontextualization.
Related Ontology Subjects/Areas/Topics:
Agents; Artificial Intelligence; Context; Context Sensitive Applications; Paradigm Trends; Service Technology and Infrastructure Issues; Services; Services Applications; Software Engineering
Abstract:
Infrastructure-as-a-service (IaaS) clouds, such as Amazon EC2, offer pay-for-use virtual resources on demand. This allows users to outsource computation and storage when needed and to create elastic computing environments that adapt to changing demand. However, existing services, such as cluster resource managers (e.g., Torque), do not include support for elastic environments. Furthermore, no recontextualization services exist to reconfigure these environments as they continually adapt to changes in demand. In this paper, we present an architecture for a large-scale elastic cluster environment. We extend an open-source elastic IaaS manager, the Elastic Processing Unit (EPU), to support the Torque batch-queue scheduler. We also develop a lightweight REST-based recontextualization broker that periodically reconfigures the cluster as nodes join or leave the environment. Our solution adds nodes dynamically at runtime and supports MPI jobs across distributed resources. For experimental evaluation, we deploy our solution using both NSF FutureGrid and Amazon EC2. We demonstrate the ability of our solution to create multi-cloud deployments and run batch-queued jobs, recontextualize 256-node clusters within one second of the recontextualization period, and scale to over 475 nodes in less than 15 minutes.
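As an illustration of what one periodic recontextualization step might involve, the following is a minimal sketch and not the paper's implementation. It compares the previous and current cluster membership and, if anything changed, renders a node list in the `hostname np=N` style used by Torque's server nodes file. The names `Node`, `diff_membership`, `render_nodes_file`, and `recontextualize` are hypothetical and chosen only for this example; the REST transport and the EPU integration described in the abstract are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A worker node as seen by the (hypothetical) broker."""
    hostname: str
    cores: int


def diff_membership(previous, current):
    """Return (joined, left) hostnames between two membership snapshots."""
    prev_names = {n.hostname for n in previous}
    curr_names = {n.hostname for n in current}
    joined = sorted(curr_names - prev_names)
    left = sorted(prev_names - curr_names)
    return joined, left


def render_nodes_file(nodes):
    """Render a node list, one 'hostname np=cores' line per node, sorted by name."""
    ordered = sorted(nodes, key=lambda n: n.hostname)
    return "".join(f"{n.hostname} np={n.cores}\n" for n in ordered)


def recontextualize(previous, current):
    """One reconfiguration pass: regenerate the node list only if membership changed."""
    joined, left = diff_membership(previous, current)
    changed = bool(joined or left)
    return {
        "changed": changed,
        "joined": joined,
        "left": left,
        # Assumption: an unchanged cluster needs no new configuration.
        "nodes_file": render_nodes_file(current) if changed else None,
    }
```

In such a design, a broker would call `recontextualize` once per recontextualization period with the latest membership snapshot and push the regenerated configuration to the head node only when `changed` is true.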