Authors:
Valentina Salapura and Richard Harper
Affiliation:
IBM T. J. Watson Research Center, United States
Keyword(s):
Cloud Computing, High Availability, Virtualization, Automation, Enterprise Class.
Related Ontology Subjects/Areas/Topics:
Cloud Computing; Cloud Computing Enabling Technology; Cloud Ilities (Scalability, Availability, Reliability); Cloud Optimization and Automation; Monitoring of Services, Quality of Service, Service Level Agreements; Virtualization Technologies
Abstract:
In this paper, we outline and illustrate concepts that are essential to achieving fast, highly scalable
planning and failover at the Virtual Machine (VM) level in a data center containing a large number
of servers, VMs, and disks. To illustrate the concepts, a solution is implemented and analyzed for IBM’s
Cloud Managed Services enterprise cloud. The solution enables at-failover-time planning and keeps the
recovery time within the tight time budgets allowed by service level agreements (SLAs) by parallelizing
recovery activities. The initial serial failover time was reduced by an order of magnitude through parallel VM
restart, and through parallel VM restart combined with parallel storage device remapping.
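
The effect of parallelizing recovery activities can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `failover_time`, the greedy slot-assignment model, and the workload (100 VMs at 30 seconds each, 10 parallel restart slots) are all assumptions chosen only to show how total recovery time might scale with the degree of parallelism.

```python
import heapq


def failover_time(restart_seconds, workers):
    """Estimate total recovery time when VM restarts are dispatched,
    in list order, to `workers` parallel slots (greedy list scheduling).

    Hypothetical model: each restart occupies one slot for its full
    duration, and slots do not interfere with each other.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Each heap entry is the time at which one slot becomes free.
    slots = [0.0] * workers
    heapq.heapify(slots)
    for seconds in restart_seconds:
        free_at = heapq.heappop(slots)            # earliest-free slot
        heapq.heappush(slots, free_at + seconds)  # slot is busy until then
    # Recovery is complete when the last slot finishes.
    return max(slots)


# Assumed workload, not measured data: 100 VMs, 30 s per restart.
vm_restarts = [30.0] * 100
serial = failover_time(vm_restarts, workers=1)     # one restart at a time
parallel = failover_time(vm_restarts, workers=10)  # ten restarts at a time
print(f"serial: {serial:.0f} s, parallel: {parallel:.0f} s")
```

Under these assumptions, ten parallel slots cut the estimated recovery time by a factor of ten. In a real data center the gain would depend on contention for shared resources, such as storage remapping, which this sketch ignores.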