Modeling a Load-adaptive Data Replication in Cloud Environments

Julia Myint, Axel Hunger


Replication is a cornerstone of cloud storage, where 24x7 availability is required. In cloud computing environments, failures are the norm rather than the exception. To provide high reliability and cost-effective storage, replicating data according to its popularity is an advisable choice. Before committing to a service level agreement (SLA) with cloud customers, the service provider needs to analyze the system on which the cloud storage is hosted. The Hadoop Distributed File System (HDFS) is an open-source storage platform designed to be deployed on low-cost hardware. A PC cluster-based cloud storage system is implemented with HDFS by enhancing its replication management scheme. Data objects are distributed and replicated across a cluster of commodity nodes located in the cloud. In this paper, we propose a Markov chain model for a replication system that can adapt to load changes in cloud storage. According to the performance evaluation, the system can adapt to different workloads (i.e., data access rates) while maintaining high reliability and a long mean time to absorption.
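To make "mean time to absorption" concrete, the following minimal sketch computes it for a simple continuous-time birth-death chain. This is an illustrative assumption, not the model proposed in the paper. The state is the number of live replicas k, state 0 (all replicas lost) is absorbing, each replica fails independently at a hypothetical rate fail_rate, and one replica is restored at a hypothetical rate repair_rate. The function name and parameters are invented for illustration.

```python
def mean_time_to_absorption(n_replicas, fail_rate, repair_rate):
    """Expected time until all replicas are lost, for each starting state.

    Returns [t_1, ..., t_n], where t_k is the expected time to reach
    state 0 when starting with k live replicas. Assumed toy model:
    k -> k-1 at rate k * fail_rate, k -> k+1 at rate repair_rate (k < n).
    """
    n = n_replicas
    # Generator matrix restricted to the transient states 1..n (index k-1).
    a = [[0.0] * n for _ in range(n)]
    for k in range(1, n + 1):
        i = k - 1
        down = k * fail_rate
        up = repair_rate if k < n else 0.0
        a[i][i] = -(down + up)
        if k > 1:
            a[i][i - 1] = down  # k = 1 fails into the absorbing state 0
        if k < n:
            a[i][i + 1] = up
    # The absorption times solve Q_T * t = -1 (vector of minus ones).
    b = [-1.0] * n

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]

    # Back substitution.
    t = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * t[c] for c in range(r + 1, n))
        t[r] = (b[r] - s) / a[r][r]
    return t


if __name__ == "__main__":
    # Compare replication factors under the same assumed rates.
    for n in (1, 2, 3):
        times = mean_time_to_absorption(n, fail_rate=1.0, repair_rate=2.0)
        print(n, times[-1])
```

Under these assumptions, raising the replication factor for a data object lengthens the expected time before all of its copies are lost, which is the kind of trade-off a popularity-based replication scheme has to weigh against storage cost.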


