Unifying Data and Replica Placement for Data-intensive Services in Geographically Distributed Clouds

Ankita Atrey, Gregory Van Seghbroeck, Higinio Mora, Filip De Turck, Bruno Volckaert

Abstract

The increased reliance of data management applications on cloud computing technologies has made research on the data placement problem critically important. The objective of the classical data placement problem is to optimally partition the set of data-items across distributed data centers, while also allowing for replication, so as to minimize the overall network communication cost. Despite significant advances in data placement research, replica placement has seldom been studied in unison with data placement. More specifically, most existing solutions employ a two-phase approach: 1) data placement, followed by 2) replication. Replication should, however, be seen as an integral part of data placement and studied as a joint optimization problem with it. In this paper, we propose a unified paradigm of data placement, called CPR, which combines data placement and replication of data-intensive services in geographically distributed clouds as a joint optimization problem. Underneath CPR lies an overlapping correlation clustering algorithm capable of assigning a data-item to multiple data centers, thereby enabling us to jointly solve data placement and replication. Experiments on a real-world trace-based online social network dataset show that CPR is effective and scalable. Empirically, it is 35% better in efficacy on the evaluated metrics, while being up to 8 times faster in execution time when compared to state-of-the-art techniques.

Paper Citation


in Harvard Style

Atrey A., Van Seghbroeck G., Mora H., De Turck F. and Volckaert B. (2019). Unifying Data and Replica Placement for Data-intensive Services in Geographically Distributed Clouds. In Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-758-365-0, pages 25-36. DOI: 10.5220/0007613400250036


in Bibtex Style

@conference{closer19,
author={Ankita Atrey and Gregory Van Seghbroeck and Higinio Mora and Filip De Turck and Bruno Volckaert},
title={Unifying Data and Replica Placement for Data-intensive Services in Geographically Distributed Clouds},
booktitle={Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER},
year={2019},
pages={25-36},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007613400250036},
isbn={978-989-758-365-0},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER
TI - Unifying Data and Replica Placement for Data-intensive Services in Geographically Distributed Clouds
SN - 978-989-758-365-0
AU - Atrey A.
AU - Van Seghbroeck G.
AU - Mora H.
AU - De Turck F.
AU - Volckaert B.
PY - 2019
SP - 25
EP - 36
DO - 10.5220/0007613400250036