Design of an RDMA Communication Middleware for Asynchronous Shuffling in Analytical Processing

Rui C. Gonçalves, José Pereira, Ricardo Jimenez-Peris

Abstract

A key component of a distributed parallel analytical processing engine is shuffling: the distribution of data to multiple nodes so that the computation can be done in parallel. In this paper we describe the initial design of a communication middleware that supports asynchronous shuffling of data among multiple processes in a distributed-memory environment. The proposed middleware relies on RDMA (Remote Direct Memory Access) operations to transfer data, and provides basic operations to send and queue data on remote machines and to retrieve this queued data. Preliminary results show that the RDMA-based middleware can provide a 75% reduction in communication costs compared with a traditional sockets implementation.
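The send, queue, and retrieve operations mentioned in the abstract can be pictured with a minimal in-memory sketch. This is not the authors' API: the names `ShuffleEndpoint`, `enqueue`, `retrieve`, `partition_of`, and `shuffle_send` are hypothetical, modulo partitioning on integer keys is an assumption, and a local `deque` stands in for the queue that the real middleware would fill through RDMA transfers.

```python
from collections import deque
from typing import Any, Deque, List, Tuple

Record = Tuple[int, Any]


class ShuffleEndpoint:
    """Hypothetical stand-in for one process taking part in the shuffle.

    A local deque plays the role of the queue that a remote sender
    would fill; no real RDMA transfer happens in this sketch.
    """

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self._queue: Deque[Record] = deque()

    def enqueue(self, record: Record) -> None:
        # Called on behalf of a sender: append the record to this node's queue.
        self._queue.append(record)

    def retrieve(self) -> List[Record]:
        # Drain everything queued so far and return immediately,
        # even if nothing has arrived yet.
        drained = list(self._queue)
        self._queue.clear()
        return drained


def partition_of(key: int, num_nodes: int) -> int:
    # Assumed partitioning rule: integer key modulo the number of nodes.
    return key % num_nodes


def shuffle_send(endpoints: List[ShuffleEndpoint], key: int, value: Any) -> None:
    # Pick the destination node for this key and queue the record there.
    target = endpoints[partition_of(key, len(endpoints))]
    target.enqueue((key, value))


# Usage: shuffle seven records across three endpoints, then drain each queue.
endpoints = [ShuffleEndpoint(i) for i in range(3)]
for key in range(7):
    shuffle_send(endpoints, key, f"row-{key}")
received = [ep.retrieve() for ep in endpoints]
```

In this sketch the sender never waits for the receiver: records accumulate in the destination queue until the receiver chooses to drain it, which is the sense in which the shuffle is asynchronous here.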


Paper Citation


in Harvard Style

Gonçalves R., Pereira J. and Jimenez-Peris R. (2016). Design of an RDMA Communication Middleware for Asynchronous Shuffling in Analytical Processing. In Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 1: DataDiversityConvergence, (CLOSER 2016) ISBN 978-989-758-182-3, pages 348-351. DOI: 10.5220/0005923703480351


in Bibtex Style

@conference{datadiversityconvergence16,
author={Rui C. Gonçalves and José Pereira and Ricardo Jimenez-Peris},
title={Design of an RDMA Communication Middleware for Asynchronous Shuffling in Analytical Processing},
booktitle={Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 1: DataDiversityConvergence, (CLOSER 2016)},
year={2016},
pages={348-351},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005923703480351},
isbn={978-989-758-182-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 1: DataDiversityConvergence, (CLOSER 2016)
TI - Design of an RDMA Communication Middleware for Asynchronous Shuffling in Analytical Processing
SN - 978-989-758-182-3
AU - Gonçalves R.
AU - Pereira J.
AU - Jimenez-Peris R.
PY - 2016
SP - 348
EP - 351
DO - 10.5220/0005923703480351