EME: An Automated, Elastic and Efficient Prototype for Provisioning Hadoop Clusters On-demand
Feras M. Awaysheh, Tomás F. Pena, José C. Cabaleiro
2017
Abstract
Aiming to enhance the Quality of Service (QoS) of MapReduce-based applications, many frameworks suggest a scale-out approach that statically adds new nodes to the cluster. Such frameworks are still expensive to acquire and do not consider the optimal usage of available resources in a dynamic manner. This paper introduces a prototype to address this issue by extending the MapReduce resource manager with dynamic provisioning and low-cost, on-demand uplift of resource capacity. We propose an Enhanced MapReduce Environment (EME) that supports heterogeneous environments by extending Apache Hadoop to an opportunistically containerized environment, which enhances system throughput by adding underused resources to a local or cloud-based cluster. The main architectural elements of this framework are presented, as well as the requirements, challenges, and opportunities of a first prototype.
Paper Citation
in Harvard Style
Awaysheh F., Pena T. and Cabaleiro J. (2017). EME: An Automated, Elastic and Efficient Prototype for Provisioning Hadoop Clusters On-demand. In Proceedings of the 7th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-758-243-1, pages 737-742. DOI: 10.5220/0006379607370742
in Bibtex Style
@conference{closer17,
author={Feras M. Awaysheh and Tomás F. Pena and José C. Cabaleiro},
title={EME: An Automated, Elastic and Efficient Prototype for Provisioning Hadoop Clusters On-demand},
booktitle={Proceedings of the 7th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER},
year={2017},
pages={737-742},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006379607370742},
isbn={978-989-758-243-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER
TI - EME: An Automated, Elastic and Efficient Prototype for Provisioning Hadoop Clusters On-demand
SN - 978-989-758-243-1
AU - Awaysheh F.
AU - Pena T.
AU - Cabaleiro J.
PY - 2017
SP - 737
EP - 742
DO - 10.5220/0006379607370742