Gray-Box Models for Performance Assessment of Spark Applications

Marco Lattuada, Eugenio Gianniti, Marjan Hosseini, Danilo Ardagna, Alexandre Maros, Fabricio Murai, Ana Couto da Silva, Jussara Almeida

Abstract

Big data applications are among the most suitable applications to run on cluster resources because of their high demands for computational power and data storage. Correctly sizing the resources devoted to their execution does not guarantee that they will run as expected, since their execution can be affected by perturbations that change the expected execution time. Detecting when such issues occur, by comparing the actual execution time with the expected one, is necessary to identify potentially critical situations and to take appropriate steps to prevent them. Fulfilling this objective requires accurate estimates. In this paper, machine learning techniques coupled with a posteriori knowledge are exploited to build performance estimation models. Experimental results show that the models built with the proposed approach outperform a reference state-of-the-art method (i.e., the Ernest method), reducing the error in some scenarios from 221.09-167.07% to 13.15-30.58%.

Paper Citation


in Harvard Style

Lattuada M., Gianniti E., Hosseini M., Ardagna D., Maros A., Murai F., Couto da Silva A. and Almeida J. (2019). Gray-Box Models for Performance Assessment of Spark Applications. In Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: IWFCC, ISBN 978-989-758-365-0, pages 609-618. DOI: 10.5220/0007877806090618


in Bibtex Style

@conference{iwfcc19,
author={Marco Lattuada and Eugenio Gianniti and Marjan Hosseini and Danilo Ardagna and Alexandre Maros and Fabricio Murai and Ana Couto da Silva and Jussara Almeida},
title={Gray-Box Models for Performance Assessment of Spark Applications},
booktitle={Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: IWFCC},
year={2019},
pages={609-618},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007877806090618},
isbn={978-989-758-365-0},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: IWFCC
TI - Gray-Box Models for Performance Assessment of Spark Applications
SN - 978-989-758-365-0
AU - Lattuada M.
AU - Gianniti E.
AU - Hosseini M.
AU - Ardagna D.
AU - Maros A.
AU - Murai F.
AU - Couto da Silva A.
AU - Almeida J.
PY - 2019
SP - 609
EP - 618
DO - 10.5220/0007877806090618