Authors:
Aymen Jlassi (1) and Patrick Martineau (2)
Affiliations:
(1) Université François-Rabelais de Tours and Groupe Cyrès, France; (2) Université François-Rabelais de Tours, France
Keyword(s):
Cloud Computing, Virtualization, Green Consumption, Docker Container, Hadoop, Resources Consumption.
Related Ontology Subjects/Areas/Topics:
Cloud Computing; Cloud Computing Enabling Technology; Cloud Resource Virtualization and Composition; Performance Development and Management; Virtualization Technologies
Abstract:
Virtualization technologies have proven capable of delivering good performance in high performance computing (HPC). Over the last decade, big data tools have emerged with their own performance and infrastructure requirements. With their broad experience in the HPC domain, experts can readily evaluate the infrastructures used to run big data tools. This paper evaluates two virtualization technologies, Docker containers and VMware, in the context of big data tools: we compare their performance and energy consumption by benchmarking Hadoop (JoshBaer, 2015) in both environments. First, the aim is to reduce the cost of deploying Hadoop in the cloud. Second, we discuss and analyze the assumptions drawn from HPC experiments and their applicability in the big data context. Third, we provide the Hadoop community with an in-depth study of resource consumption as a function of the deployment environment. We find that Docker containers give better performance in most experiments, while energy consumption varies according to the executed workload.