lem on the Hadoop cluster. We take into account both
performance and energy consumption in a bicriteria
problem.
ACKNOWLEDGEMENTS
This work was sponsored in part by the CYRES
GROUP in France and the French National Research
Agency under grant CIFRE n° 2012/1403.
Benchmarking Hadoop Performance in the Cloud - An in Depth Study of Resource Management and Energy Consumption