SUSTAINABILITY OF HADOOP CLUSTERS
Luis Bautista, Alain April
2011
Abstract
Hadoop is a set of utilities and frameworks for developing and storing distributed applications in cloud computing; its core component is the Hadoop Distributed File System (HDFS). The NameNode is a key element of the HDFS architecture and also its single point of failure. To address this issue, we propose a replication mechanism that protects the NameNode data in case of failure. The proposed solution has two distinct components: a BackupNode cluster that uses a leader election function to replace a failed NameNode, and a mechanism that replicates and synchronizes the file system namespace so that it can serve as a recovery point.
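The leader election idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the election rule, so the code assumes one plausible policy (choose the live BackupNode whose namespace replica is most up to date, breaking ties by lowest identifier). The names `BackupNode`, `last_txid`, and `elect_leader` are hypothetical and do not correspond to Hadoop or ZooKeeper APIs.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BackupNode:
    """Hypothetical view of one BackupNode; fields are illustrative only."""
    node_id: int        # unique identifier within the BackupNode cluster
    last_txid: int      # last namespace edit applied to this node's replica
    alive: bool = True  # whether the node currently responds to heartbeats


def elect_leader(nodes: Iterable[BackupNode]) -> Optional[BackupNode]:
    """Pick the live node with the most recent namespace replica.

    Assumed policy (not taken from the paper): highest last_txid wins,
    and ties are broken by the lowest node_id so that every participant
    seeing the same inputs reaches the same decision.
    """
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        return None  # no BackupNode is available to take over
    return min(candidates, key=lambda n: (-n.last_txid, n.node_id))


# Example cluster state at the moment the NameNode is declared failed.
cluster = [
    BackupNode(node_id=1, last_txid=120),
    BackupNode(node_id=2, last_txid=125),
    BackupNode(node_id=3, last_txid=125, alive=False),
    BackupNode(node_id=4, last_txid=125),
]

leader = elect_leader(cluster)
print(leader.node_id)  # prints 2
```

Preferring the most up-to-date replica ties the election to the second component of the proposal: the better the namespace is synchronized across BackupNodes, the less state is lost when one of them takes over.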
Paper Citation
in Harvard Style
Bautista L. and April A. (2011). SUSTAINABILITY OF HADOOP CLUSTERS. In Proceedings of the 1st International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-8425-52-2, pages 587-590. DOI: 10.5220/0003332705870590
in Bibtex Style
@conference{closer11,
author={Luis Bautista and Alain April},
title={SUSTAINABILITY OF HADOOP CLUSTERS},
booktitle={Proceedings of the 1st International Conference on Cloud Computing and Services Science - Volume 1: CLOSER},
year={2011},
pages={587-590},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003332705870590},
isbn={978-989-8425-52-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 1st International Conference on Cloud Computing and Services Science - Volume 1: CLOSER
TI - SUSTAINABILITY OF HADOOP CLUSTERS
SN - 978-989-8425-52-2
AU - Bautista L.
AU - April A.
PY - 2011
SP - 587
EP - 590
DO - 10.5220/0003332705870590