Towards High Performance Big Data Processing by Making Use of Non-volatile Memory
Shuichi Oikawa
2015
Abstract
Cloud computing environments for big data processing require high-performance storage. Emerging memory storage technologies, such as next-generation non-volatile (NV) memory and battery-backed NV-DIMMs, perform much better than current block storage devices such as SSDs and HDDs, but they provide only limited capacity. This limited capacity makes it difficult to adopt memory storage as mass storage, and its use in cloud computing environments has been severely limited. This paper proposes a method that combines memory storage with block storage. It uses memory storage as a cache for block storage in order to remove the capacity limitation of memory storage. The proposed method inherits both the high performance of memory storage and the large capacity of block storage. Memory storage can therefore be used transparently as part of mass storage, while its low-overhead access accelerates storage performance. The proposed method was implemented as a device driver for the Linux kernel. Its performance evaluation shows that it outperforms a bare SSD and achieves better performance in Hadoop and database environments.
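The caching arrangement described in the abstract can be illustrated with a minimal user-space sketch. It is not the paper's Linux device driver: the class names `BlockStore` and `MemoryCachedStore`, the LRU eviction policy, and the write-back policy are all assumptions made for illustration, since the abstract does not state which policies the driver uses.

```python
from collections import OrderedDict


class BlockStore:
    """Hypothetical stand-in for a large, slow block device (SSD or HDD)."""

    def __init__(self, num_blocks, block_size=4096):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.reads = 0   # number of accesses that reached the block device
        self.writes = 0

    def read(self, lba):
        self.reads += 1
        return self.blocks[lba]

    def write(self, lba, data):
        self.writes += 1
        self.blocks[lba] = data


class MemoryCachedStore:
    """Small memory-storage tier used as a cache in front of a BlockStore.

    Assumed policies (not taken from the paper): LRU eviction, and
    write-back of dirty blocks on eviction or on an explicit flush.
    """

    def __init__(self, backing, cache_blocks):
        self.backing = backing
        self.capacity = cache_blocks
        self.cache = OrderedDict()  # lba -> (data, dirty), oldest first

    def _insert(self, lba, data, dirty):
        self.cache[lba] = (data, dirty)
        self.cache.move_to_end(lba)  # mark as most recently used
        while len(self.cache) > self.capacity:
            old_lba, (old_data, old_dirty) = self.cache.popitem(last=False)
            if old_dirty:
                # Only dirty blocks need to be written back to block storage.
                self.backing.write(old_lba, old_data)

    def read(self, lba):
        if lba in self.cache:
            self.cache.move_to_end(lba)
            return self.cache[lba][0]      # hit: served from memory storage
        data = self.backing.read(lba)      # miss: fetch from block storage
        self._insert(lba, data, dirty=False)
        return data

    def write(self, lba, data):
        # Writes are absorbed by memory storage and written back later.
        self._insert(lba, data, dirty=True)

    def flush(self):
        for lba, (data, dirty) in list(self.cache.items()):
            if dirty:
                self.backing.write(lba, data)
                self.cache[lba] = (data, False)


if __name__ == "__main__":
    ssd = BlockStore(num_blocks=1024)
    store = MemoryCachedStore(ssd, cache_blocks=2)
    store.write(0, b"a" * 4096)
    store.write(1, b"b" * 4096)
    store.read(0)
    store.write(2, b"c" * 4096)
    store.flush()
    print("block device reads:", ssd.reads, "writes:", ssd.writes)
```

Callers address the full block range of the backing device through `MemoryCachedStore`, so the small memory tier stays transparent to them, which mirrors the abstract's claim that memory storage can serve as part of mass storage despite its limited capacity.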
Paper Citation
in Harvard Style
Oikawa S. (2015). Towards High Performance Big Data Processing by Making Use of Non-volatile Memory. In Proceedings of the 5th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-758-104-5, pages 529-534. DOI: 10.5220/0005463605290534
in Bibtex Style
@conference{closer15,
author={Shuichi Oikawa},
title={Towards High Performance Big Data Processing by Making Use of Non-volatile Memory},
booktitle={Proceedings of the 5th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER},
year={2015},
pages={529-534},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005463605290534},
isbn={978-989-758-104-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 5th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER
TI - Towards High Performance Big Data Processing by Making Use of Non-volatile Memory
SN - 978-989-758-104-5
AU - Oikawa S.
PY - 2015
SP - 529
EP - 534
DO - 10.5220/0005463605290534