Yang Li, Li Guo, Yike Guo


The advent of the cloud era has yielded new ways of storing, accessing and managing data. Cloud storage services enable data to be stored over the internet in an inexpensive, secure, fast, reliable and highly scalable manner. Although giant providers such as Amazon and Google have made a great success of their services, many enterprises and scientists are still unable to make the transition into the cloud environment because of often insurmountable issues of privacy, data protection and vendor lock-in. These issues demand that anyone be able to set up or build their own storage solutions, independent of commercially available services. However, the question persists as to how to provide an effective cloud storage service with regard to system architecture, resource management mechanisms, data reliability and durability, and appropriate pricing models. The aim of this research is to present an in-depth understanding and analysis of the key features of generic cloud storage services, and of how such services should be constructed and provided. This is achieved by presenting the design rationale and implementation details of a real cloud storage system (CACSS). The way in which different technologies can be combined to provide a single high-performance, highly scalable and reliable cloud storage system is also detailed. This research serves as a knowledge source for inexperienced cloud providers, enabling them to set up their own cloud storage services swiftly.



Paper Citation

in Harvard Style

Li Y., Guo L. and Guo Y. (2012). CACSS: TOWARDS A GENERIC CLOUD STORAGE SERVICE. In Proceedings of the 2nd International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-8565-05-1, pages 27-36. DOI: 10.5220/0003910800270036

