Economical Aspects of Database Sharding

Uwe Hohenstein, Michael C. Jaeger


Database sharding is a technique for handling large data volumes efficiently by spreading data over a large number of machines. Sharding is not only an integral part of NoSQL products but is also relevant for relational database servers when applications prefer standard relational database technology yet have to scale out with massive data. Sharding relational databases is especially useful in a public cloud because of the pay-per-use model, which already includes licenses, and the fast provisioning of virtually unlimited servers. In this paper, we investigate relational database sharding, focusing in detail on one important aspect of cloud computing: its economics. We discuss the difficulties of achieving cost savings with database sharding and present some surprising findings on how to reduce costs.



Paper Citation

in Harvard Style

Hohenstein, U. and Jaeger, M. C. (2014). Economical Aspects of Database Sharding. In Proceedings of the 4th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-758-019-2, pages 417-424. DOI: 10.5220/0004944604170424
