new database servers in a short time.
In this paper, we investigated database sharding
of relational database systems (RDBS) in the cloud
from the perspective of cost and performance. We
obtained some surprising results.
First, splitting a shard into two equally sized
shards is not always advantageous from a cost
perspective. Other split factors such as 80/20,
combined with a merge operation, yield better
results in our scenarios. Moreover, we demonstrated
that achieving optimal costs is difficult in general.
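
To make the notion of a split factor concrete, the following minimal sketch splits a shard's key range at a given ratio and merges two adjacent shards. The names `Shard`, `split`, and `merge` are hypothetical and introduced only for illustration. The sketch assumes range-based sharding over integer keys with uniformly distributed data, and it does not reproduce the cost model used in the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Shard:
    """Hypothetical shard covering the half-open key range [low, high)."""
    low: int
    high: int

    @property
    def size(self) -> int:
        return self.high - self.low


def split(shard: Shard, factor: float) -> Tuple[Shard, Shard]:
    """Split a shard so that the first part covers roughly `factor` of the keys."""
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be strictly between 0 and 1")
    cut = shard.low + round(shard.size * factor)
    if cut <= shard.low or cut >= shard.high:
        raise ValueError("shard is too small to split at this factor")
    return Shard(shard.low, cut), Shard(cut, shard.high)


def merge(left: Shard, right: Shard) -> Shard:
    """Merge two adjacent shards back into a single shard."""
    if left.high != right.low:
        raise ValueError("only adjacent shards can be merged")
    return Shard(left.low, right.high)


# Assumed example range of 1000 keys.
big, small = split(Shard(0, 1000), 0.8)       # 80/20 split
even_a, even_b = split(Shard(0, 1000), 0.5)   # 50/50 split
restored = merge(big, small)                  # undo the 80/20 split
```

Whether the uneven or the even split is cheaper depends on the provider's pricing tiers and on how the load develops afterwards, which is why the sketch deliberately leaves cost out.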
Furthermore, performance measurements show
that parallelizing queries to several shards is not
always better than querying a single database of the
same total size.
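
The two access paths compared here can be sketched as follows: a query sent in parallel to several shards, with the partial results combined on the client side, versus the same query against a single database. In-memory lists stand in for databases; `query_shard`, `query_all_shards`, and the sample data are hypothetical. The sketch only shows the structure of the two paths and says nothing about actual timings, which depend on factors such as connection overhead and result sizes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

Row = Dict[str, int]


def query_shard(rows: Sequence[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Stand-in for running one query against one database."""
    return [row for row in rows if predicate(row)]


def query_all_shards(
    shards: Sequence[Sequence[Row]], predicate: Callable[[Row], bool]
) -> List[Row]:
    """Scatter the query to all shards in parallel and gather the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # map() preserves the order of the shards in its results.
        partials = pool.map(lambda rows: query_shard(rows, predicate), shards)
        return [row for partial in partials for row in partial]


# Assumed sample data: one database of 100 rows, or four shards of 25 rows each.
single_db = [{"id": i, "amount": i % 7} for i in range(100)]
shards = [single_db[i:i + 25] for i in range(0, 100, 25)]


def wanted(row: Row) -> bool:
    return row["amount"] == 3


from_single = query_shard(single_db, wanted)
from_shards = query_all_shards(shards, wanted)
```

Both paths return the same rows; the sharded path additionally pays for opening several connections and merging partial results, which is one plausible reason why it is not always faster.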
In future work, we intend to elaborate on
strategies for splitting shards optimally according to
incoming load. In particular, cost/performance
trade-offs require further investigation. Finally, we
want to apply our ideas to multi-tenancy.
CLOSER 2014 - 4th International Conference on Cloud Computing and Services Science