
[Figure 3 diagram; labels: Horizontal (self-)tuning, Vertical (self-)tuning, Application, Workload, Requirements]
Figure 3: Clustering applications with similar requirements and workloads (divide and conquer) to ease tuning.
Clustering based on Physical Resources. Finding
the right cluster size, deciding when to scale, and
minimizing the system overhead that results from
adding new nodes should all be addressed. All CDM
solutions support scalability to accommodate
application growth. However, some systems, such as
Cassandra and Yahoo! Pnuts, show degraded
performance and need time to stabilize after new
nodes are added (Cooper et al., 2010). For this
reason, and to save cost and energy, increasing the
cluster size should not always be the first suggested
solution for performance problems. Since the
assumption of homogeneous clusters does not hold in
real applications, CDM is evolving towards adopting
heterogeneity. In addition, systems that support
heterogeneity make it possible to improve
performance by adding nodes with higher capacity
instead of having to upgrade all nodes within a
cluster at once (DeCandia et al., 2007). However,
performance improves only with resource-aware
scheduling and load balancing. Various studies
(Rasool and Down, 2012; Ahmad et al., 2012) show
that current implementations of data-intensive
applications do not take heterogeneous nodes into
consideration and show degraded performance.
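To illustrate what resource-aware load balancing can mean on a heterogeneous cluster, the following is a minimal sketch, not a method taken from the cited studies. It assumes each node has a known relative capacity and each task a known cost; the names Node, capacity, and assign are hypothetical. Each task is placed greedily on the node whose load-to-capacity ratio would be lowest after placement, so a node with three times the capacity ends up with roughly three times the load.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity: float  # assumed relative capacity (hypothetical unit)
    load: float = 0.0


def assign(nodes, task_costs):
    """Greedy capacity-aware placement.

    Tasks are handled from most to least expensive. Each one goes to
    the node with the lowest (load + cost) / capacity ratio, so larger
    nodes absorb proportionally more work.
    """
    placement = {}
    ordered = sorted(task_costs.items(), key=lambda kv: -kv[1])
    for task_id, cost in ordered:
        target = min(nodes, key=lambda n: (n.load + cost) / n.capacity)
        target.load += cost
        placement[task_id] = target.name
    return placement
```

A capacity-unaware balancer would split eight equal tasks four and four between a small and a large node; with this sketch and capacities of 1 and 3, the split follows the capacity ratio instead.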
Alternatives for Cluster Structures and Clustering
Strategies. While the outlined proposal of logical
clusters is already supported in a static manner,
our research focuses on dynamic clustering to
support workload-based optimization. Dynamic
clustering requires support for splitting, merging,
and re-computing clusters. Furthermore, an LC may
contain other LCs, so hierarchical structures are
desirable both for the clustering process and for
the mapping to clustering criteria. Finally, the
possibility of building clusters on only some of the
layers, or independent clusters across layers,
should be considered.
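The split, merge, and nesting requirements can be sketched as a small tree structure. This is an illustrative assumption, not the paper's implementation: the class LogicalCluster, its fields, and the predicate-based split criterion are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare clusters by identity, not by content
class LogicalCluster:
    name: str
    apps: set = field(default_factory=set)
    children: list = field(default_factory=list)

    def all_apps(self):
        """Applications in this LC and in all nested LCs."""
        result = set(self.apps)
        for child in self.children:
            result |= child.all_apps()
        return result

    def split(self, predicate, child_name):
        """Move the apps matching predicate into a new nested LC."""
        moved = {a for a in self.apps if predicate(a)}
        self.apps -= moved
        child = LogicalCluster(child_name, moved)
        self.children.append(child)
        return child

    def merge_child(self, child):
        """Fold a nested LC and its whole subtree back into this LC."""
        self.children.remove(child)
        self.apps |= child.all_apps()
```

In a dynamic setting, the predicate passed to split would be derived from observed workload characteristics, and merge_child would be called when the workloads of two clusters converge again.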
4 CONCLUSIONS
In this paper, we outlined the tuning tradeoff
decisions and optimization goals for cloud data
management systems. The complexity of (self-)tuning
for these systems results from their highly
distributed, multi-layered architecture.
(Self-)tuning becomes even more complicated when one
cloud database cluster serves one application with
shifting workloads or several applications with
multiple workloads. To support (self-)tuning in such
cases, we suggested a general model for creating
logical clusters within a cloud DB system. To create
logical clusters, we rely on clustering applications
based on data, workload, optimization goals, and
thresholds. Finally, we briefly discussed different
problems and alternatives for this model.
REFERENCES
Ahmad, F., Chakradhar, S. T., Raghunathan, A., and
Vijaykumar, T. N. (2012). Tarazu: Optimizing
MapReduce On Heterogeneous Clusters. SIGARCH,
40(1):61–74.
Bostoen, T., Mullender, S., and Berbers, Y. (2012). Analy-
sis of disk power management for data-center storage
systems. In e-Energy, pages 2:1–2:10. ACM.
Capriolo, E. (2011). Cassandra High Performance Cook-
book. Packt Publishing.
Chen, Y., Ganapathi, A. S., Griffith, R., and Katz, R. H.
(2010). Towards understanding cloud performance
tradeoffs using statistical workload analysis and re-
play. Technical report, EECS Department, University
of California, Berkeley.
Cooper, B. F., Silberstein, A., Tam, E., Ramakrishnan, R.,
and Sears, R. (2010). Benchmarking Cloud Serving
Systems with YCSB. In SoCC, pages 143–154. ACM.
Dahbur, K., Mohammad, B., and Tarakji, A. B. (2011). A
survey of risks, threats and vulnerabilities in cloud
Clustering the Cloud - A Model for (Self-)Tuning of Cloud Data Management Systems