Figure 3: Two-level thresholds.
only indicate resource under-utilization or that an SLO violation has already occurred. Thus, to prevent these performance degradations in advance, additional boundaries have to be defined. This mechanism establishes two levels of threshold ranges for each metric (CPU and response time) based on the bounds defined in the SLO by the user.
In Figure 3, these two extra thresholds, called predictive and reactive, both with upper and lower bounds, create two "head-rooms" between them and the SLO threshold. The predictive head-room H1 is intended to give early warning of future workload alterations: when a metric exceeds its predictive bounds, the chances of triggering a scaling action increase proportionally to the metric's weight (denoted by s_bck and s_out in Algorithm 2, lines 12-17). Otherwise, the scaling chances drop in the same proportion (line 20). The reactive head-room H2 is used to trigger scaling actions if the workload presents an increasing or decreasing variation (lines 24-26), which may cause SLO violations if such a trend continues. This mechanism, in conjunction with the workload trend estimation, allows us to better analyze the evolution of performance fluctuations and, as a consequence, improves the accuracy of our decisions by reacting in advance. In the future, these two levels of thresholds could be adjusted depending on the hardware configuration of each provisioned VM, as introduced in (Beloglazov and Buyya, 2010). To sum up, the feedback algorithm triggers a scaling action when a series of conditions are satisfied: (i) no previous scaling action has been taken over the last 15min; (ii) the recent monitoring data have to exceed the predictive and reactive threshold ranges; (iii) the workload trend has to follow a constant pattern (increasing or decreasing). Although the combination of these techniques improves the accuracy of our measurements, the heterogeneous nature of the VM instances requires more flexible provisioning algorithms, as pointed out in (Jiang, 2012).
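The three conditions above can be sketched as a single decision routine. This is a hedged reconstruction from the text, not the actual Algorithm 2: the function and dictionary names are hypothetical, and the 15-minute cooldown is the only timing the paper states.

```python
import time

# Hypothetical sketch of the summarized feedback decision; the names and
# data layout are assumptions, only the three conditions come from the text.

COOLDOWN = 15 * 60  # condition (i): no scaling within 15 min of the last action

def should_scale(metrics, thresholds, trend, last_action_ts, now=None):
    """Return 'out', 'back', or None following conditions (i)-(iii)."""
    now = now if now is not None else time.time()
    # (i) respect the cooldown since the previous scaling action
    if now - last_action_ts < COOLDOWN:
        return None
    # (ii) recent data must exceed both predictive and reactive ranges
    exceeds_upper = all(
        metrics[m] > thresholds[m]["predictive_upper"]
        and metrics[m] > thresholds[m]["reactive_upper"]
        for m in metrics)
    exceeds_lower = all(
        metrics[m] < thresholds[m]["predictive_lower"]
        and metrics[m] < thresholds[m]["reactive_lower"]
        for m in metrics)
    # (iii) the workload trend must follow a constant pattern
    if exceeds_upper and trend == "increasing":
        return "out"   # add capacity (s_out)
    if exceeds_lower and trend == "decreasing":
        return "back"  # release capacity (s_bck)
    return None
```

The ordering matters: checking the cooldown first prevents oscillation, and requiring the trend to agree with the threshold crossing filters out short-lived spikes.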
4.3 Dynamic Load Balancing Weights
The problem we consider here is the heterogeneity of
cloud platforms. Different VMs have different perfor-
mance characteristics, even when their specifications
from the cloud vendor are the same. This issue can be
addressed through various load balancing techniques,
like assigning weights to the backend servers or tak-
ing into account the current number of connections
that each server handles. Furthermore, the perfor-
mance behavior of the virtual servers may also fluc-
tuate, either due to changes in the application’s usage
patterns, or due to changes related to the hosting of
the virtual servers (e.g., VM migration).
In order to address these issues in ConPaaS, we implemented a weighted load balancing system in which the weights of the servers are automatically re-adjusted at regular intervals based on the monitoring data (e.g., response time, request rate, and CPU usage). This method initially assigns the same weight to each backend server. The weights are then periodically adjusted (in our experiments, every 15min) proportionally to the differences among the servers' average response times during this time interval. By adding this technique to the feedback-based algorithm, we noticed a performance improvement when running the benchmarks.
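A minimal sketch of such a re-adjustment step is shown below. The inverse-proportional rule is an illustrative assumption: the text only states that weights change proportionally to the differences among the servers' average response times, without giving the exact formula.

```python
# Illustrative sketch, not the ConPaaS implementation: redistribute
# load-balancer weights based on each backend's average response time
# over the last interval (e.g. 15 min). The inverse-proportional rule
# below is an assumption chosen for illustration.

def readjust_weights(avg_response_times, total=100.0):
    """Give slower servers (higher average response time) a smaller
    share of the traffic; weights sum to `total`."""
    inverses = {srv: 1.0 / rt for srv, rt in avg_response_times.items()}
    norm = sum(inverses.values())
    return {srv: total * inv / norm for srv, inv in inverses.items()}
```

For example, a backend that answers twice as slowly as its peer would receive roughly half that peer's weight, so the faster VM absorbs proportionally more of the load.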
5 EVALUATION
To compare the provisioning algorithms described
above, we ran experiments on two infrastructures: a
homogeneous one (the DAS-4, a multi-cluster system
hosted by universities in The Netherlands (Advanced School for Computing and Imaging (ASCI))) and a heterogeneous one (Amazon Elastic Compute Cloud (EC2)). The goal of our experiments was to compare
the algorithms by how well they fulfill the SLOs and
by the amount of resources they allocate.
Testbed Configuration: As a representative sce-
nario, we deployed the MediaWiki application using
ConPaaS on both infrastructures, and we ran the Wik-
ibench tools with a 10% sample of a real Wikipedia
access trace for 24hours. We configured the exper-
iments as follows: a monitoring window of 5min, an SLO of 700ms at the service's side (denoted by a red line in the figures), and the same statistically-chosen performance threshold ranges for response time and CPU utilization. Note that the weighted load-balancing
provisioning technique was only evaluated on the het-
erogeneous platform Amazon EC2, as it only brings
improvements in environments where VMs may have
different hardware configurations.
CLOSER 2014 - 4th International Conference on Cloud Computing and Services Science