Task Placement in a Cloud with Case-based Reasoning
Eric Schulte-Zurhausen and Mirjam Minor
Institute of Informatics, Goethe University, Robert-Mayer-Str. 10, Frankfurt am Main, Germany
Keywords:
Workflow, Cloud Management, Task Placement.
Abstract:
Moving workflow management to the cloud raises novel, exciting opportunities for rapid scalability of work-
flow execution. Instead of running a fixed number of workflow engines on an invariant cluster of physical
machines, both physical and virtual resources can be scaled rapidly. Furthermore, the actual state of the re-
sources gained from cloud monitoring tools can be used to schedule workload, migrate workload or conduct
split and join operations for workload at run time. However, having so many options for distributing workload
forms a computationally complex configuration problem which we call the task placement problem.
In this paper, we present a case-based framework addressing the task placement problem by interleaving
workflow management and cloud management. In addition to traditional workflow and cloud management
operations it provides a set of task internal operations for workload distribution.
1 INTRODUCTION
Rapid scalability is one of the most important char-
acteristics of cloud computing according to the NIST
definition (US Department of Commerce, 2011). In
classical workflow management, scalability of work-
flow execution is achieved by traditional load balanc-
ing. A load balancer distributes the workload across
multiple workflow engines in order to maintain an
acceptable performance level of the workflow man-
agement system (WFMS) (Jin et al., 2001). Mov-
ing workflow management to the cloud raises novel,
exciting opportunities for rapid scalability of work-
flow execution. Instead of running a fixed number
of workflow engines on an invariant cluster of phys-
ical machines, both physical and virtual resources
can be scaled rapidly. Furthermore, the actual state
of the resources gained from cloud monitoring tools
can be used to schedule workload, migrate workload
or conduct split and join operations for workload at
run time. However, having so many options for dis-
tributing workload forms a computationally complex
configuration problem which we call the task placement problem. The Workflow Management Coalition (Workflow Management Coalition, 1999) defines a workflow as follows: "The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules."
A task, also called activity, is defined as: "A description of a piece of work that forms one logical step within a process. An activity may be a manual activity, which does not support computer automation, or a workflow (automated) activity. A workflow activity requires human and/or machine resource(s) to support process execution."
In this work we use the term task configuration. A task configuration is a set of tasks with their relationships and also a set of parameters. These parameters are not needed for the tasks themselves but for the handling of the tasks by the system.
A cloud configuration is, in our work, a set of PMs with their resources and a set of VMs assigned to the PMs and sharing some or all of the resources owned by the PM.
We define a task placement as the assignment of automated tasks to virtual machines (VMs) and furthermore the assignment of VMs to physical machines (PMs). So a task placement is a mapping of a task configuration to a cloud configuration. Our task placement problem includes two kinds of problems: the job placement problem as in (Sharma et al., 2011) and a VM placement problem as in (Jiang et al., 2012). The job placement problem is the problem of assigning jobs to VMs and the VM placement problem is the problem of assigning VMs to PMs.
In this paper we present a case-based framework ad-
dressing the task placement problem by interleaving
workflow management and cloud management.

Schulte-Zurhausen E. and Minor M.
Task Placement in a Cloud with Case-based Reasoning.
DOI: 10.5220/0004944203230328
In Proceedings of the 4th International Conference on Cloud Computing and Services Science (CLOSER-2014), pages 323-328
ISBN: 978-989-758-019-2
Copyright © 2014 SCITEPRESS (Science and Technology Publications, Lda.)

In addition to traditional workflow and cloud management operations, it provides a set of task-internal operations for workload distribution. The notion of a case
is used to represent a task placement problem. A stack of placement operators allows organizing the case representation in layers. Instead of solving the task placement as a computationally complex optimization problem, case-based reasoning (Aamodt and Plaza, 1994) is used to retrieve similar task placements from the past that can be reused in the current situation.
In Section 2, we discuss related work from workflow management and cloud management. Section 3 summarizes and extends a layered VM placement model from Maurer et al. (Maurer et al., 2013) that we use for a task placement model in Section 4. Section 5
describes a case-based recommender system that sug-
gests a set of promising placement operators based
on the experience stored in the case base. Section 6
addresses a planned evaluation. Finally, we discuss
future work and draw some conclusions in Section 7.
2 RELATED WORK
Our definition of a task placement is different from the definition of a task placement used by Sharma et al. (Sharma et al., 2011). In their work the tasks are jobs placed on VMs, but the jobs are independent from each other, whereas workflow tasks are ordered and their execution order is strict. In their work they describe a task placement constraint: not all tasks or jobs can be placed on all available VMs due to kernel, operating system or other restrictions. The task placement constraint restricts the task placement.
Tang et al. describe in their work (Tang et al., 2007) the problem of allocating applications with dynamically changing demands to VMs and of deciding how many VMs must be started. They call this problem application placement; it is similar to the problem of Sharma et al. (Sharma et al., 2011).
The problem of assigning workflow tasks to VMs is not new. Wu et al. describe in their work (Wu et al., 2013) their approach to solve the task placement problem for scientific workflows with meta-heuristics. They call this Task-to-VM assignment and implement SwinDeW-C, a cloud workflow system. In their work (Maurer et al., 2013), Maurer et al. present an approach to manage a cloud configuration on different layers. They divide the resource allocation into three parts which they call escalation levels.
The first level is for individual applications. The possible actions are:
- Increase / Decrease incoming bandwidth share by x%
- Increase / Decrease outgoing bandwidth share by x%
- Increase / Decrease memory by x%
- Increase / Decrease CPU share by x%
- Add / Remove allocated storage by x%
- Outsource (move application) to other cloud
- Insource (accept application) from other cloud
- Migrate application to different VM
The second level is for VMs. The possible actions are:
- Increase / Decrease incoming bandwidth share by x%
- Increase / Decrease outgoing bandwidth share by x%
- Increase / Decrease memory by x%
- Add / Remove allocated storage by x%
- Increase / Decrease CPU share by x%
- Outsource (move VM) to other cloud
- Insource (accept VM) from other cloud
- Migrate VM to different PM
And the third level is for physical machines:
- Add x computing nodes
- Remove x computing nodes
Maurer et al. use the escalation levels to guarantee the
scalability and efficient resource utilization of their
system in order to fulfill the Service Level Agree-
ments. To manage the resources they used two dif-
ferent types of knowledge management - a rule-based
and a case-based reasoning approach.
3 MANIPULATION OPERATORS
In our approach we use a system based on the escalation levels introduced by Maurer et al. (Maurer et al., 2013). We extend this approach by so-called manipulation operators (MOs) that manipulate the task and cloud configurations, which will be introduced in Section 4. Furthermore, we add the workflow level on top of Maurer's escalation levels, namely the application, VM and PM levels. The resulting hierarchy of levels is as follows:
1: Workflow level
2: Application level
3: Virtual machine level
4: Physical machine level
CLOSER2014-4thInternationalConferenceonCloudComputingandServicesScience
324
We denote the three lower levels as the resource level.
The resource level includes the application, virtual
and physical machine level. The workflow level sup-
ports operations to manipulate a workflow so that a
better utilization of the resources can be achieved.
The following workflow operations also have a hier-
archical order:
1: Task split / join
2: Data split / join
3: Migrate task to different VM
The task split operation divides a task into a bunch of parallel tasks with different parameters. For example: I = {image_1, ..., image_n} is a set of images. Two different, mutually independent types of render algorithms r_1(I), r_2(I) are given. Task t_1(r_1(I), r_2(I)) should perform both algorithms and task t_2 evaluates the result. A task split means that we remove t_1 and replace it with two tasks t_11 and t_12, each of which performs only one render algorithm: t_11(r_1(I)), t_12(r_2(I)). Because of the split it is now possible to run t_11 and t_12 on different VMs and a duration speed up is possible. The task join operation is the inverse action. With this operator the tasks t_11 and t_12 would be joined to t_1. A join can be useful to reduce the number of running VMs or to reduce the number of tasks that run on a single VM.
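The task split / join idea can be sketched as follows. This is an illustrative sketch with hypothetical names (task dictionaries, placeholder render functions), not the paper's implementation: a task bundling several independent algorithms is split into one task per algorithm, so the subtasks can be placed on different VMs.

```python
def render_r1(images):
    # placeholder for the first render algorithm
    return [("r1", img) for img in images]

def render_r2(images):
    # placeholder for the second render algorithm
    return [("r2", img) for img in images]

def task_split(task):
    """Split a task t1 that performs several independent algorithms
    into one task per algorithm (t11, t12, ...)."""
    return [{"name": f"{task['name']}{i + 1}", "algorithms": [alg]}
            for i, alg in enumerate(task["algorithms"])]

def task_join(subtasks, name):
    """Inverse operation: merge the subtasks back into one task."""
    return {"name": name,
            "algorithms": [alg for t in subtasks for alg in t["algorithms"]]}

t1 = {"name": "t1", "algorithms": [render_r1, render_r2]}
t11, t12 = task_split(t1)   # each subtask carries exactly one algorithm
assert task_join([t11, t12], "t1") == t1
```

The round-trip assertion mirrors the text: the join is the exact inverse of the split.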
The data split operation replicates a task and divides the data that is to be processed. Similar to the previous example, I = {image_1, ..., image_n} is a list of images and r_1(I) is an algorithm to render images. The task t_1 is the task to render all images: t_1(r_1(image_1, ..., image_n)). The data split operation replicates t_1 into a bunch of parallel tasks where each task processes a subset I' ⊆ I, for example t_11(r_1(image_1, ..., image_i)), t_12(r_1(image_{i+1}, ..., image_n)). The data join operation is the inverse action, similar to the task join operation. It combines t_11 and t_12 to t_1.
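A minimal sketch of the data split / join operations, assuming hypothetical helper names (the chunking strategy is our own choice; the paper does not prescribe one): the input list I is partitioned into disjoint subsets, each replica of t_1 processes one subset, and the join concatenates the partial results.

```python
def data_split(images, n_parts):
    """Split the input list into n_parts roughly equal, disjoint chunks."""
    k, r = divmod(len(images), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + k + (1 if i < r else 0)
        chunks.append(images[start:end])
        start = end
    return chunks

def data_join(results):
    """Inverse operation: concatenate the partial results in order."""
    return [item for part in results for item in part]

images = [f"image{i}" for i in range(1, 8)]   # I = {image_1, ..., image_7}
chunks = data_split(images, 2)                # inputs for t_11 and t_12
rendered = [[img.upper() for img in c] for c in chunks]  # stand-in for r_1
assert data_join(rendered) == [img.upper() for img in images]
```

The final assertion expresses the invariant from the text: a data split followed by a data join yields the same result as the unsplit task.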
The migrate operation migrates a task between VMs. We include this operation despite the fact that the migration of a task can be considered an application migration. This decision makes the border between the application level and the workflow level more distinct.
4 PLACEMENT MODEL
We model the task placement problem as a mapping
problem between a task configuration and a cloud
configuration. A task configuration is a tuple TC = {T, P, F}. T is a set of tasks {t_1, t_2, ..., t_n}. P is a set of tuples {p_1, p_2, ..., p_n} whose elements p_i describe the parameters for task t_i. The parameters for each task can be the expected execution time on a default VM for each MBit of data volume, the expected upcoming data volume, the system prerequisites, whether a task split is possible or not, and whether a data split is possible or not. F is a partial order on T specifying a precedence relation on T. F is induced by the order of tasks given in the workflow. In case that t_1 and t_2 are ordered in a sequence within the workflow, t_2 can only be started when t_1 has finished execution and, thus, (t_1, t_2) ∈ F.
The cloud configuration CC is a tuple {PM, VM, CCP}. PM is a set of physical machines. Each physical machine is represented as a tuple of properties (incoming bandwidth, outgoing bandwidth, CPU, memory, storage, costs per time unit). This is the representation of the hardware offered by the physical machine. The costs per time unit is an abstract value which represents the costs, particularly energy, that occur when the machine is running. VM is, similar to PM, a set of virtual machines. Each virtual machine is represented as a tuple of properties (incoming bandwidth, outgoing bandwidth, CPU, memory, storage, speed up), similar to a physical machine except for the speed up. The speed up represents a factor that influences the expected execution time of a task. The expected execution time is determined relative to a default VM configuration as follows. Let vm_1, vm_2 ∈ VM and let vm_1 be the default VM, so the speed up sp_1 of vm_1 is 1. Let the quantity of the available resources for vm_2 be higher than the quantity of the available resources for vm_1. Then the speed up sp_2 is ≤ sp_1. We consider the run time of a task t_i roughly as the product of the speed up and the expected execution time r_i, so that r_i · sp_2 ≤ r_i · sp_1.
The cloud configuration placement CCP is a set of tuples of assignments from VMs to PMs. Each VM vm_i ∈ VM must be assigned to exactly one pm_j ∈ PM, but a PM can contain more than one VM, for example {(vm_1, pm_1), (vm_2, pm_1)}.
Now we consider the task placement as a tuple {TC, CC, TP}. TC is the task configuration and CC is the cloud configuration. TP is a set of tuples which represent the assignment of the tasks to a VM and their specified order. TP = {tp_1, ..., tp_i} with tp_i = {vm_i, vmt_i, vmtf_i}, where vm_i ∈ VM, vmt_i ⊆ T denotes the set of tasks that is actually running on vm_i, and vmtf_i ⊆ T × T specifies the order constraints on these tasks.
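The cloud side of the model, CC = {PM, VM, CCP}, can be sketched in the same style. Again the field names are our own assumptions; check_ccp is a hypothetical helper encoding the constraint that every VM is assigned to exactly one PM:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PhysicalMachine:
    name: str
    cpu: int
    memory_gb: int
    cost_per_time_unit: float   # abstract value, e.g. energy

@dataclass(frozen=True)
class VirtualMachine:
    name: str
    cpu: int
    memory_gb: int
    speed_up: float             # 1.0 for the default VM, smaller means faster

def check_ccp(vms, pms, ccp):
    """CCP: every VM is assigned to exactly one PM; a PM may host many VMs."""
    assigned = [vm for (vm, _pm) in ccp]
    return (sorted(assigned) == sorted(v.name for v in vms)
            and all(pm in {p.name for p in pms} for (_vm, pm) in ccp))

pm1 = PhysicalMachine("pm1", cpu=16, memory_gb=64, cost_per_time_unit=2.0)
vm1 = VirtualMachine("vm1", cpu=4, memory_gb=8, speed_up=1.0)
vm2 = VirtualMachine("vm2", cpu=8, memory_gb=16, speed_up=0.5)
ccp = {("vm1", "pm1"), ("vm2", "pm1")}     # both VMs share pm1, as in the text
tp = {"vm1": ["t1"], "vm2": ["t2", "t3"]}  # TP: ordered task lists per VM
assert check_ccp([vm1, vm2], [pm1], ccp)
```

The tp dictionary corresponds to the vmt_i sets; the list order stands in for the vmtf_i order constraints.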
TaskPlacementinaCloudwithCase-basedReasoning
325
5 CASE-BASED TASK
PLACEMENT
In case-based reasoning, a problem is solved by find-
ing a similar case stored in the case base and reusing
it in the new problem situation (Aamodt and Plaza,
1994). A case is defined as an old problem including
a solution for this distinct problem. In our approach a case is a pair of two task placements tp_t and tp_{t+1}. The problem task placement tp_t is the placement at time t and the solution task placement tp_{t+1} is a reconfigured version of tp_t at point t+1. In addition to tp_{t+1}, the solution includes the list of manipulation operators that is required to transform tp_t into tp_{t+1}. Reusing such a case means replaying the manipulation operators that have been used to transform tp_t into tp_{t+1} for the current situation, that is, the current task placement. The similarity of two task placements can be determined by measuring the number of manipulation operations that is required to transform the one into the other. In addition, the expected costs, the expected performance and the expected number of SLA violations can be compared.
Aamodt and Plaza describe the case-based reasoning (CBR) process as a cycle with four steps, as shown in Figure 1. The four steps are:
- Retrieve the most similar case or cases
- Reuse the information and knowledge in that case to solve the problem
- Revise the proposed solution
- Retain the parts of this experience likely to be useful for future problem solving
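The four phases can be sketched as a single function. This is a skeleton only, with the retrieval distance, operator replay and evaluation passed in as stubs (all names are our own, not the paper's implementation):

```python
def cbr_cycle(problem, case_base, distance, apply_operators, evaluate):
    # Retrieve: find the most similar stored case
    best = min(case_base, key=lambda case: distance(problem, case["problem"]))
    # Reuse: replay the case's manipulation operators on the current placement
    candidate = apply_operators(problem, best["operators"])
    # Revise: evaluate the proposed solution
    score = evaluate(candidate)
    # Retain: store the new experience for future problem solving
    case_base.append({"problem": problem, "solution": candidate,
                      "operators": best["operators"], "score": score})
    return candidate
```

With a toy distance and operator model, the nearest case's operators are replayed and the new case is retained, growing the case base by one.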
In the retrieve phase, one or more cases with the highest similarity to the current problem are chosen. The similarity of two task placements depends mainly on the cloud configuration, the SLA requirements and the task configuration. In a first step we assume that the SLA requirements are constant. We further assume that every task configuration in the case base can be traced back to the same workflow template through the workflow manipulation operators introduced in Section 3. But the approach is not limited to that. In (Minor et al., 2007) it is shown how to define the similarity between two workflows. The idea is to represent a workflow as a graph and determine the similarity between two graphs.
With the assumption of a constant set of SLAs and a fixed workflow template we can define a neighborhood between task placements. The neighborhood between two task placements is defined by the manipulation operations. Let TP_1 = {TC_1, CC_1, TP_1} and TP_2 = {TC_1, CC_1, TP_2} be two different task placements with the same task and cloud configuration but

Figure 1: CBR cycle introduced by Aamodt and Plaza.

different task-to-VM assignments, and let Dis(TP_1, TP_2) be a function which determines the distance between two task placements. Let TP_1 = {(vm_1, (t_1)), (vm_2, (t_2, t_3))} and TP_2 = {(vm_1, (t_1, t_2)), (vm_2, (t_3))}. The distance between them is the number of manipulation operations that is required to convert TP_1 into TP_2. The difference in this example is that the task t_2 is running on vm_2 in TP_1 but on vm_1 in TP_2. With the operation "Migrate application to different VM" and the assumption that a task can be considered as an application, Dis(TP_1, TP_2) = 1.
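The example distance can be sketched as a small function. This is a simplification of the neighborhood idea under an assumption we state explicitly: it only counts tasks that run on a different VM (each needing one "migrate application to different VM" operation) and ignores split and join operators.

```python
def placement_distance(tp1, tp2):
    """Count tasks that run on a different VM in tp2 than in tp1."""
    # invert the mapping: task -> VM it runs on
    on1 = {t: vm for vm, tasks in tp1.items() for t in tasks}
    on2 = {t: vm for vm, tasks in tp2.items() for t in tasks}
    return sum(1 for t in on1 if on1[t] != on2.get(t))

tp1 = {"vm1": ["t1"], "vm2": ["t2", "t3"]}   # TP_1 from the text
tp2 = {"vm1": ["t1", "t2"], "vm2": ["t3"]}   # TP_2 from the text
assert placement_distance(tp1, tp2) == 1     # only t2 migrates
```

The assertion reproduces the example: Dis(TP_1, TP_2) = 1 because only t_2 changes its VM.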
After the retrieve phase follows the reuse phase.
In this phase the task placement will be modified by
the manipulation operators listed in the solution part
of the case. In the revise phase the new case will be
evaluated and in the retain phase the new case will be
stored in the case base.
6 EVALUATION OF THE
CONCEPT
To evaluate the case-based approach we are planning to measure three values for an experimental set of workflows: the performance of the workflow execution p, the overall costs oc and the robustness ro. At this point, the set T includes all tasks of a workflow. That means that one task placement covers the whole workflow. But we expect that when the workflow is divided into parts of currently active tasks and a task placement is built for each of these parts, the overall costs will be reduced, because the needed infrastructure in terms of VMs and PMs will be very close to the actually required resources. When the workflow is divided into several task placements, we call each placement a Task Placement Segment (TPS). However, the problem with TPSs is that when there are many TPSs and subsequent TPSs differ strongly from each other, it will take more time to reconfigure the cloud and thus the performance is reduced.
The robustness in our case is an indicator for measuring the changes within a sequence of TPSs. If the difference between subsequent pairs of TPSs is high, the robustness will be low. To measure the deviation of a TPS from its predecessor, we determine the number n of needed manipulation operations to transform one cloud configuration into another, resulting in q = (1 / (n+1)) · 100. The robustness value ro of the entire sequence aggregates the particular deviation values, for instance by a weighted sum with a logarithmic factor.
The performance of the workflow execution depends on the expected execution time r_i of the tasks, the speed up sp_j of the VMs, the reconfiguration time wft_k of the workflow level operators and the reconfiguration time cct_k of the resource level operators of the current TPS tp_k. Each TPS includes one task placement tp_k. The performance is described as: p = Σ_{k=1}^{n} ( max_{i ∈ tp_k} ( sp_i · Σ_{j ∈ vmt_i} r_j ) + wft_k + cct_k ). The value of the expected execution time is not just the sum of all expected execution times within a TPS, because VMs execute tasks in parallel, so we search for the maximum makespan of the VMs, which is max_{i ∈ tp_k} ( sp_i · Σ_{j ∈ vmt_i} r_j ). In addition, the speed up factor might consider the type of the task. Some tasks could be more memory- or network-intensive than others, which leads to a different speed up for every task depending on the task itself. As a solution, the speed up could be determined for a set of reference tasks. However, this would require additional effort in classifying workflow tasks.
The overall costs are the sum of the costs of the PMs cpm and the costs of the VMs cvm: oc = cpm + cvm, with cvm = Σ_{k=1}^{n} Σ_{i ∈ tp_k} ( c_i · Σ_{j ∈ vmt_i} r_j ). Σ_{k=1}^{n} is the sum over all TPSs, Σ_{i ∈ tp_k} is the sum over all VMs and c_i · Σ_{j ∈ vmt_i} r_j is the product of the costs c_i of VM i and its expected execution times r_j. The costs for the PMs are similar: in short, the costs of the PM pmc multiplied with the highest run time of the VMs assigned to the PM. So the formula is: cpm = Σ_{k=1}^{n} ( max_{i ∈ tp_k} ( Σ_{j ∈ vmt_i} r_j ) · pmc ). We are planning to measure these values to assess the target configuration that is suggested by the retrieval results.
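The evaluation measures above can be sketched directly from the formulas (symbols as in the text: r_j expected execution time, sp_i speed up, c_i VM cost, pmc PM cost, wft/cct reconfiguration times). The segment representation is our own assumption; this is a sketch, not the planned implementation:

```python
def performance(segments):
    # p = sum over TPSs of the slowest VM's makespan plus reconfiguration times
    return sum(max(vm["sp"] * sum(vm["r"]) for vm in seg["vms"])
               + seg["wft"] + seg["cct"]
               for seg in segments)

def vm_costs(segments):
    # cvm = sum over TPSs and VMs of c_i times the VM's execution times
    return sum(vm["c"] * sum(vm["r"]) for seg in segments for vm in seg["vms"])

def pm_costs(segments, pmc):
    # cpm = per TPS, the longest VM run time multiplied with the PM cost pmc
    return sum(max(sum(vm["r"]) for vm in seg["vms"]) * pmc
               for seg in segments)

def robustness_step(n_operations):
    # q = (1 / (n + 1)) * 100 for the deviation between subsequent TPSs
    return 100.0 / (n_operations + 1)

seg = {"vms": [{"sp": 1.0, "r": [2.0, 3.0], "c": 0.1},
               {"sp": 0.5, "r": [8.0], "c": 0.2}],
       "wft": 1.0, "cct": 0.5}
assert performance([seg]) == 6.5          # max(1.0*5, 0.5*8) + 1.0 + 0.5
assert round(vm_costs([seg]), 2) == 2.1   # 0.1*5 + 0.2*8
assert robustness_step(0) == 100.0        # identical subsequent TPSs
```

Note how the max over the VMs models the parallel execution within a TPS, while the sums over segments model the sequential execution of the TPSs.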
7 SUMMARY AND FUTURE
WORK
The task placement problem for workflows in a cloud environment requires an intelligent solution due to its complexity. We think the problem is more difficult than the VM placement problem because it extends the VM placement problem by an additional layer.
In this paper, we present our approach that uses case-based reasoning to find good task placements without a recalculation of the configuration at every single step of the workflow. We believe reasoning techniques are feasible and useful for task placement. Case-based reasoning is a valid method to develop a solution.
The work is still in an early phase of development.
It provides a representation and a case-based solution
for the task placement problem. In a next step, we
will finish the implementation of the prototype and
conduct an experimental evaluation. Furthermore, we
will deploy a more dynamic approach to determine
the changes in speed up of a virtual machine when
the manipulation operators change the resources of
the VM. Another issue of our future work is to deter-
mine the granularity in which the task placement seg-
ment should be chosen. To achieve a solution for this,
we will implement a configurable prototype in order
to conduct further experiments with different setups.
The set of values to be measured might be adjusted
after first experimental results have been achieved.
We expect the following benefits of the approach.
The deep integration of workflow management and
cloud management creates novel business opportuni-
ties for cloud providers in the area of cloud-based
workflow services. Furthermore, the preference for cases with a robust configuration will hopefully reduce the re-configuration costs. Additionally, the
number of SLA violations is considered by the ap-
proach and, thus, will most probably be reduced. We
expect a significantly better performance of the work-
flow execution service in comparison to simply mi-
grating a traditional workflow management system
into a cloud infrastructure as a whole. The novel ma-
nipulation operators at workflow level facilitate both
scalability at the task and at the data level.
REFERENCES
Aamodt, A. and Plaza, E. (1994). Case-based reasoning:
Foundational issues, methodological variations, and
system approaches. AI communications, 7(1):39–59.
Jiang, J. W., Lan, T., Ha, S., Chen, M., and Chiang, M. (2012). Joint VM placement and routing for data center traffic engineering. In INFOCOM, 2012 Proceedings IEEE, pages 2876-2880.
Jin, L.-j., Casati, F., Sayal, M., and Shan, M.-C. (2001). Load balancing in distributed workflow management system. In Proceedings of the 2001 ACM Symposium on Applied Computing, pages 522-530.
Maurer, M., Brandic, I., and Sakellariou, R. (2013). Adaptive resource configuration for cloud infrastructure management. Future Generation Computer Systems, 29(2):472-487.
Minor, M., Schmalen, D., Koldehoff, A., and Bergmann, R. (2007). Structural adaptation of workflows supported by a suspension mechanism and by case-based reasoning. In Reddy, S. M., editor, Proceedings of the 16th IEEE International Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE'07), June 18-20, 2007, Paris, France, pages 370-375. IEEE Computer Society, Los Alamitos, California. Best Paper.
Sharma, B., Chudnovsky, V., Hellerstein, J. L., Rifaat, R., and Das, C. R. (2011). Modeling and synthesizing task placement constraints in Google compute clusters. In Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC '11, pages 3:1-3:14, New York, NY, USA. ACM.
Tang, C., Steinder, M., Spreitzer, M., and Pacifici, G. (2007). A scalable application placement controller for enterprise data centers. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, pages 331-340, New York, NY, USA. ACM.
US Department of Commerce, NIST (2011). Final version of NIST cloud computing definition published.
Workflow Management Coalition (1999). Workflow management coalition glossary & terminology. Last access 05-23-2007.
Wu, Z., Liu, X., Ni, Z., Yuan, D., and Yang, Y. (2013).
A market-oriented hierarchical scheduling strategy in
cloud workflow systems. The Journal of Supercom-
puting, 63(1):256–293.
CLOSER2014-4thInternationalConferenceonCloudComputingandServicesScience
328