Task Placement in a Cloud with Case-based Reasoning

Eric Schulte-Zurhausen and Mirjam Minor

Institute of Informatik, Goethe University, Robert-Mayer-Str.10, Frankfurt am Main, Germany

Keywords:

Workﬂow, Cloud Management, Task Placement.

Abstract:

Moving workﬂow management to the cloud raises novel, exciting opportunities for rapid scalability of work-

ﬂow execution. Instead of running a ﬁxed number of workﬂow engines on an invariant cluster of physical

machines, both physical and virtual resources can be scaled rapidly. Furthermore, the actual state of the re-

sources gained from cloud monitoring tools can be used to schedule workload, migrate workload or conduct

split and join operations for workload at run time. However, having so many options for distributing workload

forms a computationally complex conﬁguration problem which we call the task placement problem.

In this paper, we present a case-based framework addressing the task placement problem by interleaving

workﬂow management and cloud management. In addition to traditional workﬂow and cloud management

operations it provides a set of task internal operations for workload distribution.

1 INTRODUCTION

Rapid scalability is one of the most important char-

acteristics of cloud computing according to the NIST

deﬁnition (US Department of Commerce, 2011). In

classical workﬂow management, scalability of work-

ﬂow execution is achieved by traditional load balanc-

ing. A load balancer distributes the workload across

multiple workﬂow engines in order to maintain an

acceptable performance level of the workﬂow man-

agement system (WFMS) (Jin et al., 2001). Mov-

ing workﬂow management to the cloud raises novel,

exciting opportunities for rapid scalability of work-

ﬂow execution. Instead of running a ﬁxed number

of workﬂow engines on an invariant cluster of phys-

ical machines, both physical and virtual resources

can be scaled rapidly. Furthermore, the actual state

of the resources gained from cloud monitoring tools

can be used to schedule workload, migrate workload

or conduct split and join operations for workload at

run time. However, having so many options for dis-

tributing workload forms a computationally complex

configuration problem which we call the task placement problem. The Workflow Management Coalition (Workflow Management Coalition, 1999) defines a workflow as follows: "The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules."

A task, also called an activity, is defined as: "A description of a piece of work that forms one logical step within a process. An activity may be a manual activity, which does not support computer automation, or a workflow (automated) activity. A workflow activity requires human and/or machine resource(s) to support process execution."

In this work we use the term task configuration. A task configuration is a set of tasks with their relationships together with a set of parameters. These parameters are not needed by the tasks themselves but for the handling of the tasks by the system.

A cloud configuration is, in our work, a set of physical machines (PMs) with their resources and a set of virtual machines (VMs) that are assigned to the PMs and share some or all of the resources owned by their PM.

We define a task placement as the assignment of automated tasks to VMs and, furthermore, the assignment of VMs to PMs. A task placement is thus a mapping of a task configuration to a cloud configuration. Our task placement problem includes two kinds of problems: the job placement problem as in (Sharma et al., 2011) and the VM placement problem as in (Jiang et al., 2012). The job placement problem is the problem of assigning jobs to VMs, and the VM placement problem is the problem of assigning VMs to PMs.

Schulte-Zurhausen E. and Minor M.
Task Placement in a Cloud with Case-based Reasoning.
DOI: 10.5220/0004944203230328
In Proceedings of the 4th International Conference on Cloud Computing and Services Science (CLOSER-2014), pages 323-328
ISBN: 978-989-758-019-2
© 2014 SCITEPRESS (Science and Technology Publications, Lda.)

In this paper we present a case-based framework addressing the task placement problem by interleaving workflow management and cloud management. In addition to traditional workflow and cloud management operations, it provides a set of task-internal operations for workload distribution. The notion of a case is used to represent a task placement problem. A stack of placement operators allows the case representation to be organized in layers. Instead of solving task placement as a computationally complex optimization problem, case-based reasoning (Aamodt and Plaza, 1994) is used to retrieve similar task placements from the past that can be reused in the current situation.

In Section 2, we discuss related work from workflow management and cloud management. Section 3 summarizes and extends a layered VM placement model from Maurer et al. (Maurer et al., 2013) that we use for a task placement model in Section 4. Section 5 describes a case-based recommender system that suggests a set of promising placement operators based on the experience stored in the case base. Section 6 addresses a planned evaluation. Finally, we discuss future work and draw some conclusions in Section 7.

2 RELATED WORK

Our definition of a task placement is different from the definition used by Sharma et al. (Sharma et al., 2011). In their work the tasks are jobs placed on VMs, but the jobs are independent of each other, whereas workflow tasks are ordered and their execution order is strict. Sharma et al. also describe task placement constraints: not all tasks or jobs can be placed on all available VMs due to kernel, operating system or other restrictions. Such a constraint restricts the task placement.

Tang et al. (Tang et al., 2007) describe the problem of allocating applications with dynamically changing demands to VMs and of deciding how many VMs must be started. They call this problem application placement; it is similar to the problem addressed by Sharma et al. (Sharma et al., 2011).

The problem of assigning workflow tasks to VMs is not new. Wu et al. (Wu et al., 2013) describe an approach that solves the task placement problem for scientific workflows with meta-heuristics. They call this Task-to-VM assignment and implement it in SwinDeW-C, a cloud workflow system. Maurer et al. (Maurer et al., 2013) present an approach to manage a cloud configuration on different layers. They divide the resource allocation into three parts, which they call escalation levels.

The ﬁrst level is for individual applications. The

possible actions are:

• Increase / Decrease incoming bandwidth share by x%

• Increase / Decrease outgoing bandwidth share by x%

• Increase / Decrease memory by x%

• Increase / Decrease CPU share by x%

• Add / Remove allocated storage by x%

• Outsource (move application) to other cloud

• Insource (accept application) from other cloud

• Migrate application to different VM

The second level is for VMs. The possible actions

are:

• Increase / Decrease incoming bandwidth share by x%.

• Increase / Decrease outgoing bandwidth share by x%.

• Increase / Decrease memory by x%.

• Add / Remove allocated storage by x%.

• Increase / Decrease CPU share by x%.

• Outsource (move VM) to other cloud.

• Insource (accept VM) from other cloud.

• Migrate VM to different PM.

And the third level is for physical machines:

• Add x computing nodes.

• Remove x computing nodes.

Maurer et al. use the escalation levels to guarantee the

scalability and efﬁcient resource utilization of their

system in order to fulﬁll the Service Level Agree-

ments. To manage the resources they used two dif-

ferent types of knowledge management - a rule-based

and a case-based reasoning approach.

3 MANIPULATION OPERATORS

In our approach we use a system based on the escalation levels introduced by Maurer et al. (Maurer et al., 2013). We extend this approach with so-called manipulation operators (MO), which manipulate the task and cloud configurations introduced in Section 4. Furthermore, we add the workflow level on top of Maurer's escalation levels, namely the application, VM and PM levels. The resulting hierarchy of levels is as follows:

• 1: Workﬂow level

• 2: Application level

• 3: Virtual machine level

• 4: Physical machine level

CLOSER2014-4thInternationalConferenceonCloudComputingandServicesScience

324

We denote the three lower levels as the resource level.

The resource level includes the application, virtual

and physical machine level. The workﬂow level sup-

ports operations to manipulate a workﬂow so that a

better utilization of the resources can be achieved.

The following workﬂow operations also have a hier-

archical order:

• 1: Task split / join

• 2: Data split / join

• 3: Migrate task to different VM

The task split operation divides a task into a set of parallel tasks with different parameters. For example, let $I = \{image_1, \ldots, image_n\}$ be a set of images, and let two different, mutually independent render algorithms $r_1(I)$ and $r_2(I)$ be given. Task $t_1(r_1(I), r_2(I))$ performs both algorithms and task $t_2$ evaluates the result. A task split means that we remove $t_1$ and replace it with two tasks $t_{1a}$ and $t_{1b}$, each of which performs only one render algorithm: $t_{1a}(r_1(I))$ and $t_{1b}(r_2(I))$. Because of the split it is now possible to run $t_{1a}$ and $t_{1b}$ on different VMs, so that the overall duration can be reduced. The task join operation is the inverse action: it would join the tasks $t_{1a}$ and $t_{1b}$ back into $t_1$. A join can be useful to reduce the number of running VMs or to reduce the number of tasks that run on a single VM.
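
The task split and task join operations can be illustrated with a minimal sketch. The `Task` class, the single-letter suffix naming scheme and the function names are assumptions made for this illustration only and are not part of the framework described here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Task:
    """Hypothetical task model: a name plus the render algorithms it performs."""
    name: str
    algorithms: Tuple[str, ...]


def task_split(task: Task) -> List[Task]:
    """Replace a task by one parallel task per algorithm (t1 -> t1a, t1b, ...)."""
    suffixes = "abcdefghijklmnopqrstuvwxyz"
    if len(task.algorithms) > len(suffixes):
        raise ValueError("too many algorithms for single-letter suffixes")
    return [Task(task.name + suffixes[k], (alg,))
            for k, alg in enumerate(task.algorithms)]


def task_join(name: str, parts: List[Task]) -> Task:
    """Inverse operation: merge the parts into one task that runs all algorithms."""
    return Task(name, tuple(alg for part in parts for alg in part.algorithms))


t1 = Task("t1", ("r1", "r2"))
parts = task_split(t1)             # two tasks that may run on different VMs
rejoined = task_join("t1", parts)  # restores the original task
```

Each part carries exactly one algorithm, so the parts can be placed on different VMs independently, and joining them restores the original task.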

The data split operation replicates a task and divides the data that is to be processed. Similar to the previous example, $I = \{image_1, \ldots, image_n\}$ is a list of images and $r_1(I)$ is an algorithm to render images. The task $t_1$ is the task to render all images, $t_1(r_1(image_1, \ldots, image_n))$. The data split operation replicates $t_1$ into a set of parallel tasks where each task processes a subset $I' \subseteq I$, for example $t_{1a}(r_1(image_1, \ldots, image_i))$ and $t_{1b}(r_1(image_{i+1}, \ldots, image_n))$. The data join operation is the inverse action, similar to the task join operation. It combines $t_{1a}$ and $t_{1b}$ into $t_1$.
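
A data split can be sketched as a partition of the input list into contiguous subsets, one per replicated task. The function names and the choice of nearly equal chunk sizes are assumptions for this illustration; the text above does not prescribe how the subsets are chosen.

```python
from typing import List, Sequence


def data_split(images: Sequence[str], parts: int) -> List[List[str]]:
    """Divide the images into `parts` contiguous, nearly equal subsets."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size, rest = divmod(len(images), parts)
    chunks, start = [], 0
    for k in range(parts):
        # The first `rest` chunks receive one extra element.
        end = start + size + (1 if k < rest else 0)
        chunks.append(list(images[start:end]))
        start = end
    return chunks


def data_join(chunks: List[List[str]]) -> List[str]:
    """Inverse operation: concatenate the subsets in their original order."""
    return [image for chunk in chunks for image in chunk]


images = ["image1", "image2", "image3", "image4", "image5"]
chunks = data_split(images, 2)   # one subset per replicated task
restored = data_join(chunks)
```

Every image ends up in exactly one subset, so the replicated tasks together process the same data as the original task.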

The migrate operation migrates a task between VMs. We include this operation despite the fact that the migration of a task can be considered an application migration. This decision makes the border between the application level and the workflow level more distinct.

4 PLACEMENT MODEL

We model the task placement problem as a mapping problem between a task configuration and a cloud configuration. A task configuration is a tuple $TC = \{T, P, F\}$. $T$ is a set of tasks $\{t_1, t_2, \ldots, t_n\}$. $P$ is a set of tuples $\{p_1, p_2, \ldots, p_n\}$ whose elements $p_i$ describe the parameters for task $t_i$. The parameters for each task can be the expected execution time on a default VM for each MBit of data volume, the expected upcoming data volume, the system prerequisites, and whether a task split or a data split is possible. $F$ is a partial order $\prec$ on $T$ specifying a precedence relation on $T$. $F$ is induced by the order of tasks given in the workflow. If $t_i$ and $t_j$ are ordered in a sequence within the workflow, $t_j$ can only be started when $t_i$ has finished execution and, thus, $(t_i, t_j)$ is an element of $F$.
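
One possible in-memory representation of a task configuration is sketched below. All class, field and method names are hypothetical; the sketch only mirrors the three components $T$, $P$ and $F$ and shows how the precedence relation determines which tasks may start.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class TaskParameters:
    """Hypothetical parameter tuple p_i for one task."""
    time_per_mbit: float               # expected execution time on the default VM per MBit
    data_volume_mbit: float            # expected upcoming data volume
    prerequisites: Tuple[str, ...] = ()
    task_split_possible: bool = False
    data_split_possible: bool = False


@dataclass
class TaskConfiguration:
    tasks: Set[str]                    # T
    params: Dict[str, TaskParameters]  # P, keyed by task name
    precedence: Set[Tuple[str, str]]   # F: (ti, tj) means tj starts after ti has finished

    def ready(self, finished: Set[str]) -> Set[str]:
        """Tasks that are not finished yet and whose predecessors have all finished."""
        return {t for t in self.tasks - finished
                if all(a in finished for (a, b) in self.precedence if b == t)}


tc = TaskConfiguration(
    tasks={"t1", "t2", "t3"},
    params={name: TaskParameters(1.0, 10.0) for name in ("t1", "t2", "t3")},
    precedence={("t1", "t2"), ("t2", "t3")},
)
first = tc.ready(set())    # only t1 has no unfinished predecessor
second = tc.ready({"t1"})  # t2 becomes startable once t1 has finished
```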

The cloud configuration $CC$ is a tuple $\{PM, VM, CCP\}$. $PM$ is a set of physical machines. Each physical machine is represented as a tuple {incoming bandwidth, outgoing bandwidth, CPU, memory, storage, costs per time unit}. This is the representation of the hardware offered by the physical machine. The costs per time unit are an abstract value that represents the costs, particularly for energy, that occur while the machine is running. $VM$ is, similar to $PM$, a set of virtual machines. Each virtual machine is represented as a tuple of properties {incoming bandwidth, outgoing bandwidth, CPU, memory, storage, speed up}, which is similar to a physical machine except for the speed up. The speed up is a factor that influences the expected execution time of a task. The expected execution time is determined for a default VM configuration as follows. Let $vm_1, vm_2 \in VM$ and let $vm_1$ be the default VM, so that the speed up $sp_1$ of $vm_1$ is 1. Let the quantity of the available resources for $vm_2$ be higher than the quantity of the available resources for $vm_1$. Then the speed up $sp_2$ satisfies $sp_2 \leq sp_1$; a smaller factor thus means faster execution. We consider the run time of a task $t_i$ roughly as the product of the speed up and the expected execution time $r_i$, so that $r_i \cdot sp_2 \leq r_i \cdot sp_1$.
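
As a worked illustration with assumed numbers (they are not taken from an experiment): let the expected execution time of a task on the default VM be $r_i = 120$ s, let the default VM have the speed up $sp_1 = 1$, and let the better equipped VM have the speed up $sp_2 = 0.5$. The estimated run times are then:

```latex
r_i \cdot sp_1 = 120\,\mathrm{s} \cdot 1 = 120\,\mathrm{s}, \qquad
r_i \cdot sp_2 = 120\,\mathrm{s} \cdot 0.5 = 60\,\mathrm{s} \leq 120\,\mathrm{s}
```

The VM with more resources has the smaller factor and therefore the shorter estimated run time.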

The cloud configuration placement $CCP$ is a set of tuples that assign VMs to PMs. Each $vm_i \in VM$ must be assigned to exactly one $pm_j \in PM$, but a PM can contain more than one VM, for example $\{(vm_1, pm_1), (vm_2, pm_1)\}$.
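
The constraint on the cloud configuration placement can be checked mechanically. The following is a minimal sketch under the assumption that VMs and PMs are identified by plain strings; the function name `is_valid_ccp` is hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def is_valid_ccp(vms: Set[str], pms: Set[str],
                 ccp: Iterable[Tuple[str, str]]) -> bool:
    """Return True if every VM is assigned to exactly one known PM."""
    assigned: Dict[str, str] = {}
    for vm, pm in ccp:
        if vm not in vms or pm not in pms:
            return False          # unknown VM or PM
        if vm in assigned and assigned[vm] != pm:
            return False          # VM assigned to two different PMs
        assigned[vm] = pm
    return set(assigned) == vms   # no VM may be left unassigned


vms = {"vm1", "vm2"}
pms = {"pm1", "pm2"}
shared_pm = is_valid_ccp(vms, pms, [("vm1", "pm1"), ("vm2", "pm1")])
double_assignment = is_valid_ccp(vms, pms, [("vm1", "pm1"), ("vm1", "pm2"), ("vm2", "pm1")])
```

Several VMs sharing one PM is allowed, while a VM that appears on two PMs or on none makes the placement invalid. Resource capacities of the PMs are deliberately not checked in this sketch.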

Now we consider the task placement as a tuple $\{TC, CC, TP\}$. $TC$ is the task configuration and $CC$ is the cloud configuration. $TP$ is a set of tuples which represent the assignment of the tasks to a VM and their specified order: $TP = \{tp_1, \ldots, tp_n\}$ with $tp_i = \{vm_i, vmt_i, vmtf_i\}$, where $vm_i \in VM$, $vmt_i \subseteq T$ denotes the set of tasks that is actually running on $vm_i$, and $vmtf_i \subseteq T \times T$ specifies the order constraints on tasks.

TaskPlacementinaCloudwithCase-basedReasoning

325

5 CASE-BASED TASK PLACEMENT

In case-based reasoning, a problem is solved by finding a similar case stored in the case base and reusing it in the new problem situation (Aamodt and Plaza, 1994). A case is defined as an old problem including a solution for this distinct problem. In our approach a case is a pair of task placements $tp_t$ and $tp_{t+1}$. The problem task placement $tp_t$ is the placement at time $t$ and the solution task placement $tp_{t+1}$ is a reconfigured version of $tp_t$ at time $t+1$. In addition to $tp_{t+1}$, the solution includes a list of manipulation operators that is required to transform $tp_t$ into $tp_{t+1}$. To reuse such a case means to replay the manipulation operators that have been used to transform $tp_t$ into $tp_{t+1}$ in the current situation, that is, on the current task placement. The similarity of two task placements can be determined by measuring the number of manipulation operations that is required to transform one into the other. In addition, the expected costs, the expected performance and the expected number of SLA violations can be compared.
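
The case structure and the replay of manipulation operators can be sketched as follows. The sketch assumes a simplified placement (a mapping from VM names to the tasks running on them) and models only one operator, a task migration; `Case`, `migrate` and `reuse` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Placement = Dict[str, Tuple[str, ...]]        # VM name -> tasks on that VM
Operator = Callable[[Placement], Placement]   # a manipulation operator


def migrate(task: str, target_vm: str) -> Operator:
    """Build an operator that moves `task` to `target_vm`."""
    def apply(placement: Placement) -> Placement:
        moved = {vm: tuple(t for t in tasks if t != task)
                 for vm, tasks in placement.items()}
        moved[target_vm] = moved.get(target_vm, ()) + (task,)
        return moved
    return apply


@dataclass
class Case:
    problem: Placement         # placement at time t
    solution: Placement        # reconfigured placement at time t + 1
    operators: List[Operator]  # operators that transform problem into solution


def reuse(case: Case, current: Placement) -> Placement:
    """Replay the stored operators on the current placement."""
    for operator in case.operators:
        current = operator(current)
    return current


tp_t = {"vm1": ("t1",), "vm2": ("t2", "t3")}
tp_next = {"vm1": ("t1", "t2"), "vm2": ("t3",)}
case = Case(tp_t, tp_next, [migrate("t2", "vm1")])
replayed = reuse(case, tp_t)
```

Replaying the stored operators on the case's own problem placement reproduces its solution placement, which is a useful sanity check for stored cases.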

Aamodt and Plaza described the case-based rea-

soning (CBR) process as a cycle with four steps as

shown in Figure 1. The four steps are:

• Retrieve the most similar case or cases

• Reuse the information and knowledge in that case

to solve the problem

• Revise the proposed solution

• Retain the parts of this experience likely to be use-

ful for future problem solving

In the retrieve phase, one or more cases with the highest similarity to the current problem are chosen. The similarity of two task placements depends mainly on the cloud configuration, the SLA requirements and the task configuration. As a first step we assume that the SLA requirements are constant. We further assume that the task configurations in a case base can always be traced back to the same workflow template through the workflow manipulation operators introduced in Section 3. The approach is not limited to this assumption, however: Minor et al. (Minor et al., 2007) show how to define the similarity between two workflows. The idea is to represent a workflow as a graph and determine the similarity between two graphs.

With the assumption of a constant set of SLAs and a common workflow template, we can define a neighborhood between task placements. The neighborhood between two task placements is defined by the manipulation operations. Consider two task placements that share the same task configuration $TC_1$ and cloud configuration $CC_1$ but differ in their assignments $TP_1$ and $TP_2$, and let $Dis(TP_1, TP_2)$ be a function which determines the distance between two task placements. Let $TP_1 = \{(vm_1, (t_1)), (vm_2, (t_2, t_3))\}$ and $TP_2 = \{(vm_1, (t_1, t_2)), (vm_2, (t_3))\}$. The distance between them is the number of manipulation operations that is required to convert $TP_1$ into $TP_2$. The difference in this example is that the task $t_2$ is running on $vm_2$ in $TP_1$ and on $vm_1$ in $TP_2$. With the operation "Migrate application to different VM" and the assumption that a task can be considered as an application, $Dis(TP_1, TP_2) = 1$.

Figure 1: CBR cycle introduced by Aamodt and Plaza.
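
For the restricted situation of this example (same tasks, same VMs, only migrations allowed), the distance can be sketched as the number of tasks that run on different VMs in the two assignments. This is a simplifying assumption: other manipulation operators, such as splits or resource changes, are not counted here, and the names `dis` and `retrieve` are hypothetical.

```python
from typing import Dict, List, Tuple

Assignment = Dict[str, Tuple[str, ...]]   # VM name -> tasks placed on that VM


def vm_of_task(tp: Assignment) -> Dict[str, str]:
    """Invert the assignment: task name -> VM name."""
    return {task: vm for vm, tasks in tp.items() for task in tasks}


def dis(tp_a: Assignment, tp_b: Assignment) -> int:
    """Number of task migrations needed to turn tp_a into tp_b."""
    a, b = vm_of_task(tp_a), vm_of_task(tp_b)
    if a.keys() != b.keys():
        raise ValueError("both assignments must contain the same tasks")
    return sum(1 for task in a if a[task] != b[task])


def retrieve(case_base: List[Assignment], current: Assignment) -> Assignment:
    """Return the stored assignment with the smallest distance to the current one."""
    return min(case_base, key=lambda stored: dis(stored, current))


tp1 = {"vm1": ("t1",), "vm2": ("t2", "t3")}
tp2 = {"vm1": ("t1", "t2"), "vm2": ("t3",)}
distance = dis(tp1, tp2)   # only t2 has to be migrated
```

With this measure, retrieval reduces to a nearest-neighbor search over the stored assignments.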

After the retrieve phase follows the reuse phase.

In this phase the task placement will be modiﬁed by

the manipulation operators listed in the solution part

of the case. In the revise phase the new case will be

evaluated and in the retain phase the new case will be

stored in the case base.

6 EVALUATION OF THE CONCEPT

To evaluate the case-based approach we are planning to measure three values for an experimental set of workflows: the performance of the workflow execution $p$, the overall costs $oc$ and the robustness $ro$. At this point, the set $T$ includes all tasks of a workflow. That means that one task placement covers the whole workflow. However, we expect that when the workflow is divided into parts of currently active tasks and a task placement is built for each of these parts, the overall costs will be reduced because the provided infrastructure in terms of VMs and PMs will be very close to the actually required resources. When the workflow is divided into several task placements, we call each placement a Task Placement Segment (TPS). The problem with TPSs is that when there are many TPSs and consecutive TPSs differ strongly, it will take more time to reconfigure the cloud and thus the performance is reduced.

The robustness in our case is an indicator for measuring the changes within a sequence of TPSs. If the difference between subsequent pairs of TPSs is high, the robustness will be low. To measure the deviation of a TPS from its predecessor, we determine the number $n$ of manipulation operations needed to transform one cloud configuration into another, resulting in $q = \frac{1}{n+1} \cdot 100$. The robustness value $ro$ of the entire sequence aggregates the particular deviation values, for instance by a weighted sum with a logarithmic factor. The performance of the workflow execution depends on the expected execution time $r_j$ of the tasks, the speed up $sp_i$ of the VMs, the reconfiguration time $wft_k$ of the workflow level operators and the reconfiguration time $cct_k$ of the resource level operators of the current TPS $tp_k$. Each TPS includes one task placement $tp_k$. The performance is described as:

$$p = \sum_{k=1}^{m} \left( \max_{i \in tp_k} \left( sp_i \cdot \sum_{j \in vmt_i} r_j \right) + wft_k + cct_k \right)$$

where $m$ denotes the number of TPSs. The value of the expected execution time is not just the sum of all expected execution times within a TPS: because VMs execute tasks in parallel, we search for the maximum makespan of the VMs, which is $\max_{i \in tp_k} \left( sp_i \cdot \sum_{j \in vmt_i} r_j \right)$. In addition, the speed up factor might consider the type of the task. Some tasks could be more memory or network intensive than others, which leads to a different speed up for every task depending on the task itself. As a solution, the speed up could be determined for a set of reference tasks. However, this would require additional effort in classifying workflow tasks.
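
The deviation value and the performance measure can be computed with a few lines of code. The sketch below assumes the reading $q = 100 / (n+1)$ for the deviation and represents each TPS by the loads of its VMs plus the two reconfiguration times; the type aliases, function names and numbers are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# One VM inside a TPS: (speed up factor sp_i, expected execution times r_j of its tasks)
VmLoad = Tuple[float, Sequence[float]]
# One TPS: (VM loads, workflow level reconfiguration time, resource level reconfiguration time)
Segment = Tuple[List[VmLoad], float, float]


def deviation(n: int) -> float:
    """Deviation value q for n manipulation operations (assumed reading: 100 / (n + 1))."""
    return 100.0 / (n + 1)


def performance(segments: List[Segment]) -> float:
    """Sum over all TPSs of the slowest VM's makespan plus both reconfiguration times."""
    total = 0.0
    for vm_loads, wft, cct in segments:
        # VMs run in parallel, so the slowest VM determines the segment's makespan.
        makespan = max(sp * sum(times) for sp, times in vm_loads)
        total += makespan + wft + cct
    return total


segments = [
    ([(1.0, [10.0, 20.0]), (0.5, [50.0])], 2.0, 3.0),  # makespan max(30, 25) = 30
    ([(1.0, [5.0])], 0.0, 1.0),                        # makespan 5
]
p = performance(segments)   # (30 + 2 + 3) + (5 + 0 + 1)
```

Each segment needs at least one VM, since the maximum over an empty set of VMs is undefined.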

The overall costs are the sum of the costs of the PMs $cpm$ and the costs of the VMs $cvm$: $oc = cpm + cvm$. The VM costs are

$$cvm = \sum_{k=1}^{m} \sum_{i \in tp_k} c_i \cdot \sum_{j \in vmt_i} r_j$$

where $m$ is the number of TPSs, $\sum_{k=1}^{m}$ is the sum over all TPSs, $\sum_{i \in tp_k}$ is the sum over all VMs and $c_i \cdot \sum_{j \in vmt_i} r_j$ is the sum over all products of the expected execution time $r_j$ and the costs $c_i$ of VM $i$. The costs for the PMs are similar: in short, the costs of the PM $pmc$ are multiplied with the highest run time of the VMs assigned to the PM. So the formula is:

$$cpm = \sum_{k=1}^{m} \left( \max_{i \in tp_k} \left( \sum_{j \in vmt_i} r_j \right) \cdot pmc \right)$$

We are planning to measure these values to assess the target configuration that is suggested by the retrieval results.
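
The cost measures can be sketched in the same style. The sketch assumes that each TPS is given as a list of VMs with their cost rate and the expected execution times of their tasks, and that a single PM cost rate `pmc` applies to every TPS; all names and numbers are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# One VM inside a TPS: (costs per time unit c_i, expected execution times r_j of its tasks)
VmCostLoad = Tuple[float, Sequence[float]]


def vm_costs(segments: List[List[VmCostLoad]]) -> float:
    """cvm: for every TPS and every VM, cost rate times the summed execution times."""
    return sum(c * sum(times) for segment in segments for c, times in segment)


def pm_costs(segments: List[List[VmCostLoad]], pmc: float) -> float:
    """cpm: for every TPS, the PM cost rate times the longest VM run time."""
    return sum(max(sum(times) for _, times in segment) * pmc for segment in segments)


segments = [
    [(2.0, [10.0, 20.0]), (1.0, [50.0])],  # VM costs 60 + 50, longest run time 50
    [(2.0, [5.0])],                        # VM costs 10, longest run time 5
]
cvm = vm_costs(segments)
cpm = pm_costs(segments, pmc=3.0)
oc = cpm + cvm   # overall costs
```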

7 SUMMARY AND FUTURE WORK

The task placement problem for workflows in a cloud environment requires an intelligent solution due to its complexity. We think the problem is more difficult than the VM placement problem because it extends the VM placement problem by an additional layer.

In this paper, we present our approach that uses case-based reasoning to find good task placements without a recalculation of the configuration at every single step of the workflow. We believe reasoning techniques are feasible and useful for task placement. Case-based reasoning is a valid method to develop a solution.

The work is still in an early phase of development.

It provides a representation and a case-based solution

for the task placement problem. In a next step, we

will ﬁnish the implementation of the prototype and

conduct an experimental evaluation. Furthermore, we

will deploy a more dynamic approach to determine

the changes in speed up of a virtual machine when

the manipulation operators change the resources of

the VM. Another issue of our future work is to deter-

mine the granularity in which the task placement seg-

ment should be chosen. To achieve a solution for this,

we will implement a conﬁgurable prototype in order

to conduct further experiments with different setups.

The set of values to be measured might be adjusted

after ﬁrst experimental results have been achieved.

We expect the following benefits of the approach. The deep integration of workflow management and cloud management creates novel business opportunities for cloud providers in the area of cloud-based workflow services. Furthermore, the preference for cases with a robust configuration will hopefully reduce the re-configuration costs. Additionally, the number of SLA violations is considered by the approach and, thus, will most probably be reduced. We expect a significantly better performance of the workflow execution service in comparison to simply migrating a traditional workflow management system into a cloud infrastructure as a whole. The novel manipulation operators at workflow level facilitate scalability at both the task and the data level.

REFERENCES

Aamodt, A. and Plaza, E. (1994). Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI communications, 7(1):39–59.

Jiang, J. W., Lan, T., Ha, S., Chen, M., and Chiang, M. (2012). Joint VM placement and routing for data center traffic engineering. In INFOCOM, 2012 Proceedings IEEE, pages 2876–2880.

Jin, L.-j., Casati, F., Sayal, M., and Shan, M.-C. (2001). Load balancing in distributed workflow management system. In Proceedings of the 2001 ACM symposium on Applied computing, pages 522–530.

Maurer, M., Brandic, I., and Sakellariou, R. (2013). Adaptive resource configuration for cloud infrastructure management. Future Generation Computer Systems, 29(2):472–487.

Minor, M., Schmalen, D., Koldehoff, A., and Bergmann, R. (2007). Structural adaptation of workflows supported by a suspension mechanism and by case-based reasoning. In Reddy, S. M., editor, Proceedings of the 16th IEEE International Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE'07), June 18-20, 2007, Paris, France, pages 370–375. IEEE Computer Society, Los Alamitos, California. Best Paper.

Sharma, B., Chudnovsky, V., Hellerstein, J. L., Rifaat, R., and Das, C. R. (2011). Modeling and synthesizing task placement constraints in google compute clusters. In Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC '11, pages 3:1–3:14, New York, NY, USA. ACM.

Tang, C., Steinder, M., Spreitzer, M., and Pacifici, G. (2007). A scalable application placement controller for enterprise data centers. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, pages 331–340, New York, NY, USA. ACM.

US Department of Commerce, N. (2011). Final version of NIST cloud computing definition published.

Workflow Management Coalition (1999). Workflow management coalition glossary & terminology. Last access 05-23-2007.

Wu, Z., Liu, X., Ni, Z., Yuan, D., and Yang, Y. (2013). A market-oriented hierarchical scheduling strategy in cloud workflow systems. The Journal of Supercomputing, 63(1):256–293.

CLOSER2014-4thInternationalConferenceonCloudComputingandServicesScience

328