performance, network bandwidth, access policies for
different users, and installed software. Moreover,
the set of resources may change during
execution: resources can fail, and resources
can be added to or removed from the
environment. Also, the stochastic nature of the
computational environment makes it impossible to
predict precisely the amount of computational or
transfer time, even for a single task.
As mentioned previously, the goal of
scheduling is to minimize the makespan. In our previous
work, we identified the following requirements for
workflow scheduling (Nasonov et al., 2014): (a)
processing of dynamic workload without pausing for
rescheduling of operations, (b) consideration of extra
scheduling for incoming workflows without
changing the existing applied plan, (c) operation in a
dynamic distributed environment where resources
can be added at runtime and crashes can occur, (d)
consideration of task execution delays, (e)
processing of workflows’ priorities, and (f)
providing a better solution than traditional heuristics
can generate.
Traditionally, two classes of algorithms are used
to satisfy these requirements. The first class
consists of list-based heuristics such as HPS,
CPOP, PETS, and HEFT (Arabnejad, 2013;
Topcuoglu, 2002). With some differences, all
algorithms of this class perform two main
steps: they prioritize and sort all workflow tasks
and then schedule them task by task according to
the assigned priority. The main advantage of this
class is that it produces solutions of satisfactory
quality quickly.
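
The two steps can be illustrated with a minimal Python sketch. It is not the
implementation of any of the cited algorithms: it assumes a HEFT-like upward
rank as the priority, ignores data transfer times, appends each task to the end
of a resource queue instead of using an insertion policy, and assumes strictly
positive runtimes. The function name `list_schedule` and the small diamond
workflow are hypothetical and serve only as an example.

```python
from functools import lru_cache


def list_schedule(tasks, succ, runtime, resources):
    """Simplified list-based scheduling (illustrative assumption, not HEFT itself).

    tasks: list of task ids; succ: dict task -> list of successor ids;
    runtime: dict (task, resource) -> execution time (> 0); resources: list of ids.
    """
    pred = {t: [] for t in tasks}
    for t in tasks:
        for s in succ.get(t, []):
            pred[s].append(t)

    @lru_cache(maxsize=None)
    def rank(t):
        # Priority: average runtime plus the longest remaining path to an exit task.
        avg = sum(runtime[(t, r)] for r in resources) / len(resources)
        return avg + max((rank(s) for s in succ.get(t, [])), default=0.0)

    # Step 1: prioritize and sort. With positive runtimes a task always
    # outranks its successors, so this order respects the dependencies.
    order = sorted(tasks, key=rank, reverse=True)

    # Step 2: schedule task by task on the resource with the earliest finish time.
    free_at = {r: 0.0 for r in resources}
    finish, plan = {}, {}
    for t in order:
        ready = max((finish[p] for p in pred[t]), default=0.0)
        best = min(resources, key=lambda r: max(ready, free_at[r]) + runtime[(t, r)])
        start = max(ready, free_at[best])
        finish[t] = start + runtime[(t, best)]
        free_at[best] = finish[t]
        plan[t] = (best, start, finish[t])
    return plan, max(finish.values())


# Hypothetical diamond-shaped workflow: a -> {b, c} -> d, two resources.
tasks = ["a", "b", "c", "d"]
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
runtime = {("a", "r1"): 2, ("a", "r2"): 4, ("b", "r1"): 3, ("b", "r2"): 5,
           ("c", "r1"): 4, ("c", "r2"): 2, ("d", "r1"): 1, ("d", "r2"): 1}
plan, span = list_schedule(tasks, succ, runtime, ["r1", "r2"])
```

Each task is considered exactly once, which is why heuristics of this kind are
fast; the price is that an early greedy choice is never revised.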
The second class consists of meta-heuristic
algorithms such as GRASP, GA, PSO, and ACO (Singh, Singh,
2013). They explore the solution space much more
broadly and are thus able to generate final
solutions of much higher quality than list-based
heuristics (Rahman et al., 2013). In contrast to
the previous class, however, they require much
more time to find solutions that are better than
those a list-based algorithm proposes in a similar
situation.
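
As an illustration of this class, the following is a minimal sketch of a
genetic algorithm for the same kind of problem. It is an assumption made for
exposition and not the algorithm of any cited work: a chromosome only maps each
task to a resource, tasks run in a fixed topological order, transfer times are
ignored, and the operators (truncation selection, uniform crossover, random
reassignment mutation) and all parameter values are arbitrary choices.

```python
import random


def makespan(assign, tasks, pred, runtime, resources):
    """Simulate a plan; `tasks` must be listed in topological order."""
    free_at = {r: 0.0 for r in resources}
    finish = {}
    for t in tasks:
        r = assign[t]
        # A task starts when its resource is free and all predecessors are done.
        start = max([free_at[r]] + [finish[p] for p in pred.get(t, [])])
        finish[t] = start + runtime[(t, r)]
        free_at[r] = finish[t]
    return max(finish.values())


def genetic_schedule(tasks, pred, runtime, resources,
                     pop_size=20, generations=50, p_mut=0.1, seed=0):
    """Evolve task-to-resource mappings; requires pop_size >= 4."""
    rng = random.Random(seed)

    def fitness(assign):
        return makespan(assign, tasks, pred, runtime, resources)

    # Random initial population.
    pop = [{t: rng.choice(resources) for t in tasks} for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[:pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = {t: (a[t] if rng.random() < 0.5 else b[t]) for t in tasks}
            for t in tasks:
                if rng.random() < p_mut:  # mutation: move the task elsewhere
                    child[t] = rng.choice(resources)
            children.append(child)
        pop = parents + children
    best = min(pop, key=fitness)
    return best, fitness(best)


# Hypothetical diamond-shaped workflow: a -> {b, c} -> d, two resources.
tasks = ["a", "b", "c", "d"]
pred = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
resources = ["r1", "r2"]
runtime = {("a", "r1"): 2, ("a", "r2"): 4, ("b", "r1"): 3, ("b", "r2"): 5,
           ("c", "r1"): 4, ("c", "r2"): 2, ("d", "r1"): 1, ("d", "r2"): 1}
best, cost = genetic_schedule(tasks, pred, runtime, resources)
```

Every generation re-evaluates the whole population, so the running time grows
with `pop_size * generations` simulations, which is the cost referred to above.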
The hybrid algorithm proposed in our previous
work combines the advantages of both classes, but
its convergence still needs to improve so that it
can generate better solutions within a hard time
limit. The extended algorithm is described in
detail later. Our goal in this work is to investigate
and demonstrate how convergence and
performance can be improved with a
novel, nature-inspired approach based on reusing the
inherited population in subsequent runs of the
scheduling algorithm. It is inspired by the
inheritance and survival of populations in a natural
environment that is subject to changes. We
have extended our previously developed hybrid
algorithm with this technique and use multiple
populations in order to improve the quality of
generated solutions and to exploit the possibilities for
parallelization and increased reliability of the GA.
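
The idea of reusing an inherited population can be sketched as follows. This is
only an illustration under our own simplifying assumptions and not the GAHEFT
procedure itself: a chromosome maps each task to a resource, individuals that
became inconsistent after a change (here, a failed resource) are repaired by
random reassignment, and the names `repair` and `inherit_population` are
hypothetical.

```python
import random


def repair(chromosome, pending, alive, rng):
    """Make one inherited individual consistent with the changed environment.

    Genes of finished tasks are dropped; tasks that are new or were mapped
    to a resource that no longer exists are moved to a random live resource.
    """
    return {t: (chromosome[t] if chromosome.get(t) in alive else rng.choice(alive))
            for t in pending}


def inherit_population(old_pop, pending, alive, pop_size, seed=0):
    """Build the initial population of the next run from the previous one."""
    rng = random.Random(seed)
    new_pop = [repair(c, pending, alive, rng) for c in old_pop[:pop_size]]
    # Top up with random individuals if the inherited population is too small.
    while len(new_pop) < pop_size:
        new_pop.append({t: rng.choice(alive) for t in pending})
    return new_pop


# Hypothetical situation: task "a" has finished, resource "r2" has failed,
# and a new resource "r3" has been added.
old_pop = [{"a": "r1", "b": "r1", "c": "r2", "d": "r1"},
           {"a": "r2", "b": "r2", "c": "r2", "d": "r2"}]
pending = ["b", "c", "d"]
alive = ["r1", "r3"]
pop = inherit_population(old_pop, pending, alive, pop_size=4)
```

The result would replace the random initial population of the next scheduling
run, so that genes which are still valid are carried over instead of being
rediscovered.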
This paper is organized as follows. Section 2
reviews related work. Section 3 describes
GAHEFT, the new approach, and its application to
the workflow scheduling problem; it also presents
MPGAHEFT, the multi-population modification of
the GAHEFT algorithm, which exploits the
possibilities for parallelization and increased
reliability. Section 4 contains an
experimental study of the proposed approach and the
performance of the MPGAHEFT algorithm.
Section 5 discusses conclusions and future
work.
2 RELATED WORKS
To the best of our knowledge, no research on
scheduling algorithms has yet addressed the
reuse of inherited populations that have become
inconsistent because of system changes, such as
the failure of a computational resource. We
therefore review the works that are most closely
related to ours.
Rahman et al. (2013) investigated how different
workflow topologies influence the performance of
different kinds of algorithms, including list-based
heuristics and meta-heuristics. The authors proposed
the idea of a hybrid algorithm which uses GA to
correct the deadlines of single tasks before DCP-G
starts; DCP-G then corrects the schedule during the
execution process. However, there has not been any
experimental study of this technique. Moreover, the
solution generated by the meta-heuristic algorithm
is not further improved at runtime.
Xhafa et al. (2008) presented a modification of the
cellular memetic algorithm (cMA) to deal with
rescheduling. The algorithm shows good quality of
generated solutions and a short execution time that
can be considered suitable for the rescheduling
procedure. However, the proposed approach is adapted
only for batch jobs and cannot be applied to
workflows. Also, the execution process pauses each
time the rescheduling procedure is performed.
Liu X. et al. (2010) proposed a modification of the
ant colony optimization (ACO) method and used this
strategy for rescheduling under temporal violations. The