by EDD and constructs the schedule from the end by exchanging two jobs. Panwalkar et al. (Panwalkar et al., 1993) propose a constructive local search heuristic, PSK, which starts with the job set J sorted by SPT and constructs the schedule from the start by exchanging two jobs. Russell and Holsenback (Russell and Holsenback, 1997) compare the PSK and NBR heuristics and conclude that neither heuristic is inferior to the other; however, NBR finds better solutions in more cases. The second group of heuristics is based on Lawler's decomposition rule (Lawler, 1977). In this case, the heuristic evaluates each child of a search tree node and expands the most promising child. This heuristic approach is evaluated in (Potts and Van Wassenhove, 1991) with the EDD heuristic as a guide for the search. The third group of heuristics consists of metaheuristics. (Potts and Van Wassenhove, 1991), (Antony and Koulamas, 1996), and (Ben-Daya and Al-Fawzan, 1996) present simulated annealing algorithms for SMTTP. Genetic algorithms applied to SMTTP are described in (Dimopoulos and Zalzala, 1999) and (Süer et al., 2012), whereas (Bauer et al., 1999) and (Cheng et al., 2009) propose to use ant colony optimization for this scheduling problem. All the results reported in the previous studies are for instance sizes of up to 100 jobs. However, these instances are solvable by the current state-of-the-art exact algorithm in a fraction of a second.
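For illustration, the sketch below shows the EDD and SPT orderings that the constructive heuristics above start from, together with the evaluation of total tardiness for the resulting schedule. It is a toy example in Python with a hypothetical Job representation, not an implementation of NBR or PSK.

from dataclasses import dataclass

@dataclass
class Job:
    p: int  # processing time (hypothetical field name)
    d: int  # due date (hypothetical field name)

def total_tardiness(schedule):
    # Total tardiness of a schedule: sum of max(0, C_j - d_j),
    # where C_j is the completion time of job j in the given order.
    tardiness, completion = 0, 0
    for job in schedule:
        completion += job.p
        tardiness += max(0, completion - job.d)
    return tardiness

jobs = [Job(p=5, d=6), Job(p=2, d=9), Job(p=3, d=4)]

edd = sorted(jobs, key=lambda j: j.d)  # Earliest Due Date ordering
spt = sorted(jobs, key=lambda j: j.p)  # Shortest Processing Time ordering

print(total_tardiness(edd), total_tardiness(spt))  # 3 5 for this toy instance

Heuristics such as NBR and PSK take such an initial ordering and improve it by exchanging pairs of jobs.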
2.2 Machine Learning Integration into Combinatorial Optimization Problems
The integration of ML into combinatorial optimization problems faces several difficulties. First, ML models are often designed for feature vectors of a predefined, fixed size. On the other hand, instances of scheduling problems are usually described by a variable number of features, e.g., a variable number of jobs. This issue can be addressed by recurrent networks and, more recently, by encoder-decoder architectures. Vinyals et al. (Vinyals et al., 2015) applied an architecture called the Pointer Network that, given a set of graph nodes, outputs a solution as a permutation of these nodes. The authors applied the Pointer Network to the Traveling Salesman Problem (TSP); however, this approach is still not competitive with the best classical solvers such as Concorde (Applegate et al., 2006), which can find optimal solutions to instances with hundreds of nodes in a fraction of a second. Moreover, the output of the Pointer Network needs to be corrected by a beam-search procedure, which points out the weaknesses of this end-to-end approach. The Pointer Network achieved an optimality gap of around 1% for instances with 20 nodes after beam search.
The second difficulty with training an ML model is the acquisition of training data. Obtaining one training instance usually requires solving a problem of the same complexity as the original problem itself. This issue can be addressed with the reinforcement learning paradigm. Deudon et al. (Deudon et al., 2018) used an encoder-decoder architecture trained with the REINFORCE algorithm to solve the 2D Euclidean TSP with up to 100 nodes. They show that (i) repetitive sampling from the network is needed, (ii) applying the well-known 2-opt heuristic to the results still improves the network's solutions, and (iii) both the quality and the runtime are worse than those of classical exact solvers. A similar approach is described in (Kool and Welling, 2018), which, when treated as a greedy heuristic, beats weak baseline solutions (from the operations research perspective) such as Nearest Neighbor or the Christofides algorithm on small instances. To be competitive in terms of quality with more relevant baselines such as the Lin-Kernighan heuristic, the authors sample multiple solutions from the model and output the best one. Moreover, they do not directly compare their approach with state-of-the-art classical algorithms, while admitting that the off-the-shelf Integer Programming solver Gurobi solves their largest instances to optimality within 1.5 s.
Khalil et al. (Khalil et al., 2017) present an inter-
esting approach for learning greedy algorithms over
graph structures. The authors show that their S2V-
DQN model can obtain competitive results on the MAX-CUT and Minimum Vertex Cover problems. For TSP, S2V-DQN performs about the same as the 2-opt heuristic. Unfortunately, the authors do not compare runtimes with the Concorde solver.
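For reference, the 2-opt heuristic mentioned above can be sketched as follows. This is a generic textbook variant for the symmetric TSP, assuming a distance matrix dist and a tour given as a list of node indices; it is not the exact procedure used in the cited works.

def two_opt(tour, dist):
    # Repeatedly reverse a segment of the tour as long as doing so
    # shortens the total length (first-improvement local search).
    improved = True
    while improved:
        improved = False
        n = len(tour)
        for i in range(n - 1):
            for j in range(i + 2, n):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if a == d:  # the two edges share a node; skip this move
                    continue
                # replace edges (a, b) and (c, d) by (a, c) and (b, d)
                if dist[a][c] + dist[b][d] < dist[a][b] + dist[c][d]:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour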
Milan et al. (Milan et al., 2017) present a data-driven approximation of solvers for NP-hard problems. They utilized a Long Short-Term Memory (LSTM) network (Hochreiter and Schmidhuber, 1997) in a modified supervised setting. The reported results on the Quadratic Assignment Problem show that the network's solutions are worse than those of the general-purpose solver Gurobi, while having essentially identical runtime.
Integration of ML with scheduling problems has received little attention so far. Earlier attempts at integrating neural networks with job-shop scheduling are (Zhou et al., 1991) and (Jain and Meeran, 1998). However, their computational results are inferior to those of traditional algorithms, or they are not extensive enough to assess their quality. An alternative use of ML in the scheduling domain focuses on the criterion function of the optimization problems. For example, the authors of (Václavík et al., 2016) address a nurse ros-