Improving the Performance of Genetic Algorithms for Combinatorial
Optimization Using Machine Learning for Knowledge Transfer
George Mweshi (https://orcid.org/0000-0002-3504-3700) and Nelishia Pillay (https://orcid.org/0000-0003-3902-5582)
Department of Computer Science, University of Pretoria, Pretoria, South Africa
Keywords:
Genetic Algorithms, Combinatorial Optimization, Machine Learning, Classifiers.
Abstract:
This study investigates improving the performance of genetic algorithms applied to the solution space using
machine learning and knowledge transfer. Genetic algorithms are powerful techniques that have been suc-
cessfully used to explore various problem spaces, such as solution space, program space, and heuristic space.
Recently, researchers have found that transferring knowledge between these spaces can significantly enhance
the quality of solutions and reduce computational costs. While this transfer of knowledge works well in pro-
gram and heuristic spaces due to their indirect nature, it is more challenging in the solution space. This is
because each problem in the solution space has its own unique representation, making it difficult to trans-
fer knowledge effectively. This study explores how machine learning, specifically using classifiers, can help
bridge this gap and facilitate knowledge transfer between different solution spaces. We train two classifiers,
namely, Support Vector Machines and Random Forests, using data consisting of fitness landscape measures
from a source genetic algorithm to determine if a chromosome is a local optimum or not. This information
is then used during the execution of a target genetic algorithm to identify and remove potential local optima
from the population. We tested this approach on two challenging optimization problems: the examination
timetabling problem (ETP) and the capacitated vehicle routing problem (CVRP). Our results show that this
method provides statistically significant improvements over genetic algorithms that do not use knowledge
transfer, both in terms of solution quality and computational efficiency. Moreover, we found that random
forests were more effective than support vector machines for transferring knowledge between the source and
target genetic algorithms.
1 INTRODUCTION
Transfer learning involves transferring knowledge
from a source optimization algorithm to a target opti-
mization algorithm with the aim of improvements in
the target domain (Zhuang et al., 2020). The benefits include improved quality of solutions, reduced computational cost and a reduction in the data needed to solve the problem in the target domain. This study focuses on transfer learning in evolutionary algorithms, in particular genetic algorithms.
While transfer learning has been explored in ge-
netic algorithms searching the program (Russell and
Pillay, 2023), heuristic (Scheepers and Pillay, 2021),
and design (Nyathi and Pillay, 2021) spaces, its ap-
plication to genetic algorithms exploring the solution
space for single-objective combinatorial optimization
problems remains largely unexplored. The primary
challenge lies in determining an appropriate mapping
from the source to the target domain, as chromosome
representations in the solution space are problem-specific rather than domain-independent, unlike in the program, heuristic, and design spaces.
This study presents a new domain-independent
approach for addressing the challenge of trans-
ferring knowledge between genetic algorithms us-
ing machine learning. Specifically, two classifiers,
namely, Support Vector Machines (SVM) and Ran-
dom Forests (RF) are trained to identify local op-
tima based on fitness landscape measures such as
ruggedness, neutrality, evolvability, and searchabil-
ity collected from the source genetic algorithm. Once
trained, these classifiers are applied to the target ge-
netic algorithm to identify and remove potential lo-
cal optima from its population. The approach was
tested on two complex problems: the examination
timetabling problem (ETP) and the capacitated ve-
hicle routing problem (CVRP). The results indicate
that this knowledge transfer method improves perfor-
mance compared to standard genetic algorithms, with
Random Forests outperforming Support Vector Ma-
chines in both problem domains.
The key contribution of this study is:
A machine learning approach for transfer learning
in genetic algorithms exploring the solution space.
Although we focused on combinatorial optimiza-
tion in this paper, the approach can also be applied to
continuous optimization and machine learning prob-
lems. This research can also be seen as contributing to
the growing effort by the computational intelligence
community to use machine learning to improve the
performance of optimization techniques.
2 BACKGROUND AND RELATED
WORK
This section provides a brief overview of genetic algorithms, transfer learning, fitness landscape analysis and machine learning techniques for classification.
A discussion on some of the works that have investi-
gated transfer learning in evolutionary algorithms is
also provided.
2.1 Genetic Algorithms
Genetic algorithms (GAs) are optimization methods
inspired by the principles of natural selection (Gold-
berg, 1989). They work by evolving a population of
candidate solutions (which are represented as chro-
mosomes) by iteratively applying the processes of
selection, crossover (recombination), and mutation.
These processes allow for the candidate solutions to
be progressively refined until an optimal or satisfac-
tory result is achieved. For a detailed overview of
GAs, see (Goldberg, 1989). In this study, we used
the generational GA as outlined in Algorithm 1.
2.2 Transfer Learning
Transfer learning (TL) is a technique that involves
transferring knowledge gained from one domain or
task (the source) to improve performance in a differ-
ent but related domain or task (the target) (Zhuang
et al., 2020). This technique is particularly use-
ful when dealing with limited data in the target do-
main, as it leverages the knowledge acquired from the
source domain to enhance the learning process.
In optimization, transfer learning can help by transferring learned features, patterns, or models from one optimization problem to another, with the goal of improving the convergence times and quality of the solutions obtained by the optimization algorithms. TL has been successfully and widely applied in fields such as image recognition, natural language processing, and more recently, in various optimization algorithms. For an in-depth discussion of TL and its diverse applications, please refer to (Zhuang et al., 2020).

Algorithm 1: Pseudocode for a genetic algorithm.
Data: Population size N
Result: Individual with the highest fitness as the best solution
Generate a population of individuals of size N;
Calculate the fitness of each individual in the population;
while termination criterion is not met do
    Select one or two individuals with the best fitness using a selection method;
    Generate new individuals for the next generation by applying genetic operators to previously selected individuals;
    Evaluate each new individual to determine its fitness;
    Replace all the individuals in the old population with the new individuals;
end
return Individual with the highest fitness
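To make the generational loop of Algorithm 1 concrete, the sketch below gives a minimal Python implementation for a generic maximization problem. It is only an illustration of the control flow; the fitness function, operator implementations and parameter defaults are assumptions and do not reproduce the ECJ-based implementation used in this study.

```python
import random

def generational_ga(fitness, random_individual, crossover, mutate,
                    pop_size=100, generations=50, tournament_size=3):
    """Minimal generational GA mirroring Algorithm 1 (here for maximization)."""
    population = [random_individual() for _ in range(pop_size)]
    scores = [fitness(ind) for ind in population]

    def tournament():
        # Tournament selection: return the fittest of a small random sample.
        contenders = random.sample(range(pop_size), tournament_size)
        return population[max(contenders, key=lambda i: scores[i])]

    for _ in range(generations):                      # termination criterion
        offspring = []
        while len(offspring) < pop_size:
            child = mutate(crossover(tournament(), tournament()))
            offspring.append(child)
        population = offspring                        # full generational replacement
        scores = [fitness(ind) for ind in population]

    best = max(range(pop_size), key=lambda i: scores[i])
    return population[best], scores[best]
```

The problem-specific pieces (representation, fitness, crossover and mutation) are passed in as functions, in the same way that Section 3.2 instantiates them for the ETP and CVRP.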
2.3 Fitness Landscape Analysis
Fitness landscape analysis (FLA) is an important
technique for understanding how optimization algo-
rithms, such as GAs, solve complex combinatorial op-
timization problems (Zou et al., 2022). The concept,
introduced by Sewall Wright in 1932 (Wright et al.,
1932) and formalized by Stadler (Stadler, 2002), can
be described as follows:
$$(X, N, f) \qquad (1)$$
where $X$ represents the set of potential solutions from the decision variable space, $N$ is the neighborhood operator that defines the relationships between solutions, and $f$ is a function that maps solutions to their fitness values.
In general, FLA helps visualize and understand
how different solutions relate to their fitness values.
Each point in the fitness landscape corresponds to a
potential solution, with its height reflecting the so-
lution’s quality. By analyzing this landscape, one
can identify key features such as local and global op-
tima, basins of attraction, and the overall structure of
the solution space. It also provides insights into the
landscape’s ruggedness, neutrality, searchability, and
evolvability (Zou et al., 2022). Some examples of fit-
ness landscapes are shown in Figure 1.
Figure 1: Some examples of fitness landscapes (Hassan and
Pillay, 2022).
2.4 Machine Learning Techniques for
Classification
Classification is a fundamental task in machine learn-
ing where the goal is to predict the categorical label
of new observations based on training data. Among
the various techniques used for classification, Support
Vector Machines (SVMs) (Cervantes et al., 2020),
(Sen et al., 2020) and Random Forests (RFs) (Schon-
lau and Zou, 2020) are two widely adopted methods
due to their effectiveness and versatility.
2.4.1 Support Vector Machines (SVMs)
Support Vector Machines (SVMs) are a power-
ful classification technique originally developed by
Vladimir Vapnik and his colleagues (Cortes, 1995).
The core idea of SVMs is to find a hyperplane that
best separates different classes in the feature space.
For linearly separable data, this involves identifying a
hyperplane that maximizes the margin—the distance
between the hyperplane and the nearest data points
from each class, known as support vectors. Mathe-
matically, this can be formulated as a quadratic opti-
mization problem:
$$\text{minimize} \quad \frac{1}{2}\|w\|^2 \qquad (2)$$
$$\text{subject to} \quad y_i (w^{T} x_i + b) \ge 1 \quad \text{for all } i \qquad (3)$$
where $w$ represents the weights, $x_i$ are the feature vectors, $y_i$ are the class labels, and $b$ is the bias term.
For non-linearly separable data, SVMs use kernel
functions to transform the input space into a higher-
dimensional space where a linear separation is pos-
sible. Common kernel functions include the polyno-
mial kernel and the radial basis function (RBF) ker-
nel:
$$K(x_i, x_j) = \exp\!\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right) \qquad (4)$$
where $\sigma$ is a parameter that controls the spread of the kernel function.
SVMs are well-regarded for their generalization
capabilities and have been successfully applied in var-
ious domains such as image recognition, text classifi-
cation, and bioinformatics (Cortes, 1995).
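To illustrate Equation (4), the snippet below evaluates the RBF kernel for two feature vectors with NumPy; the vectors and the value of sigma are arbitrary examples, not values used in the study.

```python
import numpy as np

def rbf_kernel(x_i, x_j, sigma=1.0):
    """RBF kernel of Equation (4): exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

# Example with two 4-dimensional feature vectors and an arbitrary sigma.
print(rbf_kernel([0.2, 0.5, 0.1, 0.9], [0.3, 0.4, 0.2, 0.8], sigma=0.5))
```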
2.4.2 Random Forests (RFs)
Random Forests (RFs) are an ensemble learning
method that combines multiple decision trees to im-
prove classification performance (Breiman and Cut-
ler, 2001). A RF constructs multiple decision trees
during training and outputs the class that is the mode
of the classes (for classification) or the mean predic-
tion (for regression) of the individual trees. Each
tree is built from a bootstrap sample of the training
data, and at each split in the tree, a random subset
of features is considered, which introduces additional
randomness and improves the model’s generalization
ability.
The algorithm can be summarized as follows:
1. Draw B bootstrap samples from the training data.
2. For each bootstrap sample, grow a decision tree
using a random subset of features at each node.
3. Aggregate the predictions of all trees to make the
final classification decision.
The ensemble approach of RFs helps to reduce
variance and prevent overfitting, making them highly
effective for a wide range of classification tasks, in-
cluding those involving large and complex datasets
(Breiman and Cutler, 2001).
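The three steps above can be illustrated with a small bootstrap-and-vote sketch built on scikit-learn decision trees; this is purely illustrative and is not the Weka Random Forest implementation used in the experiments.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def random_forest_fit(X, y, n_trees=100, seed=None):
    """Steps 1-2: grow each tree on a bootstrap sample, with a random
    subset of features considered at every split (X, y are NumPy arrays)."""
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), size=len(X))           # bootstrap sample
        tree = DecisionTreeClassifier(max_features="sqrt")   # random feature subset per split
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def random_forest_predict(trees, X):
    """Step 3: aggregate the trees' predictions by majority vote."""
    votes = np.stack([tree.predict(X) for tree in trees])    # shape (n_trees, n_samples)
    majority = lambda col: np.bincount(col.astype(int)).argmax()
    return np.apply_along_axis(majority, axis=0, arr=votes)
```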
2.5 Transfer Learning in Genetic
Algorithms
As mentioned earlier, this paper investigates the appli-
cation of transfer learning in genetic algorithms ex-
ploring the solution space for single-objective com-
binatorial optimization. Specifically, it employs ma-
chine learning, using classification to transfer knowl-
edge between source and target genetic algorithms.
Previous research on transfer learning in evolution-
ary algorithms has predominantly focused on multi-
objective optimization.
Jiang et al. (Jiang et al., 2020b) introduced
MMTL-DMOEA, a memory-driven manifold transfer
learning-based evolutionary algorithm for dynamic
multi-objective optimization. By integrating memory
mechanisms with manifold transfer learning, their al-
gorithm significantly enhanced solution quality and
reduced computational costs. Jiang et al. (Jiang
et al., 2020a) proposed KT-DMOEA, a knee point-
based imbalanced transfer learning method that trans-
fers predicted knee points to reduce computational
cost, leading to substantial improvements in solution
quality.
Liu and Wang (Liu and Wang, 2021) combined a
population prediction strategy (PPS) with a transfer
learning-based dynamic multi-objective evolutionary
algorithm (Tr-DMOEA) to address dynamic multi-
objective optimization problems (DMOPs). This
hybrid approach outperformed both PPS and Tr-
DMOEA by effectively utilizing historical informa-
tion to initialize populations in new environments.
Jiang et al. (Jiang et al., 2017) presented a frame-
work integrating transfer learning with evolutionary
algorithms to tackle DMOPs. By generating an ini-
tial population pool from past experiences and ap-
plying population-based evolutionary algorithms, this
approach notably enhanced the performance of sev-
eral well-known algorithms on benchmark functions.
Huang et al. (Huang et al., 2023) intro-
duced a transfer learning-based evolutionary algo-
rithm (TLEA) framework for multi-objective opti-
mization problems. This framework decomposes
complex problems into manageable subtasks, opti-
mizing them collaboratively through transfer learn-
ing, and demonstrated superior performance on
benchmark problems.
Zhang et al. (Zhang et al., 2023) transferred knowledge from an evolutionary algorithm-neural network hybrid solving low-order problems to the same hybrid solving high-order functions, utilizing the optima found for the low-order problems by the evolutionary algorithm and the neural network model.
Unlike previous works, this study transfers knowl-
edge from a genetic algorithm exploring the solu-
tion space for combinatorial optimization. A classi-
fier is employed to learn and identify local optima in
the source genetic algorithm, which are then removed
from the population in the target genetic algorithm.
3 PROPOSED GA APPROACH
FOR KNOWLEDGE TRANSFER
This section discusses the proposed GA approach
used to solve the ETP and CVRP. We start by dis-
cussing the two problem domains and this is then
followed by a discussion of how the GA was im-
plemented. Finally, we explain how knowledge was
transferred between the GAs to improve performance.
3.1 Problem Domains
This study considered two combinatorial optimization
problems, namely, the examination timetabling prob-
lem (ETP) and the capacitated vehicle routing prob-
lem (CVRP).
3.1.1 Examination Timetabling Problem (ETP)
The ETP is a well-known optimization problem that
involves scheduling exams within specified periods
and rooms while strictly adhering to hard constraints
and minimizing violations of soft constraints. In this
study, we used the ITC 2007 examination timetabling
benchmark set. The hard constraints for the bench-
mark set include:
Ensuring that no student is scheduled to take more
than one exam at the same time.
Ensuring that the number of students assigned to
a venue does not exceed its capacity.
Guaranteeing that the duration of an exam fits
within the allocated period.
Respecting period-related constraints, such as
scheduling one exam before another in the se-
quence.
Satisfying room-related constraints, such as as-
signing exams to specific venues.
The soft constraints for the benchmark set include:
Minimizing instances where students have to take
two exams back-to-back or on the same day.
Reducing the clustering of exams to ensure a more
even distribution for students.
Avoiding the scheduling of exams with mixed du-
rations in the same period.
Preferentially scheduling larger exams later in the
timetable and minimizing the use of specific peri-
ods and rooms.
A feasible timetable is one that satisfies all hard
constraints. The objective value of a timetable is
calculated as the total cost of the violated soft con-
straints, as shown in Equation (5).
$$O_{ETP} = \sum_{i=1}^{n_{soft}} C_{soft}(i) \cdot S(i) \qquad (5)$$
where $n_{soft}$ represents the total number of soft constraints, $C_{soft}(i)$ is the cost (penalty weight) associated with soft constraint $i$, and $S(i)$ represents the number of violations of soft constraint $i$.
The goal is to create a timetable that violates no
hard constraints while minimizing the number of soft
constraint violations. The characteristics of the data
instances in the ITC 2007 benchmark set are shown
in Table 1.
Table 1: Characteristics of the ITC2007 ETP benchmark
instances.
Instance Exams Students Periods Conflict Density Rooms
1 607 7891 54 0.05 7
2 870 12743 40 0.01 49
3 934 16439 36 0.03 48
4 273 5045 21 0.15 1
5 1018 9253 42 0.009 3
6 242 7909 16 0.06 8
7 1096 14676 80 0.02 15
8 598 7718 80 0.05 8
9 169 655 25 0.08 3
10 214 1577 32 0.05 48
11 934 16439 26 0.03 40
12 78 1653 12 0.18 50
Conflict Density: number of conflicts / (number of exams)
3.1.2 Capacitated Vehicle Routing Problem
(CVRP)
The CVRP, on the other hand, involves finding the most cost-effective set of routes for delivering goods to a group of customers while meeting strict constraints. In this study, we used the Christofides and Golden benchmark sets. The hard constraints associ-
ated with these benchmarks include:
The vehicle must start its route at the depot and
return to the depot after completing all deliveries.
The total demand on a route must not exceed the
vehicle’s capacity.
Each customer must be visited exactly once on a
route.
The duration of any route must not exceed a spec-
ified global maximum.
The main objective is to minimize the total cost
of the route set. The objective value of a solution is
calculated by summing the costs of all routes, which
include the distances between customers and the ser-
vice time for each customer, as shown in Equation (6).
$$O_{CVRP} = \sum_{i=1}^{n}\sum_{j=1}^{n} d_{ij} + \sum_{i=1}^{n} t_i \qquad (6)$$
where $n$ is the total number of customers, $d_{ij}$ is the distance traveled between customers $i$ and $j$ on a route, and $t_i$ is the service time for customer $i$.
The characteristics of the data instances in the
Golden and Christofides benchmark sets are provided
in Table 2 and Table 3 respectively.
Table 2: Characteristics of the Golden Benchmark set.
Instances Capacity Customers Max. length Service time Vehicles
1 550 240 650 0 10
2 700 320 900 0 10
3 900 400 1200 0 10
4 1000 480 1600 0 12
5 900 200 1800 0 5
6 900 280 1500 0 8
7 900 360 1300 0 9
8 900 440 1200 0 11
9 1000 255 0 14
10 1000 323 0 16
11 1000 399 0 18
12 1000 482 0 19
13 1000 252 0 27
14 1000 320 0 30
15 1000 396 0 34
16 1000 480 0 38
17 200 240 0 22
18 200 300 0 22
19 200 360 0 33
20 200 420 0 41
Table 3: Characteristics of the Christofides Benchmark set.
Instances Capacity Customers Max. length Service time Vehicles
1 160 51 0 5
2 140 76 0 10
3 200 101 0 8
4 200 151 0 12
5 200 200 0 17
6 160 51 200 10 6
7 140 76 160 10 11
8 200 101 230 10 9
9 200 151 200 10 14
10 200 200 200 10 18
11 200 121 0 7
12 200 101 0 10
13 200 121 720 50 11
14 200 101 1040 90 11
3.2 Genetic Algorithm for ETP and
CVRP
The GA used in this study is a generational algo-
rithm where offspring replace parents in each genera-
tion (Goldberg, 1989). The pseudocode for the GA is
shown in Algorithm 1.
3.2.1 Chromosome Representation
For the ETP, a chromosome typically represents an
exam schedule, specifying the time slots and rooms
assigned to each exam. In this study, the chromosome
was encoded as an integer sequence with the value for
each gene determined using a simple encoding func-
tion. The length of the chromosome was equal to the
number of exams to be scheduled and these exams
were arranged in ascending order within the chromo-
some. For example, the following chromosome ’10
219 374 362 226 221’ represents an exam schedule
with 6 exams. The first exam is assigned an integer
value of 10, the second exam a value of 219, the third
exam a value of 374 and so on. The integer values are
obtained by using a simple encoding function shown
in Equation 7 below:
$$IntValue = rIndex \times numPeriods + tsIndex \qquad (7)$$
where:
rIndex refers to the index of the most suitable
room for the exam,
numPeriods represents the total number of avail-
able periods, and
tsIndex indicates the most appropriate period for
the exam.
The decoding process is as follows: each integer value
in the chromosome is decoded into its corresponding
room and period using Equation 8 and 9. The room
index (rIndex) is calculated by dividing the integer
by the total number of periods, and the time slot index
(tsIndex) is derived from the remainder of this divi-
sion. This decoding allows us to accurately extract
the room and period for each exam, thus reconstruct-
ing the exam schedule from the chromosome.
$$rIndex = \left\lfloor \frac{IntValue}{numPeriods} \right\rfloor \qquad (8)$$
$$tsIndex = IntValue \bmod numPeriods \qquad (9)$$
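A minimal sketch of the encoding and decoding defined by Equations (7)-(9); the room index, period index and number of periods in the example are hypothetical.

```python
def encode_gene(r_index, ts_index, num_periods):
    """Equation (7): pack a (room, period) assignment into a single integer gene."""
    return r_index * num_periods + ts_index

def decode_gene(int_value, num_periods):
    """Equations (8) and (9): recover the room and period indices from a gene."""
    r_index = int_value // num_periods    # integer division (Equation 8)
    ts_index = int_value % num_periods    # remainder (Equation 9)
    return r_index, ts_index

# Hypothetical example with 40 periods: room 5, period 19 -> gene 219 -> (5, 19).
gene = encode_gene(5, 19, num_periods=40)
assert gene == 219 and decode_gene(gene, num_periods=40) == (5, 19)
```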
For the Capacitated Vehicle Routing Problem
(CVRP), the chromosome consisted of a vector of in-
tegers of length N, representing the number of cus-
tomers to be served. Each gene in the vector corre-
sponded to a given customer. The sequence of genes
in the vector determined the service order of cus-
tomers, and the set of customers that made up each
route was limited by the capacity of the vehicles. That
is, each customer was assigned to a specific vehicle
and when the vehicle’s capacity was exceeded, a new
route was started. The chromosome structure used is
shown in Figure 2.
Figure 2: Solution encoding scheme using a vector of inte-
gers.
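The capacity-based splitting of the customer sequence into routes described above can be sketched as follows; the customer demands and vehicle capacity in the example are invented for illustration.

```python
def split_into_routes(chromosome, demand, capacity):
    """Scan the customer sequence and start a new route whenever adding
    the next customer would exceed the vehicle capacity."""
    routes, current, load = [], [], 0
    for customer in chromosome:
        if load + demand[customer] > capacity:
            routes.append(current)
            current, load = [], 0
        current.append(customer)
        load += demand[customer]
    if current:
        routes.append(current)
    return routes

# Invented example: six customers, vehicle capacity 10.
demand = {1: 4, 2: 5, 3: 3, 4: 6, 5: 2, 6: 4}
print(split_into_routes([3, 1, 5, 2, 6, 4], demand, capacity=10))
# -> [[3, 1, 5], [2, 6], [4]]  (each route's total demand stays within capacity)
```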
3.2.2 Population Initialization
We used two different approaches to generate the initial population for the ETP and CVRP. For the ETP, the initial population is generated randomly. However, we implemented a special method called a hard constraint solver, whose sole purpose was to generate random initial solutions that did not violate any hard constraints for the problem instances under consideration. This approach helped to significantly reduce the runtime for the GA. The pseudocode for the hard constraint solver is shown in Algorithm 2.
Algorithm 2: Hard Constraint Solver.
Data: List of exams, available rooms,
available periods
Result: Solution satisfying hard constraints
Begin with an empty scheduling solution;
Randomize the order of exams;
while there are unscheduled exams do
Choose an unscheduled exam;
Verify its scheduling feasibility without
violating constraints, prioritizing based
on coincidence and precedence
constraints;
Identify suitable periods and rooms for
scheduling the exam;
Allocate the exam to a period and room;
Update the solution with the scheduled
exam;
end
return The updated scheduling solution
The classical and widely used Clarke-Wright
(Lysgaard, 1997) heuristic method was used to gen-
erate the initial population for the CVRP.
3.2.3 Fitness Evaluation
We used Equation (5) and Equation (6) as fitness
functions for the ETP and CVRP respectively.
3.2.4 Selection Method
Tournament selection (Yadav and Sohal, 2017) was
used to select parents for the genetic operators.
3.2.5 Genetic Operators
We used the two-point crossover operator for the ETP.
This approach was favored for its simplicity in im-
plementation and its ability to introduce increased
diversity among the offspring compared to the one-
point crossover. Additionally, we used the partially-
mapped crossover (PMX) and order crossover (OX)
operators (Ahmed et al., 2023) for the CVRP. For mu-
tation, we used the following operators: swap, two-opt, scramble, inversion and displacement.
More information on these mutation operators can be
found in (Daglayan and Karakaya, 2016).
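As an illustration, the sketch below implements two of the operators listed above, two-point crossover and swap mutation, for integer-vector chromosomes; the mutation rate is an illustrative default.

```python
import random

def two_point_crossover(parent1, parent2):
    """Exchange the segment between two randomly chosen cut points."""
    a, b = sorted(random.sample(range(1, len(parent1)), 2))
    return parent1[:a] + parent2[a:b] + parent1[b:]

def swap_mutation(chromosome, rate=0.1):
    """With the given probability, swap the values of two randomly chosen genes."""
    child = list(chromosome)
    if random.random() < rate:
        i, j = random.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child
```

For the CVRP's permutation encoding, plain two-point crossover can duplicate or drop customers, which is why permutation-preserving operators such as PMX and OX are used for that problem.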
3.2.6 Replacement Strategy
Although elitism is a commonly used strategy in GAs, it was not used in this study in order to maintain focus on the effects of machine learning-based knowledge transfer. Future work will explore the incorporation of elitism to examine its impact on preserving the best individuals and whether it enhances the GA's ability to learn local optima more effectively.
3.2.7 Termination Criteria
For both problems, we set the termination criteria to
be the maximum number of generations. We experi-
mented with various values for this and the best values
are shown in Table 4.
3.3 Knowledge Transfer in the GA
Solution Space
This section describes how knowledge about the local
optima in the solution space was transferred between
the source GA and the target GA.
3.3.1 Fitness Landscape Measures
The following fitness landscape measures were used to classify whether a particular element of the population was a local optimum or not: autocorrelation, correlation length, neutrality, and evolvability. These measures were selected because they provide single numerical values for each solution in the population, thereby making it easier to quantify each solution in terms of its fitness landscape measures.
Autocorrelation and Correlation Length (Brandt,
2001), (Merkuryeva and Bolshakovs, 2011):
These measures were used to assess the rugged-
ness of the fitness landscape. Ruggedness refers
to the variability or “roughness” of the fitness
landscape. High ruggedness implies the pres-
ence of many local optima, making it challeng-
ing for optimization algorithms to find the global
optimum. Generally, autocorrelation examines
the similarity between values at different points,
while correlation length measures the distance
over which points are correlated. A rapidly
decaying autocorrelation function suggests high
ruggedness and a high likelihood of local optima.
Conversely, a short correlation length implies a
rugged landscape with frequent local optima, in-
dicating that a solution in such a region is likely
near a local optimum.
Neutrality: Neutrality refers to the extent to which
small changes in a solution do not result in
changes in fitness. In a neutral landscape, many
neighboring solutions have the same or similar fit-
ness. The Average Neutrality Ratio (ANR) (Van-
neschi et al., 2006) was used to quantify neutral-
ity. The neutrality ratio of a point is the proportion
of neutral neighbors to the total neighbors of that
point, and ANR is the average of these neutrality
ratios across the landscape. Higher ANR values
indicate a more neutral landscape.
Evolvability: Evolvability reflects the ability of
a population to improve its fitness over genera-
tions. A highly evolvable landscape allows the
algorithm to navigate towards higher fitness re-
gions efficiently. Evolvability was measured us-
ing the Accumulated Escape Probability (AEP)
(Lu et al., 2011), derived from the Fitness Prob-
ability Change (FPC), which is the mean of the
average escape probabilities from different points.
Higher AEP values indicate a more evolvable
landscape.
Local optima are identified by comparing a solution's fitness with the fitness of its neighboring solutions within the fitness landscape. A solution is deemed to be a local optimum if its fitness is lower (in our case, since both problems are minimization problems) than that of all its neighbors. This approach helps the GA avoid being trapped in suboptimal regions of the search space. While this method may resemble hill-climbing, our focus was on detecting and eliminating such local optima through the learning process, rather than performing traditional hill-climbing.
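The neighbour-comparison test described above can be sketched as follows for a minimization problem; the single-swap neighbourhood used here is an illustrative assumption about how neighbours are generated.

```python
def swap_neighbours(solution):
    """All solutions reachable by swapping one pair of genes (illustrative neighbourhood)."""
    for i in range(len(solution)):
        for j in range(i + 1, len(solution)):
            neighbour = list(solution)
            neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
            yield neighbour

def is_local_optimum(solution, fitness, neighbours=swap_neighbours):
    """True if the solution's fitness is strictly lower than (i.e. better, when
    minimizing) the fitness of every neighbouring solution."""
    f = fitness(solution)
    return all(f < fitness(n) for n in neighbours(solution))
```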
3.3.2 Source Genetic Algorithm
During execution of the source GA, a dataset was cre-
ated. The dataset was built by collecting solutions
found in the final population at different stages of the
algorithm’s run. We specifically looked at solutions
found after 10, 15, 20, 25, 30, 35, 40, and 45 genera-
tions. Each data instance in this dataset represented a
single solution. It consisted of two parts as shown in
Figure 3:
Features: A set of independent variable values de-
rived from fitness landscape measures. These fea-
tures described the characteristics of the solution.
Target: A corresponding target value indicating
whether the solution is a local optimum (yes or
no). We used the following binary values (1 for
local optimum, 0 for not).
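Putting the two parts together, each solution in the recorded populations yields one labelled instance; the sketch below abstracts the FLA computation and the local-optimum test behind placeholder functions, so the function names are assumptions for illustration.

```python
def build_dataset(population, fla_measures, is_local_optimum):
    """Create one labelled training instance per solution in a recorded population.

    fla_measures(solution)     -> e.g. [autocorrelation, correlation_length, ANR, AEP]
    is_local_optimum(solution) -> True/False (label: 1 = local optimum, 0 = not)
    """
    X, y = [], []
    for solution in population:
        X.append(fla_measures(solution))
        y.append(1 if is_local_optimum(solution) else 0)
    return X, y
```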
3.3.3 Model Training and Testing
The machine learning algorithms (SVM and RF) were
trained using the data obtained after executing the
Figure 3: Example of a dataset with FLA features.
source Genetic Algorithm (GA) (refer to 3.3.2). The
dataset was then split into training and testing sets
with a 80-20 ratio. The training set was used to train
the models, while the testing set was used to evalu-
ate their performance. Min-Max Normalization was
applied to ensure that all input variables contributed
equally to the model training. Grid search (Sun et al.,
2021) was used to find the optimal hyperparameters for both SVM and RF. The best hyperparameter values
for the two classifiers are listed in Table 4.
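A scikit-learn sketch of the training procedure described above (80-20 split, Min-Max normalization, grid search and the evaluation metrics listed below); the hyperparameter grids are illustrative, and the study itself used the Weka implementations of SVM and RF with the values in Table 4.

```python
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

def train_classifier(X, y, model="rf"):
    """Fit an SVM or a Random Forest on FLA features: 80-20 split,
    Min-Max scaling, and a grid search over a small hyperparameter grid."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    if model == "svm":
        estimator, grid = SVC(kernel="rbf"), {"clf__C": [0.1, 1.0, 10.0],
                                              "clf__gamma": [0.001, 0.01, 0.1]}
    else:
        estimator, grid = RandomForestClassifier(), {"clf__n_estimators": [50, 100, 200]}

    pipeline = Pipeline([("scale", MinMaxScaler()), ("clf", estimator)])
    search = GridSearchCV(pipeline, grid, cv=5)
    search.fit(X_train, y_train)

    # Accuracy, precision, recall and F1 on the held-out 20% test set.
    print(classification_report(y_test, search.predict(X_test)))
    return search.best_estimator_
```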
Model Evaluation. The performance of the models
was assessed using the following metrics:
Accuracy. The proportion of correctly predicted
instances out of the total instances.
Precision. The ratio of true positive predictions
to the total predicted positives.
Recall (Sensitivity or True Positive Rate). The
ratio of true positive predictions to the total actual
positives. It shows the model’s ability to find all
the positive instances.
F1 Score. The harmonic mean of precision and recall, used to balance the trade-off between the two when they are not equally important.
The next section describes how the models produced
were used in the target genetic algorithm.
3.3.4 Target Genetic Algorithm
Knowledge from the source GA was transferred to the target GA in the form of two classifier models, with each model consisting of FLA10, FLA15, FLA20, FLA25, FLA30, FLA35, FLA40 and FLA45 sub-models. If a solution was determined to be a local optimum, it was subsequently eliminated from the population. This was done in order to allow the target GA to look for better solutions by avoiding those solutions that were likely to lead to local optima. In addition, by eliminating these solutions from the population, our hope was that the convergence time for the target GA would be improved as well.
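The transfer step in the target GA can then be sketched as filtering each population with the ensemble of sub-models via majority voting; the classifier interface follows the training sketch in Section 3.3.3, and the exact integration point in the GA loop is an assumption.

```python
def remove_local_optima(population, fla_measures, sub_models):
    """Keep only solutions that the ensemble does not classify as local optima.

    sub_models: the trained classifiers (e.g. FLA10 ... FLA45); a solution is
    discarded when a majority of them predict label 1 (local optimum).
    """
    survivors = []
    for solution in population:
        features = [fla_measures(solution)]
        votes = sum(int(model.predict(features)[0]) for model in sub_models)
        if votes <= len(sub_models) // 2:      # no majority flagged it
            survivors.append(solution)
    return survivors
```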
4 EXPERIMENTAL SETUP
This section provides an overview of the experimental
setup used to assess the effectiveness of the GA-FLA, i.e., the GA with FLA-based knowledge transfer.
4.1 Source and Target Problem
Instances
The problem instances in Section 3.1 were divided
into source and target sets. The GA was executed for
each problem instance in the source set, generating
data over 10, 15, 20, 25, 30, 35, 40 and 45 genera-
tions for training the classification models.
For the ETP, we opted for a source domain set
comprising instances that were computationally less
challenging to solve:
Source set instances: 4, 8, 11, 12
Target set instances: 1, 2, 3, 5, 6, 7, 9, 10
For the CVRP, we implemented a simple K-Means
clustering method (Sinaga and Yang, 2020) on the
benchmark datasets to group instances sharing com-
mon characteristics. This strategy was preferred over
random selection, as our initial trials showed that
random selection frequently resulted in poor results.
Subsequently, we selected representative instances
from each cluster to bolster the source domain.
Golden
Source set instances: 1, 4, 8, 12, 18, 20
Target set instances: 2, 3, 5, 6, 7, 9, 10, 11, 13,
14, 15, 16, 17, 19
Christofides
Source set instances: 3, 4, 9, 10, 13
Target set instances: 1, 2, 5, 6, 7, 11, 14
4.2 Experiments
In order to evaluate the performance of GA-FLA, the
following two experiments were conducted:
Experiment 1: GA-FLA Comparison with Data
from Different Generations - Compares the per-
formance of the GA-FLA with the classifier learn-
ing with data from 10, 15, 20, 25, 30, 35, 40,
and 45 generations. We also include an ensemble model which combines all the sub-models (i.e., the 10, 15, 20, 25, 30, 35, 40, and 45 generation models) using majority voting.
Experiment 2: GA and GA-FLA Performance
Comparison - This experiment compares the per-
formance of the genetic algorithm without transfer learning (GA) to that of the genetic algorithm with transfer learning (GA-FLA) for both problem domains.
Due to the stochastic nature of the algorithms, 30 runs, each with a different random number generator seed, were performed for each problem instance for both the source and target GAs.
It is important to note that fairness in computa-
tional budget was ensured by using the same number
of generations (50) for both the standard GA and the
GA-FLA. While additional machine learning-based
steps were introduced for GA-FLA, the overall eval-
uation time included this training and classification
process. Therefore, the reported times reflect the ac-
tual computational cost of both approaches fairly.
4.3 Statistical Tests
We conducted hypothesis testing using the Z statistic
to assess whether there was a significant difference in
the performance between GA-FLA and a basic GA. A
confidence level of 95% (i.e., p-values below 0.05 are
statistically significant) was used for the statistical tests.
4.4 GA and Classifier Parameters
The GA approach was implemented using the ECJ
(Evolutionary Computation in Java) toolkit (Luke,
1998) while the Weka (Bouckaert et al., 2016) toolkit
was employed for the SVM and RF implementations.
To identify the most effective parameter values,
we adopted an empirical approach involving system-
atic experimentation and iterative adjustment of pa-
rameters. The best parameter values obtained during
the fine-tuning process are listed in Table 4.
Table 4: Best parameter values.
Technique Parameter Value
GA Population Size 100
Crossover Rate 0.8
Mutation Rate 0.1
Selection Operator Tournament
Tournament Size 3
Crossover Operator Two-Point / OX
Mutation Operator Two-Opt
Max. Generations 50
SVM Kernel Type RBF (Radial Basis Function)
Cost 1.0
Gamma 0.01
Cache Size 100
RF Number of Trees 100
Maximum Depth 0 (unlimited)
Minimum Size for Split 2
4.5 Technical Specifications
The source and target GAs were executed on a computing system with the following hardware configuration: an Intel Core i7 octa-core processor (clocked at 2.8 GHz), 16 GB RAM, an NVIDIA GeForce RTX 2080 GPU and a 500 GB SSD.
5 RESULTS AND DISCUSSION
This section compares the performance of the GA-
FLA on the two experiments outlined in Section 4.2.
5.1 Experiment 1: GA-FLA
Comparison with Data from
Different Generations
This section compares the performance of GA-FLA
using classifiers trained with data from different
stages of the source GA. While data was collected at
multiple intervals, specifically at generations 10, 15,
20, 25, 30, 35, 40, and 45, only the results from the
best-performing generations are shown in Tables 5 to 10 for clarity and relevance.
Table 5: GA-FLA Comparison with RF for ETP.
Instance Gen30 Ensemble
Best Average Time(secs) Best Average Time(secs)
1 4928 5401 2300 4792 5120 2300
2 482 512 2300 432 490 2300
3 7930 8359 2300 7831 8210 2300
5 2647 2901 3600 2602 2890 3600
6 25926 26870 2300 25730 26640 2300
7 4060 4410 3600 3922 4200 3600
9 972 1080 2300 965 1010 2300
10 13390 14214 3600 13222 14080 3600
Table 6: GA-FLA Comparison with SVM for ETP.
Instance Gen30 Ensemble
Best Average Time(secs) Best Average Time(secs)
1 5170 5860 2300 4930 5310 2300
2 572 599 2300 490 540 2300
3 8190 8730 2300 8002 8480 2300
5 2710 3050 3600 2678 2950 3600
6 26065 26941 2300 25901 26872 2300
7 4100 4560 3600 4042 4431 3600
9 985 1092 2300 974 1040 2300
10 13572 14650 3600 13410 14360 3600
The best objective value and average objective value over the 30 runs are listed. It is evident from the
tables that the ensemble classifier outperforms the in-
dividual classifiers. This was found to be statistically
significant at a 95% level of confidence for the ETP.
However, for the CVRP, the results were found to be
significant only at the 90% level of confidence. There
is also no difference in computational cost despite the
improvement in performance. These results also indi-
cate that the RF classifier produced better results than
the SVM classifier.
While the maximum number of generations was
set to 50, which may appear conservative for large
Table 7: GA-FLA Comparison with RF for CVRP Golden
Data Set.
Instance Gen30 Ensemble
Best Average Time(secs) Best Average Time(secs)
2 8591.5 8810.6 372 8560.1 8790.3 372
3 11508.6 12090.3 406 11445.7 11847.5 406
5 7909.5 8330.7 367 7742.5 8120.4 367
6 8872.9 9005.3 405 8765.8 8944.0 405
7 11064.2 11428.1 509 10922.3 11260.2 509
9 710.7 795.2 468 685.4 750.1 468
10 799.8 880.6 490 741.6 830.5 490
11 1103.4 1198.1 578 1024.9 1130.3 578
13 1005.2 1150.7 703 939.1 970.2 703
14 1240.2 1301.4 512 1190.8 1242.8 512
15 1529.8 1608.4 950 1465.7 1550.1 950
16 1689.2 1790.5 1081 1670.2 1710.8 1081
17 1065.3 1102.1 641 920.5 973.2 641
19 1540.5 1608.7 394 1490.2 1580.3 394
Table 8: GA-FLA Comparison with SVM for CVRP
Golden Data Set.
Instance Gen30 Ensemble
Best Average Time (secs) Best Average Time (secs)
2 8760.3 8970.1 372 8630.2 8891.2 372
3 12029.5 12292.7 406 12009.4 12200.9 406
5 8220.6 8430.2 367 8040.1 8310.9 367
6 9001.4 9371.2 405 8937.7 9230.4 405
7 11640.1 11822.5 509 11368.6 11629.8 509
9 790.5 810.4 468 710.6 770.3 468
11 1105.4 1203.6 578 1073.5 1150.7 578
13 1022.1 1085.4 703 957.3 990.6 703
14 1303.8 1392.7 512 1243.3 1347 512
15 1594.3 1640.2 950 1530.6 1597.1 950
16 1684.6 1765.2 1081 1680.3 1740.8 1081
17 1104.6 1197.5 641 933.0 1003.2 641
19 1646.2 1699.5 394 1500.1 1621.6 394
Table 9: GA-FLA Comparison with RF for CVRP
Christofides Data Set.
Instance Gen30 Ensemble
Best Average Time (secs) Best Average Time (secs)
1 651.7 694.1 100 630.7 660.3 100
2 850.3 870.2 270 847.5 864.1 270
5 1420.5 1543.2 430 1399.2 1520.8 430
6 846.8 870.4 220 790.7 843.1 220
7 965.2 970.3 286 960.4 970.1 286
8 920.6 1080.4 301 911.3 1042.1 301
11 1280.8 1450.3 343 1224.4 1408.4 343
12 970.4 1022.1 323 949.2 1000.5 323
14 960.5 990.2 318 928.9 967.1 318
Table 10: GA-FLA Comparison with SVM for CVRP
Christofides Data Set.
Instance Gen30 Ensemble
Best Average Time (secs) Best Average Time (secs)
1 680.2 690.3 100 650.4 667.3 100
2 855.4 890.6 270 852.1 870.5 270
5 1550.1 1599.5 430 1432.4 1499.7 430
6 840.1 877.3 220 822.2 869.4 220
7 970.4 995.7 286 968.3 985.4 286
8 930.1 1010.2 301 915.7 999.6 301
11 1170.5 1255.8 343 1120.6 1245.2 343
12 910.9 1090.1 323 899.4 995.3 323
14 972.4 1008.3 318 962.8 980.5 318
problem instances, this choice was made to assess
the GA's ability to find better solutions within a lim-
ited computational budget. The results indicate that
the GA-FLA effectively reached better optima within
this constraint. Future work will explore running the
algorithms with higher generation limits to evaluate
whether additional improvements can be achieved.
5.2 Experiment 2: GA and GA-FLA
Performance Comparison
This section compares the performance of the GA,
i.e. the GA without transfer learning to the GA-FLA
with the best classifier, namely, the ensemble, from
the previous section. Tables 11, 12 and 13 compare
the best results obtained by both GA and GA-FLA
for the ETP, CVRP Golden and CVRP Christofides
datasets respectively. The percentage of improvement
($\Delta(\%)$), if any, is calculated using Eq. (10):
$$\Delta(\%) = \frac{best_{GA} - best_{GA\text{-}FLA}}{best_{GA}} \times 100 \qquad (10)$$
where $best_{GA}$ is the objective value of the best solution obtained by the GA approach and $best_{GA\text{-}FLA}$ is the objective value of the best solution obtained by the GA-FLA approach.
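As a worked example of Eq. (10), consider ETP instance 1 in Table 11: the GA obtains a best objective value of 6770 and the GA-FLA obtains 4792, so $\Delta(\%) = (6770 - 4792)/6770 \times 100 \approx 29.2\%$, which is the improvement reported in the table.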
The results indicate that GA-FLA outperforms
GA without an increase in computational cost for both
problems. This result was found to be significant at
the 95% level of significance.
Table 11: GA vs. GA-FLA with RF for ETP.
Instance GA GA-FLA (%)
Best Average Time(secs) Best Average Time(secs) Best
1 6770 7430 2600 4792 5120 2300 29.2
2 793 990 2600 432 490 2300 45.5
3 8769 9320 2600 7831 8210 2300 10.7
5 3413 3802 4000 2602 2890 3600 23.8
6 28330 30450 2600 25730 26640 2300 9.2
7 5535 7020 4000 3822 4200 3600 30.9
9 1092 1580 2600 965 1010 2300 11.6
10 14053 16860 4000 13222 14080 3600 5.9
Table 12: GA vs. GA-FLA with RF for CVRP Golden Data
Set.
Instance GA GA-FLA (%)
Best Average Time(secs) Best Average Time(secs) Best
2 9560.9 9822.8 975 8560.1 8790.3 372 10.5
3 12251.5 12508.6 992 11445.7 11847.5 406 6.6
5 9005.7 9339.2 696 7742.5 8120.4 367 14
6 9937.7 10154.9 861 8765.8 9154.9 405 11.8
7 12368.6 13629.8 987 10922.3 11860.2 509 11.7
9 1009.0 1115.3 883 685.4 750.1 468 32
10 1001.5 1020.8 889 741.6 830.5 490 26
11 1373.5 1459.5 1017 1024.9 1130.3 578 25.4
13 1342.7 1440.5 1197 939.1 970.2 703 30.1
14 1597.2 1626.5 924 1190.8 1242.8 512 25.4
15 1881.1 1927.4 1130 1465.7 1550.1 950 22.1
16 1761.5 1809.7 1362 1670.2 1710.8 1081 5.2
17 1533.0 1596.8 881 920.5 973.2 641 40
19 1852.2 1921.1 532 1490.2 1580.3 394 19.5
In order to understand the performance improvement achieved through transfer learning, we analyzed the progression of fitness values for selected problem instances from the benchmark sets. Figure 4 and Figure 5 show how the fitness values of the two GAs evolved over generations for the selected instances from the datasets.
Table 13: GA vs. GA-FLA with RF for CVRP Christofides
Data Set.
Instance GA GA-FLA (%)
Best Average Time(secs) Best Average Time(secs) Best
1 788.1 861.4 129 630.7 660.3 100 20
2 963.7 993.7 288 847.5 864.1 270 12
5 1997.3 2061.4 496 1399.2 1520.8 430 30
6 896.1 1008.7 247 790.7 843.1 220 11.8
7 1097.9 1165.8 304 960.4 970.1 286 12.5
8 1366.4 1475.8 325 911.3 1042.1 301 33.3
11 2124.8 2263.1 367 1224.4 1408.4 343 42.4
12 1204.7 1289.4 352 949.2 1000.5 323 21.2
14 1252.2 1321.1 332 928.9 967.1 318 25.8
Figure 4: Comparison of fitness progression of the GAs for
ETP ITC2007 dataset instance 6.
Figure 5: Comparison of fitness progression of the GAs for
Golden dataset instance 14.
From the two figures it is evident that the GA-FLA
reaches a better optimum quicker than the GA, i.e.,
the use of the knowledge transfer approach has en-
abled the GA to move to an area of the search space
with better solutions more quickly.
6 CONCLUSION
The main aim of this study was to investigate the
transfer of knowledge in GAs exploring the solu-
tion space. While the concept of knowledge trans-
fer has been effectively employed in GAs exploring
the heuristic, program and design spaces, this has
not been investigated for genetic algorithms explor-
ing the solution space for single objective discrete op-
timization. The reason for this is the challenge of
the solution space consisting of different representa-
tions for different problem instances. In this study
we turned to machine learning to overcome this chal-
lenge. A classifier was trained on data consisting of
fitness landscape measures from the source GA. This
trained classifier was then used during the execution of the target GA to eliminate solutions leading to local optima from the population. The proposed approach was evaluated for discrete optimization on a benchmark set for the ETP and two bench-
mark sets for the CVRP. For all problem instances for
both problems the GA with knowledge transfer (GA-
FLA) was found to outperform the genetic algorithm
without knowledge transfer (GA) with a reduction in
computational cost. The reason for this performance
was that the GA-FLA was also able to move to an area
with better optima quicker than the GA as it avoided
areas likely to lead to poor results. The study also re-
vealed that using an ensemble of classifiers, trained
on data from different subsets of generations in the
source GA, was the most effective. Furthermore, RFs were found to perform better than SVMs. Overall, the
results showed that the incorporation of knowledge
transfer mechanisms in a GA results in improvements
not only in the quality of solutions obtained, but also
the convergence time.
Future work will explore two main areas: (1) the
incorporation of elitism into the genetic algorithm
to investigate whether preserving the best individuals
enhances the ability to find global optima, and (2) the
use of a higher number of generations to assess if fur-
ther improvements can be achieved in both solution
quality and convergence time.
ACKNOWLEDGMENTS
This work was funded as part of the MultiChoice Re-
search Chair in Machine Learning at the University of
Pretoria, South Africa.
This work is based on the research supported in part
by the National Research Foundation of South Africa
(Grant Number 138150). Opinions expressed and
conclusions arrived at, are those of the author and are
not necessarily to be attributed to the NRF.
REFERENCES
Ahmed, Z. H., Al-Otaibi, N., Al-Tameem, A., and Sauda-
gar, A. K. J. (2023). Genetic crossover operators for
the capacitated vehicle routing problem. Computers,
Materials & Continua, 75(1).
Bouckaert, R. R., Frank, E., Hall, M., Kirkby, R., Reute-
mann, P., Seewald, A., and Scuse, D. (2016). Weka
manual for version 3-9-1. University of Waikato:
Hamilton, New Zealand, pages 1–341.
Brandt, H. (2001). Correlation analysis of fitness land-
scapes.
Breiman, L. and Cutler, R. (2001). Random forests machine
learning [j]. journal of clinical microbiology, 2:199–
228.
Cervantes, J., Garcia-Lamont, F., Rodríguez-Mazahua, L.,
and Lopez, A. (2020). A comprehensive survey on
support vector machine classification: Applications,
challenges and trends. Neurocomputing, 408:189–
215.
Cortes, C. (1995). Support-vector networks. Machine
Learning.
Daglayan, H. and Karakaya, M. (2016). The impact of
crossover and mutation operators on a ga solution for
the capacitated vehicle routing problem. Universal
Journal of Engineering Science, 4(3):39–44.
Goldberg, D. (1989). Genetic algorithms in search,
optimization and machine learning. Reading, MA: Addison-Wesley Professional.
Hassan, A. and Pillay, N. (2022). Automated design of
hybrid metaheuristics: A fitness landscape analysis.
In 2022 IEEE Congress on Evolutionary Computation
(CEC), pages 1–8. IEEE.
Huang, J., Wen, J., Chen, L., and Liu, H.-L. (2023). Trans-
fer learning based evolutionary algorithm framework
for multi-objective optimization problems. Applied
Intelligence, pages 1–20.
Jiang, M., Huang, Z., Qiu, L., Huang, W., and Yen, G. G.
(2017). Transfer learning-based dynamic multiobjec-
tive optimization algorithms. IEEE Transactions on
Evolutionary Computation, 22(4):501–514.
Jiang, M., Wang, Z., Hong, H., and Yen, G. G. (2020a).
Knee point-based imbalanced transfer learning for dy-
namic multiobjective optimization. IEEE Transac-
tions on Evolutionary Computation, 25(1):117–129.
Jiang, M., Wang, Z., Qiu, L., Guo, S., Gao, X., and Tan,
K. C. (2020b). A fast dynamic evolutionary multiob-
jective algorithm via manifold transfer learning. IEEE
Transactions on Cybernetics, 51(7):3417–3428.
Liu, Z. and Wang, H. (2021). Improved population pre-
diction strategy for dynamic multi-objective optimiza-
tion algorithms using transfer learning. In 2021 IEEE
Congress on Evolutionary Computation (CEC), pages
103–110. IEEE.
Lu, G., Li, J., and Yao, X. (2011). Fitness-probability cloud
and a measure of problem hardness for evolutionary
algorithms. In Evolutionary Computation in Com-
binatorial Optimization: 11th European Conference,
EvoCOP 2011, Torino, Italy, April 27-29, 2011. Pro-
ceedings 11, pages 108–117. Springer.
Luke, S. (1998). ECJ evolutionary com-
putation library. Available for free at
http://cs.gmu.edu/eclab/projects/ecj/.
Lysgaard, J. (1997). Clarke & wright’s savings algorithm.
Department of Management Science and Logistics,
The Aarhus School of Business, 44.
Merkuryeva, G. and Bolshakovs, V. (2011). Benchmark fit-
ness landscape analysis. International Journal of Sim-
ulation Systems, Science and Technology, 12(2):38–
45.
Nyathi, T. and Pillay, N. (2021). On the transfer learn-
ing of genetic programming classification algorithms.
In Aranha, C., Martín-Vide, C., and Vega-Rodríguez,
M. A., editors, Theory and Practice of Natural Com-
puting, pages 47–58, Cham. Springer International
Publishing.
Russell, J. and Pillay, N. (2023). A selection hyper-heuristic
for transfer learning in genetic programming. In Pro-
ceedings of the Companion Conference on Genetic
and Evolutionary Computation, GECCO ’23 Com-
panion, pages 631–634, New York, NY, USA. Asso-
ciation for Computing Machinery.
Scheepers, D. and Pillay, N. (2021). A study of transfer
learning in a generation constructive hyper-heuristic
for one dimensional bin packing. In 2021 IEEE Sym-
posium Series on Computational Intelligence (SSCI),
pages 1–7.
Schonlau, M. and Zou, R. Y. (2020). The random forest
algorithm for statistical learning. The Stata Journal,
20(1):3–29.
Sen, P. C., Hajra, M., and Ghosh, M. (2020). Super-
vised classification algorithms in machine learning:
A survey and review. In Emerging Technology in
Modelling and Graphics: Proceedings of IEM Graph
2018, pages 99–111. Springer.
Sinaga, K. P. and Yang, M.-S. (2020). Unsupervised k-
means clustering algorithm. IEEE access, 8:80716–
80727.
Stadler, P. F. (2002). Fitness landscapes. In Biologi-
cal evolution and statistical physics, pages 183–204.
Springer.
Sun, Y., Ding, S., Zhang, Z., and Jia, W. (2021). An im-
proved grid search algorithm to optimize svr for pre-
diction. Soft Computing, 25:5633–5644.
Vanneschi, L., Pirola, Y., and Collard, P. (2006). A quanti-
tative study of neutrality in gp boolean landscapes. In
Proceedings of the 8th annual conference on Genetic
and evolutionary computation, pages 895–902.
Wright, S. et al. (1932). The roles of mutation, inbreeding,
crossbreeding, and selection in evolution.
Yadav, S. L. and Sohal, A. (2017). Comparative study
of different selection techniques in genetic algorithm.
International Journal of Engineering, Science and
Mathematics, 6(3):174–180.
Zhang, T.-T., Hao, G.-S., Lim, M.-H., Gu, F., and Wang, X.
(2023). A deep hybrid transfer learning-based evolu-
tionary algorithm and its application in the optimiza-
tion of high-order problems. Soft Computing, pages
1–12.
Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H.,
Xiong, H., and He, Q. (2020). A comprehensive sur-
vey on transfer learning. Proceedings of the IEEE,
109(1):43–76.
Zou, F., Chen, D., Liu, H., Cao, S., Ji, X., and Zhang, Y.
(2022). A survey of fitness landscape analysis for op-
timization. Neurocomputing, 503:129–139.