Partial Sampling Operator and Tree-structural Distance
for Multi-objective Genetic Programming
Makoto Ohki
Field of Technology, Tottori University, 4, 101 Koyama-Minami, Tottori, 680-8552, Japan
Keywords:
Genetic Programming, Multi-objective Optimization, Partial Sampling, Tree Structural Distance, NSGA-II.
Abstract:
This paper describes a technique for optimizing tree-structure data, i.e., genetic programming (GP), by
means of a multi-objective optimization technique. NSGA-II is applied as the framework of the multi-objective
optimization. One of the major problems of GP is bloat of the tree structure. Bloat arises because the
tree structure obtained by the crossover operator grows bigger and bigger while its evaluation does not improve.
To avoid the risk of bloat, a partial sampling (PS) operator is proposed in place of the crossover operator.
By repeating the processes of proliferation and metastasis in the PS operator, a new tree structure is generated
as a new individual. Moreover, the size of the tree and a tree-structural distance (TSD) are additionally
introduced as objective functions for evaluating the tree-structure data. The optimization of the
tree-structure data is thus defined as a three-objective optimization problem. TSD is also applied to the selection
of parent individuals in place of the crowding distance of the conventional NSGA-II. The effectiveness of the
proposed techniques is verified by applying them to the double spiral problem.
1 INTRODUCTION
Genetic programming (GP) (Koza, 1992; Koza, 1994)
is an algorithm for optimizing structured data based
on a genetic algorithm (Goldberg, 1989; Mitchell et
al., 1996). GP has been applied to various fields such
as program synthesis (David and Kroening, 2017),
function generation (Jamali et al., 2017) and rule set
discovery (Ohmoto et al., 2013). Although GP is very
effective for optimizing structured data, it has several
problems, such as bloat, inadequate optimization of
constant nodes, and a tendency to be trapped in locally
optimal regions when applied to complicated problems.
This paper focuses on the optimization of tree-structure
data by means of GP, where the main cause of bloat is
the crossover operator, which exchanges partial trees
of parent individuals (Nordin et al., 1995; Angeline,
1997; Angeline, 1998; De Jong et al., 2001). Several
techniques to reduce bloat have been proposed by
improving the simple crossover operation (Koza, 1994;
De Bonet et al., 1997; Mühlenbein and Paass, 1996;
Ito et al., 1998; Langdon, 1999; Francone et al.,
1999). Although these methods have successfully
inhibited bloat to a certain extent, they have not
necessarily achieved an effective search. Moreover,
there is no theoretical basis for the claim that crossover
is effective for optimizing tree-structure data.
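For readers unfamiliar with the operator under discussion, the following is a minimal sketch of a generic subtree crossover, not the implementation used in any of the cited works. The nested-tuple tree encoding and the helper names (`size`, `paths`, `get`, `replace`, `subtree_crossover`) are assumptions made for this illustration only.

```python
import random

# Assumed encoding: a tree is a nested tuple (label, child, child, ...);
# a leaf is a one-element tuple (label,).

def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(c) for c in tree[1:])

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node."""
    yield prefix
    for i, child in enumerate(tree[1:], start=1):
        yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def subtree_crossover(a, b, rng):
    """Swap one randomly chosen subtree of a with one of b."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

# Example parents: x + (x * 1) and sin(x) - 2.
a = ("+", ("x",), ("*", ("x",), ("1",)))
b = ("-", ("sin", ("x",)), ("2",))
child1, child2 = subtree_crossover(a, b, random.Random(0))
```

In this sketch the exchange conserves the total node count of the pair, so one child grows by exactly as many nodes as the other shrinks. The sketch shows only the exchange mechanism and does not by itself reproduce bloat, which depends on how such offspring fare under selection.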
Apart from bloat reduction, a search method for
optimizing graph structures has been proposed
(Karger, 1995). Although this method is suitable
for searching various structural data consisting
of nodes and branches, the algorithm is complicated
and the computation cost is high.
In this paper, we exclude the crossover operator,
which is the cause of bloat in GP, and propose a
partial sampling (PS) operator as a new operator for
GP in its place. In the PS operator, a partial tree
structure is first sampled from several individuals of
the parent population by proliferation. Next, the
partial tree structure obtained by proliferation is
combined with a new tree structure by metastasis.
Two types of metastasis are prepared for GP: one that
depends on the original upper node and one that does
not. Repeating proliferation and metastasis generates
new tree-structure data for the next generation.
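The proliferation and metastasis loop can be sketched as follows. This is one possible reading of the description above under stated assumptions, not the author's implementation. The nested-tuple encoding, the `HOLE` placeholder, the `depth` and `max_steps` parameters, the leaf-based closing step, and the interpretation of "depends on the original upper node" as matching the label of the parent node are all hypothetical choices made for illustration.

```python
import random

HOLE = None  # assumed marker for a position still to be filled in the new tree

def nodes(tree, parent_label=None):
    """Yield (subtree, label of its upper node) for every node of tree."""
    yield tree, parent_label
    for child in tree[1:]:
        yield from nodes(child, tree[0])

def truncate(tree, depth):
    """Copy tree down to `depth` levels; deeper subtrees become HOLE."""
    if depth == 0:
        return HOLE
    return (tree[0],) + tuple(truncate(c, depth - 1) for c in tree[1:])

def proliferate(parents, rng, depth, upper=None):
    """Sample a partial tree from the parent group (assumed interpretation).
    If `upper` is given, prefer nodes whose original upper node has that label."""
    pool = [t for p in parents for t, up in nodes(p)
            if upper is None or up == upper]
    if not pool:  # no match: fall back to an upper-independent sample
        pool = [t for p in parents for t, _ in nodes(p)]
    return truncate(rng.choice(pool), depth)

def metastasize(tree, make_fragment, upper=None):
    """Fill the leftmost HOLE with a fragment; return (tree, was_filled)."""
    if tree is HOLE:
        return make_fragment(upper), True
    children = list(tree[1:])
    for i, c in enumerate(children):
        new_c, done = metastasize(c, make_fragment, tree[0])
        if done:
            children[i] = new_c
            return (tree[0],) + tuple(children), True
    return tree, False

def ps_operator(parents, rng, depth=2, max_steps=20, dependent=True):
    """Build one new tree by repeated proliferation and metastasis."""
    def fragment(up):
        return proliferate(parents, rng, depth, up if dependent else None)
    child = proliferate(parents, rng, depth)
    for _ in range(max_steps):
        child, filled = metastasize(child, fragment)
        if not filled:  # no HOLE left: the tree is complete
            return child
    # Step budget exhausted: close remaining holes with sampled leaves
    # (a termination rule assumed for this sketch only).
    leaves = [t for p in parents for t, _ in nodes(p) if len(t) == 1]
    filled = True
    while filled:
        child, filled = metastasize(child, lambda up: rng.choice(leaves))
    return child

parents = [("+", ("x",), ("*", ("x",), ("1",))),
           ("-", ("sin", ("x",)), ("2",))]
new_tree = ps_operator(parents, random.Random(0))
```

Setting `dependent=False` switches the sketch to the variant in which the sampled fragment ignores the upper node. Every node of the resulting tree is copied from some parent node, so labels and arities always come from the parent group.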
In addition, a multi-objective evolutionary algorithm
(MOEA) technique is applied to GP to suppress bloat
and to acquire a wide variety of tree-structure data,
by adding the size and the distance of the
tree-structure data to the evaluation. One of the
newly added objective functions is the size of the
tree-structure data. Furthermore, the relative position
of the target individual in the population in terms of
the tree-structural distance (TSD) is also evaluated
as an objective function. The tree-
Ohki, M.
Partial Sampling Operator and Tree-structural Distance for Multi-objective Genetic Programming.
DOI: 10.5220/0006894401100117
In Proceedings of the 10th International Joint Conference on Computational Intelligence (IJCCI 2018), pages 110-117
ISBN: 978-989-758-327-8
Copyright © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved