problem (like for instance classification or regression),
thus favoring GP evolvability. Moreover, thanks to
an efficient implementation of GSGP that was defined
in 2013 (Vanneschi et al., 2013; Castelli et al.,
2015a), it was possible to successfully apply GSGP
to several complex real-life applications (see
for instance (Castelli et al., 2013a, 2014, 2015b)).
However, GSGP has at least the following recog-
nized drawbacks: (1) GSOs generate individuals that
are larger than their parents, and this causes a rapid
growth in the size of the individuals in the popula-
tion; (2) GSXO was shown to be quite ineffective on
a large set of applications.
The former problem is widely discussed in the literature.
The implementation proposed in (Vanneschi et al.,
2013; Castelli et al., 2015a) is a workaround to this
problem, in the sense that, although not limiting the
code growth, it makes the system not only usable in
practice, but even more efficient than standard GP.
This paper focuses on the latter drawback, already
pointed out in the literature for instance in (Moraglio
and Mambrini, 2013), where a purely mutation-based
GSGP was proposed, after recognizing the uselessness
of GSXO. We believe that one of the reasons
for the poor performance of GSXO lies in its geometric
property. In fact, as we said above, GSXO generates
an offspring whose semantics lies on the segment
joining the semantics of the parents. In this perspective,
if we imagine a GP population as a cloud
of points in the semantic space, we could informally
say that crossover is only able to generate points that
are “inside” the cloud. So, if the target (that is also a
known point in the semantic space) is not contained
inside the cloud, GSXO will never be able to generate
it. Also, if the target is quite far from the cloud,
GSXO will not even be able to reasonably approximate
it.
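The geometric argument above can be illustrated with a small sketch. It assumes, for illustration only, that semantics are plain vectors of reals and that GSXO combines its two parents with a single random coefficient in [0, 1]; the names `gsxo_semantics`, `bounding_box` and `inside_box` are hypothetical. Since the axis-aligned bounding box of a set of points contains its convex hull, the box gives a coarse but cheap check: a crossover-only population never leaves the box of the initial population, so a target outside that box cannot be generated.

```python
import random

def gsxo_semantics(s1, s2, rng):
    """Semantics of a crossover offspring: a point on the segment joining s1 and s2."""
    r = rng.random()  # assumption: one random coefficient in [0, 1] for all coordinates
    return [r * a + (1.0 - r) * b for a, b in zip(s1, s2)]

def bounding_box(points):
    """Per-coordinate minimum and maximum of a set of points."""
    lows = [min(coords) for coords in zip(*points)]
    highs = [max(coords) for coords in zip(*points)]
    return lows, highs

def inside_box(p, lows, highs, tol=1e-12):
    """True if p lies in the box, up to a small floating-point tolerance."""
    return all(lo - tol <= x <= hi + tol for x, lo, hi in zip(p, lows, highs))

rng = random.Random(0)
# Toy population: 20 semantic vectors over 3 fitness cases.
population = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(20)]
lows, highs = bounding_box(population)

# 50 generations of crossover only, with uniform random parent selection.
for _ in range(50):
    population = [
        gsxo_semantics(rng.choice(population), rng.choice(population), rng)
        for _ in range(20)
    ]

all_inside = all(inside_box(p, lows, highs) for p in population)
target = [2.0, 2.0, 2.0]  # a target far from the initial cloud
target_reachable = inside_box(target, lows, highs)
```

In this sketch `all_inside` stays true however many crossover-only generations are run, while `target_reachable` is false from the start.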
The main objective of this paper is to confirm this
hypothesis by means of a set of experiments. To
achieve this objective, we need a formal tool that
allows us to “capture” our idea of a cloud of individuals
in the semantic space. More specifically, it would
be useful to have a formal method to indicate what we
could informally call the “border” of a cloud. In this
way, we could use this tool both for understanding if a
given point is “inside” or “outside” the cloud and for
calculating the distance from one point to the cloud.
The contributions of this paper are: (1) Introduction of
the concept of convex hull, as a tool to represent the
“border” of a set of points in the semantic space. (2)
Introduction of a method to understand if a point is
contained in the convex hull or not. (3) Introduction
of a method to calculate the distance from a point to
the convex hull.
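One possible way to realize items (2) and (3) numerically is sketched below. It is not necessarily the method developed in this paper: it uses the generic Frank-Wolfe iteration to approximate the hull point closest to a target, and it treats a point as contained in the hull when the resulting distance falls below a tolerance. The function names, the tolerances and the toy data are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def closest_hull_point(points, target, tol=1e-12, max_iter=10000):
    """Approximate the point of conv(points) closest to target (Frank-Wolfe)."""
    x = list(points[0])  # any input point belongs to the hull
    for _ in range(max_iter):
        # Gradient of 0.5 * ||x - target||^2 at the current iterate.
        grad = [xi - ti for xi, ti in zip(x, target)]
        # Input point that minimizes the linearized objective.
        s = min(points, key=lambda p: dot(grad, p))
        direction = [si - xi for si, xi in zip(s, x)]
        gap = -dot(grad, direction)  # Frank-Wolfe gap, always >= 0
        if gap <= tol:
            break
        # Exact line search along the segment from x to s, clipped to [0, 1].
        step = min(1.0, gap / dot(direction, direction))
        x = [xi + step * di for xi, di in zip(x, direction)]
    return x

def distance_to_hull(points, target):
    """Euclidean distance from target to the (approximate) closest hull point."""
    x = closest_hull_point(points, target)
    return math.sqrt(sum((xi - ti) ** 2 for xi, ti in zip(x, target)))

def in_hull(points, target, eps=1e-6):
    """Membership test: the target is in the hull if its distance is about zero."""
    return distance_to_hull(points, target) <= eps

# Toy example: the four corners of the unit square.
square = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
d_inside = distance_to_hull(square, [0.5, 0.5])
d_outside = distance_to_hull(square, [2.0, 0.5])
```

Every iterate is a convex combination of the input points, so the returned distance never underestimates the true one. Frank-Wolfe can converge slowly when the closest point lies on a face of the hull, so `eps` and `max_iter` trade accuracy for running time.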
The first contribution has already been considered
in (Moraglio, 2011), where the author showed that all the
evolutionary algorithms using geometric crossover
with no mutation perform the same form of convex
search regardless of the underlying representation, the
specific selection mechanism, the specific offspring
distribution, the specific search space, and the prob-
lem at hand.
With the contributions provided in our study, we
are able to monitor the convex hull of the points rep-
resenting the semantics of all the individuals in the
population during the GP evolution. In particular, we
are able to study the evolution of the distance from the
target to the convex hull during a GP run.
In this paper, we compare two GSGP systems: the
first one uses both GSXO and GSM; the second one
uses only GSXO. The different behaviour of the latter,
compared to the former, should allow us to shed
light on the limitations of GSXO. As test cases for
this experimental study, we have decided to use four
real-life symbolic regression problems from the UCI
repository (Lichman, 2013).
2 GEOMETRIC SEMANTIC
OPERATORS
GSOs are becoming more and more popular in the
GP community (Vanneschi et al., 2014a), probably
because of their property of inducing a unimodal fitness
landscape on any problem consisting of matching
sets of input data to known targets (like for instance
supervised learning problems, such as regression and
classification). To have an intuition of this property
(whose proof can be found in (Moraglio et al., 2012)),
let us first consider a Genetic Algorithms (GAs) prob-
lem in which the unique global optimum is known and
the fitness of each individual (to be minimized) cor-
responds to its distance to the global optimum (our
reasoning holds for any employed distance). In this
problem, if we use, for instance, ball mutation (Kraw-
iec and Lichocki, 2009) (i.e. a variation operator that
slightly perturbs some of the coordinates of a solu-
tion), then any possible individual different from the
global optimum has at least one fitter neighbor (indi-
vidual resulting from its mutation). So, there are no
local optima. In other words, the fitness landscape is
unimodal, and consequently the problem is characterized
by good evolvability. Similar considerations
hold for many types of crossover, including various
kinds of geometric crossover (Krawiec and Lichocki,
2009).
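The absence of local optima in this setting can be checked with a short sketch. It assumes Euclidean distance to the optimum as fitness and a simplified, deterministic stand-in for ball mutation that changes a single coordinate by at most `radius`; `fitness` and `fitter_neighbor` are hypothetical names introduced here.

```python
import math
import random

def fitness(x, optimum):
    """Fitness to be minimized: Euclidean distance to the known global optimum."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, optimum)))

def fitter_neighbor(x, optimum, radius=0.1):
    """Return a neighbor of x that differs in one coordinate by at most radius.

    The coordinate farthest from the optimum is moved towards it without
    overshooting, so the neighbor is strictly closer whenever x != optimum.
    """
    i = max(range(len(x)), key=lambda k: abs(x[k] - optimum[k]))
    delta = optimum[i] - x[i]
    y = list(x)
    y[i] += max(-radius, min(radius, delta))
    return y

rng = random.Random(1)
optimum = [rng.uniform(-5.0, 5.0) for _ in range(4)]
samples = [[rng.uniform(-5.0, 5.0) for _ in range(4)] for _ in range(100)]

# Every sampled individual has at least one strictly fitter neighbor.
always_improves = all(
    fitness(fitter_neighbor(x, optimum), optimum) < fitness(x, optimum)
    for x in samples
)
```

Because such a neighbor exists for every individual other than the optimum, a hill climber using this kind of mutation cannot get stuck, which is the informal meaning of a unimodal landscape.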
Now, let us consider the typical GP problem of
finding a function that maps sets of input data into
ECTA 2016 - 8th International Conference on Evolutionary Computation Theory and Applications