event, we summed up the vectors of the parents to
obtain a new vector that can, in some sense, be
interpreted as the “fingerprint” of the child. In this way,
at the end of the evolutionary process, we are able to
quantify the contribution of each BL and of GP itself
to the construction of the final solution. Theoretically,
an individual that has a value equal to the number of
crossover events in the position of the vector
corresponding to GP, and a value equal to 0 in all other
positions is a model that has been generated purely
by GP, without any contribution from any of the BLs.
On the other hand, the values in the positions corre-
sponding to the BLs quantify the contribution of the
various BLs in the formation of that individual. The
results of this analysis are reported in Figures 2 and 3,
which show the values of the vector associated with the
best individual at the end of the evolution, normalized
to the [0, 1] range. Figure 2 reports the results for UGP,
while Figure 3 shows the analogous results for UGP-
SEL.
The first important observation is that none of
the best-evolved models (neither the ones evolved by
UGP nor the ones evolved by UGP-SEL) is a purely
GP-evolved individual: all of them receive an important
contribution from the BLs. Furthermore, in only one
case is the final model formed by a 100% contribution
from a single BL. This is the case of UGP-SEL
on the Energy dataset, reported in Figure 3(a), where
the final model was formed by a 100% contribution of
SVR. In all other cases, the final model is formed by
a blend of different BLs and GP. Before continuing
with the analysis of Figures 2 and 3, it is important
to point out that if an individual is formed by a
100% contribution of one BL, this does not necessarily
mean that the final model is the one generated by
that BL and used to initialize the GP population.
Indeed, in GP it is possible to have a crossover between
one individual and itself, and when GSOs are used
(as in this work), the offspring is an individual that is
very different from the parent. In fact, what happened
with UGP-SEL on the Energy dataset (Figure 3(a))
is that the final model was obtained through various
crossovers between individuals that, although different
from each other, are all descendants of
the model generated by SVR. This is confirmed by the
fact that the curve of the evolution of UGP-SEL for
this problem is not constant (see Figure 1(a)): the
error decreases steadily over the course of the evolution.
Thus, the final model is not the one generated by SVR
and used in the initialization, but an improvement of
it.
Analyzing Figures 2 and 3 in more detail, one
can observe that on the Energy dataset the final model
was influenced only by SVR in the case of UGP-SEL
(Figure 3(a)), while the best solution is obtained by
a blend of LR and SVR in the case of UGP (Fig-
ure 2(a)). In the latter case, the model consists of a
contribution of 98% given by LR and the remaining
2% by SVR. Looking back at Figure 1, it is interesting
to notice that while UGP and UGP-SEL performed
in a comparable manner, the best solutions they
produced were formed by combining different BLs.
This observation further corroborates the importance
of semantics: although the solutions have completely
different structures, because they are formed by
recombining models generated by different BLs,
they behave similarly, which is what matters for
assessing the models’ performance. Considering
the Concrete dataset, it is possible to see that the
final model of UGP (Figure 2(b)) was generated using
a contribution of approximately 40% of GP, 35% of
LR, 20% of MLP, and 5% of SVR. On the same
problem, UGP-SEL (Figure 2(c)) behaves similarly but,
compared with UGP, GP and LR contribute more
to the construction of the best model, while the
contributions of MLP and SVR decrease.
Finally, on the Parkinson benchmark, we
have a similar situation where both UGP and UGP-
SEL built a final model with a contribution of approx-
imately 55% of SVR, 30% of GP, 10% of LR, and 5%
of MLP (Figures 2(e) and 2(f)).
To conclude the experimental study, we focused
on UGP-SEL and analyzed the number of times
the model generated by each BL was selected for
insertion into the initial population by the selection
process described in Section 3.1. This analysis is
reported in Figure 4.
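
A tally of this kind can be sketched as follows. The snippet is a minimal illustration, assuming that each run is summarized by the set of BLs whose models were inserted in the initial population; the function name `selection_frequencies` and the five toy runs are hypothetical and do not reproduce the paper's data or its selection procedure.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable


def selection_frequencies(runs: Iterable[Iterable[str]]) -> Dict[FrozenSet[str], float]:
    """Fraction of runs in which each exact subset of BLs was selected."""
    # frozenset makes the subsets hashable and ignores the order of the labels.
    counts = Counter(frozenset(selected) for selected in runs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {subset: n / total for subset, n in counts.items()}


# Toy data only (NOT the paper's results): BLs selected in five runs.
toy_runs = [
    ["LR", "SVR"],
    ["SVR", "LR"],
    ["SVR"],
    ["LR", "SVR"],
    ["SVR"],
]

freqs = selection_frequencies(toy_runs)
for subset, share in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(sorted(subset), f"{share:.0%}")
```

Counting exact subsets rather than individual BLs keeps the two situations apart: runs where several BL models enter the population together, and runs where only one of them does.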
Recall that each of the independent runs uses a
different training/test partition of the data, so the
initial models generated by the BLs also differ from
run to run. With this in mind, we can observe
that for the Energy problem (Figure 4(a)), in
more than 60% of the runs the models generated
by both LR and SVR were inserted in the initial population,
while in the remaining runs only the model generated
by SVR was selected and inserted in the initial popu-
lation. On the Concrete dataset (Figure 4(b)), in more
than 40% of the runs all the studied BL models (SVR,
LR and MLP) were selected to be part of the initial
population, and in more than 30% of the runs the
models that were selected and inserted in the initial
population were the ones generated by LR and MLP.
Looking back at the results of Figure 3, we can see
that the evolutionary process is able to combine the
solutions produced by the aforementioned BLs,
improving them; but, despite that, the final model was
ECTA 2019 - 11th International Conference on Evolutionary Computation Theory and Applications