Supply of Order-1 Building Blocks for Functions Linearly Combined of
Sinusoidal Bases with Integral Frequencies
Hongqiang Mo
1
, Zhong Li
2
and Qiliang Du
1
1
School of Automation Science and Engineering, South China University of Technology, Guangzhou, 510641, P.R. China
2
Faculty of Mathematics and Computer Science, FernUniversitaet in Hagen, Hagen, 58084, Germany
Keywords:
Building Block, Encoding, Genetic Algorithm, Schema Processing, Series Expansion.
Abstract:
In line with the theory of schema sampling, a hypothesis could be made that sufficient supply of low-
order building blocks (BBs) was one of the necessary conditions for a genetic algorithm(GA) to work.
A consequential question of this hypothesis regards, when a certain fitness function is optimized with a
commonly used GA, whether it is rare or common that there are plenty of low-order BBs. It is remarked
that, when a base-m encoded GA is applied to a fitness function that is linearly combined of sinusoidal basis
functions with integral frequencies, it is unlikely to obtain order-1 BBs with fixed positions at multiple loci,
i.e., it is rare that there are plenty of order-1 BBs. However, if a considerable part of the sinusoidal basis
functions are with frequencies exponential to a positive integer m, a base-m encoding can provide relatively
more order-1 BBs compared with the encodings with cardinalities other than m.
1 INTRODUCTION
Genetic algorithms (GAs) have been one of the most
basic forms of evolutionary algorithms since they
were proposed decades ago. Although they have
found wide applications in search, optimization,
design and machine learning, the features of
problems that determine the likelihood of successful
GA performance are not fully understood yet.
One of the typical explanations for the
mechanisms of genetic search is based on the theory
of schema processing, in which the feasible solutions
of a problem are represented with strings, and the
searching for the optimal or sub-optimal strings are
believed to be implicitly implemented by
recombining highly fit, low-order schemata
(Goldberg, 1989)(Rothlauf, 2006). A schema is a
template that identifies a subset of strings with
similarities at certain string positions, and a single
string belongs to all the schemata in which any of its
fixed positions appear. For example, the strings 1011
and 1001 are members of schemata 10** (where the
*s stand for unspecified positions), 1**1, *0**, and
so forth. The order of a schema refers to the number
of its fixed positions. For example, 10** and 1**1
are both order-2 schemata, and the order of *0** is 1.
In line with the theory of schema sampling, a schema
can be regarded as a particular region in the solution
space, and the schemata containing many unspecified
positions — the low-order schemata — will typically
be sampled by a large fraction of all the strings in a
population of a GA. And by manipulating a limited
population of strings, a GA actually samples a vastly
larger number of regions (Goldberg, 1989)(Rothlauf,
2006)(Holland, 1975). As stated by the schema
theorem, successive generations of reproduction
produce increasing numbers of trials that lie in the
regions represented by highly fit, low-order schemata
(Goldberg, 1989)(Holland, 1975). And it is assumed
that, when these highly fit, low-order schemata
recombine to form even more highly fit, higher-order
schemata, a GA rapidly focuses its attention on the
most promising parts of the solution space; in this
sense, the highly fit, low-order schemata are also
called building blocks (BBs) (Goldberg,
1989)(Goldberg, 2002).
If a GA indeed functions in this way, a hypothesis
can be made that sufficient supply of low-order BBs
is one of the necessary conditions for it to work
(Goldberg, 1989)(Rothlauf, 2006)(Holland, 1975).
And a consequential question of this hypothesis
regards, when a certain fitness function is optimized
with a commonly used GA, whether it is rare or
common that there are plenty of low-order BBs.
In this paper, we will make an attempt to study
this issue by analyzing the cases of fitness functions
Mo, H., Li, Z. and Du, Q..
Supply of Order-1 Building Blocks for Functions Linearly Combined of Sinusoidal Bases with Integral Frequencies.
In Proceedings of the 7th International Joint Conference on Computational Intelligence (IJCCI 2015) - Volume 1: ECTA, pages 133-137
ISBN: 978-989-758-157-1
Copyright
c
2015 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
133
linearly combined of sinusoidal basis functions with
integral frequencies, and by finding out whether
there are always plenty of order-1 BBs when such a
fitness function is optimized with a commonly used
GA. The choice of such fitness functions is mainly
inspired by Fourier transformation. Linear expansion
of a function into sine functions conforms to the
common practices in functional analysis. By doing
so, we hope that the future analysis of encoding
design and GA hardness can get more supports from
functional theory.
The discussions will be limited to the commonly
used encodings. In the applications of GAs to
continuous optimization problems, the feasible
solutions are usually expressed with binary
encodings. In this paper, we will extend them to
base-m encodings a more general form of
representations where m is an integer larger than
1. When a solution x, also called string or individual,
is represented with the base-m encoding of string
length l, x =
l
h=1
x
h
m
hl1
, where
x
h
{0,··· ,m 1}. The commonly used binary
representations are base-2 encodings. And for the
base-3 encoding of string length 12,
x =
12
h=1
x
h
3
h13
, and x
h
{0,1, 2} .
There have been models to estimate the
population size required to guarantee the presence of
all raw BBs in a GA (Goldberg et al., 2001)
population or a genetic-programming (Sastry et al.,
2003) population, but the models only established
necessary population size for building-block supply,
and did not tell about whether or not there were
indeed low-order BBs. The previous work of this
paper has investigated the supply of order-1 BBs for
fitness functions that were linearly combined of
sinusoidal basis functions with frequencies
exponential to a positive integer (Mo et al.,
2009)(Mo et al., 2015). In this paper, we will extend
the discussions to sinusoidal basis functions with
integral frequencies, and will focus on the existence
of order-1 BBs.
The rest of this paper is organized as follows:
Section 2 introduces an index to the existence of an
order-1 BB at a certain locus. Section 3 explains why
it is unlikely to generate order-1 BBs simultaneously
at multiple loci when a base-m encoding is used to
express the fitness functions linearly combined of
sinusoidal basis functions with arbitrary integral
frequencies, and then proposes an encoding
suggestion for the cases that the frequencies of a
considerable part of the sinusoidal basis functions are
exponential to a positive integer m. Finally, Section 4
summarizes the paper.
2 INDEX TO THE EXISTENCE OF
ORDER-1 BUILDING BLOCK
Specially, the fitness functions discussed herein are
G(x) =
n
B
i=1
a
i
sin(2πp
i
x + ϕ
i
) + c, where n
B
and p
i
are positive integers, a
i
, ϕ
i
, and c are real numbers,
a
i
0, ϕ
i
[0, 2π), and c is large enough to ensure
G(x) 0. When x is unlimited, G(x) is periodical,
and there is at least one complete cycle of G(x)
within [0, 1). Therefore, without loss of generality,
the discussions are restricted within x [0, 1). And
for the sake of convenience, we do not distinguish
between the fitness of string x
l
···x
1
, which is
denoted as G(x
l
···x
1
), and that of its decoded value,
x, i.e., G(x
l
···x
1
) = G(x).
A schema is said to match an individual if they
are identical at the fixed positions of the former. The
fitness of a schema can be defined as the average
fitness of all the individuals matched by the schema
in a certain population or in the whole search space.
Let’s take the base-3 encoding of length 2 as an
example: Under the former definition, given a
population consisting of 01, 01,22,10, the fitness of
*1 is equal to (G(01) + G(01))/2; Under the latter
definition, the fitness of schema *1 is equal to
(G(01) + G(11) + G(21))/3 regardless of population
members. With the former definition, schema fitness
is dynamic during evolution, and its value depends
not only on encoding, but also on the formation of
initial population, genetic operators, and selection
strategy. The latter, usually used to determine the
static fitness distributions of a schema (Goldberg,
1989)(Whitley et al., 2003), is especially suitable to
study the sole effect of encoding on schema fitness.
Therefore, here and throughout, the latter definition
is adopted.
In this paper, the fitness are compared among the
order-1 schemata with the same fixed positions. For
convenience, if the fixed position of an order-1
schema is at the h-th position of the string counted
from the rightmost, we call it an order-1 schema at
locus h, where h is a positive integer no larger than
string length l. For example, 1***, **0* are order-1
schemata at the 4th and 2nd loci, respectively.
When a G(x) is expressed with a base-m encoding
of string length l, where l is much larger than n
B
, the
fitness of order-1 schema ··· x
h
··· at locus h,
denoted as f
G
(x
h
), is
f
G
(x
h
) =
m
lh
1
k=0
m
h1
1
o=0
G((km+ x
h
)m
hl1
+ om
l
)
m
l1
,
(1)
where the symbols f and x
h
stand for average
ECTA 2015 - 7th International Conference on Evolutionary Computation Theory and Applications
134
fitness and position value (allele value), respectively,
and the subscript G indicates the name of the fitness
function. The maximal fitness of the order-1
schemata at locus h is denoted as max
h
( f
G
(x
h
)).
The average fitness of all the order-1 schemata at
locus h is
f
G
() =
m1
x
h
=0
f
G
(x
h
)
m
. (2)
In line with the theory of schema processing, a GA
implicitly attempts to allocate trials to different
regions of the search space based on schema fitness.
If f
G
(x
h
) > f
G
() at locus h, which means that
· ·· x
h
·· · is fitter than the average fitness of all
the order-1 schemata at locus h, the schema will have
a chance higher than average to reproduce more
samples in the subsequent generation. Consulting the
definition given in (Goldberg, 1989), we have
Definition 1 for order-1 BB.
Definition 1. Order-1 BB at locus h: An order-1
schema at locus h that fitter than the average fitness of
all the order-1 schemata at this locus.
Therefore, f
G
(x
h
) > f
G
() indicates that ···x
h
··· is an order-1 BB at locus h.
3 SUPPLY OF ORDER-1 BBS FOR
G(X) EXPRESSED WITH A
BASE-M ENCODING
To simplify expressions, let’s denote sin(2πp
i
x+ ϕ
i
)
as B
i
(x), and m
hl1
as , respectively. By definition
(1) and (2), the values of f
G
(x
h
) and f
G
() are equal to
the weighted sums of f
B
i
(x
h
) and f
B
i
(), respectively,
as shown in (3) and (4), respectively.
f
G
(x
h
) =
n
B
i=1
a
i
f
B
i
(x
h
) + c, (3)
and
f
G
() =
n
B
i=1
a
i
f
B
i
() + c. (4)
As illustrated in Fig.1, when m
l
0,
m
h1
1
o=0
B
i
((km+ x
h
)+ om
l
)
m
l
Z
(km+x
h
+1)
(km+x
h
)
B
i
(x)dx,
(5)
where h {1,· ·· ,l}. Applying sum-to-product
trigonometric formulas, one has
Z
(mk+x
h
+1)
(mk+x
h
)
B
i
(x)dx =
sin(πp
i
)sin(2πp
i
(mk+ x
h
) + πp
i
+ ϕ
i
)
πp
i
. (6)
x
x
l
mx
l
mx
2
)(xB
i
)(xB
i
)(xB
i
Figure 1: The illustration of an approximation of
R
¯x+m
l
¯x
B
i
(x)dx with B
i
( ¯x), where ¯x = (km+ x
h
) + om
l
.
When m
l
0,
R
¯x+m
l
¯x
B
i
(x)dx m
l
B
i
( ¯x).
According to (1), (5), and (6),
f
B
i
(x
h
)
m
lh
1
k=0
msin(πp
i
)sin(πp
i
(2mk+ 2x
h
+ 1) + ϕ
i
)
πp
i
.
(7)
Further denote msin(πp
i
)/(πp
i
) as A
i
and
πp
i
(2x
h
+ 1) + ϕ
i
as
¯
ϕ
i
, respectively, we have
f
B
i
(x
h
)
m
lh
1
k=0
A
i
sin(2kp
i
πm
hl
+
¯
ϕ
i
). (8)
When p
i
6= m
lh
and h < l, the amplitudes of the
m
lh
sinusoidal functions in (8), i.e.,
A
i
sin(2kp
i
πm
hl
+
¯
ϕ
i
), k {0,··· ,m
lh
1}, are
the same, and the phase difference between each
adjacent pair of the functions is 2p
i
πm
hl
.
Therefore, their sum is equal to 0, and f
B
i
(x
h
) 0.
By a straightforward derivation of (8), we have
f
B
i
(x
h
)
msin
πp
i
m
sin(
πp
i
(2x
h
+1)
m
+ϕ
i
)
πp
i
, if h = l;
msin
π
m
sin(
π(2x
h
+1)
m
+ϕ
i
)
π
, if p
i
= m
lh
;
0, if p
i
6= m
lh
and h < l.
(9)
Once again, when h = l, the amplitudes of the m
sinusoidal functions in (9) are the same, and the
phase difference between each adjacent pair of the
functions is 2p
i
π/m. As a result, their sum is equal
to 0. The same thing happens when p
i
= m
lh
. Thus,
in line with (2) and (9), we have
f
B
i
() 0, (10)
regardless of the values of h, p
i
, and m. Substituting
(10) into (4), we have
f
G
() c. (11)
Supply of Order-1 Building Blocks for Functions Linearly Combined of Sinusoidal Bases with Integral Frequencies
135
Remark 1. If p
i
is randomly assigned with a
positive integer, the probability that the value
happens to be m
lh
is 0. Therefore, according to
(9), for most of the sinusoidal basis functions
B
i
(x) = sin(2πp
i
x + ϕ
i
), f
B
i
(x
h
) 0 at locus
h < l. Substituting f
B
i
(x
h
) 0 into (3), and
according to (11), we have f
G
(x
h
) c f
G
().
Thus, the fitness of all the order-1 schemata at
loci h < l are almost the same regardless of the
allele value of x
h
, and no fitness differences
among the order-1 schemata can be achieved at
these loci. In a word, it is unlikely to obtain
order-1 BBs at loci h < l when a base-m
encoding is used to express
G(x) =
n
B
i=1
a
i
sin(2πp
i
x+ ϕ
i
) + c.
It should be noted that the above-mentioned
conclusion is independent of the value of the
encoding base. It applies to all base-m encodings,
including the most commonly used binary ones.
And for such kind of fitness functions, no choice
of encoding cardinality offers intrinsic advantage
over another on the supply of order-1 BBs.
Remark 2. According to (9), when a
considerable part of the sinusoidal basis
functions of a fitness function G(x) are with
frequencies exponential to m, expressing the
fitness function with a base-m encoding can
result in order-1 BBs simultaneously at multiple
loci. Substituting p
i
= m
lh
i
into (9), we have
f
B
i
(x
h
i
) =
msin(π/m)sin(π(2x
h
i
+ 1)/m+ ϕ
i
)/π. When x
h
i
increases from 0 to m 1, there will be at least
one locus setting, x
h
i
= x
h
i
, that satisfies
f
B
i
(x
h
i
) > 0. In line with (3) and (4),
f
G
(x
h
i
) > f
G
(). In other words, ··· x
h
i
···
is an order-1 BB at this locus. Therefore, for the
fitness functions defined as in (12), if a
i
is
significantly larger than 0, and if a base-m
encoding is used, there will be an order-1 BB at
locus h
i
, where i {1, ··· , n
1
}.
G(x) =
n
1
i=1
a
i
sin(2πm
lh
i
x+ ϕ
i
)
+
n
2
j=1
a
n
1
+ j
sin(2πp
j
x+ ϕ
n
1
+ j
) + c, (12)
where n
1
+ n
2
= n
B
.
According to the schema theory, more than
average samples would be allocated to the
regions represented by the order-1 BBs during
evolution, and the search would soon be guided
to the promising regions. Therefore, it can be
expected that, for such kind of fitness functions,
the GAs with the right encoding cardinality
outperform those with other cardinalities. Thus,
it is suggested to adopt a base-m encoding to
express the kind of fitness functions defined in
(12).
4 FURTHER DISCUSSIONS AND
CONCLUSIONS
This paper discussed the supply of order-1 BBs for
the fitness functions that were linearly combined of
sinusoidal basis functions with integral frequencies,
n
B
i=1
a
i
sin(2πp
i
x+ ϕ
i
) + c. It was remarked that, if
the positive integers p
i
s were randomly chosen, and
if a base-m encoded GA was used, it was unlikely to
obtain order-1 BBs at loci h < l, no matter what the
value of m was. Therefore, for this kind of fitness
functions, no cardinality of encoding could exhibit
advantage over other choices on the supply of
order-1 BBs. The results to some degree supported
the known facts that no representation should be
superior for all classes of problems (Fogel and
Ghozeil, 1997)(Wolpert and Macready,
1997)(Whitley, 1999).
However, things would change if one was
restricted to consider a special subclass of these
fitness functions, in which the frequencies of a
considerable part of the sinusoid basis functions were
exponential to a positive integer m. It was proved
that, for a fitness function in this subclass, a base-m
encoding could provide relatively more order-1 BBs
compared with the encodings with cardinalities other
than m.
It should be noted that the discussions in this
paper have been focused on order-1 BBs. That an
encoding can not achieve fitness differences among
order-1 schemata does not imply that it can not gain
them among higher-order ones. However, it is
necessary to employ a population of relatively large
size if one tries to gain sufficient supply of
high-order BBs. And also, it iss not a trivial matter to
dig them out. Actually, plenty of skillful memetics
algorithms (Goldberg, 2002)(Goldberg et al.,
2003)(Chen and Lim., 2009)(Krasnogor and Smith,
2005) (Chen et al., 2011) have been established to
facilitate the formation of deep BBs and discover
them.
It should also be noted that the idea that GAs
search by schema sampling has received many
different criticisms: Increasing the sampling rate of
schemata that are above average compared to other
competing schemata does not guarantee convergence
to a global optimum, since the search may be misled
to wrong directions due to deception (Goldberg,
ECTA 2015 - 7th International Conference on Evolutionary Computation Theory and Applications
136
1992)(Deb and Goldberg, 1994)(Deb et al., 1993),
hyper-plane inconsistency (Whitley et al.,
1995)(Whitley et al., 2003), synchronization
(Hoyweghen et al., 2001), sampling errors
(Goldberg, 1989)(Forrest and Mitchell, 1993), etc. It
is accepted that the notion of using schema
information to guide search at best be viewed as a
heuristic (Whitley, 2001).
We will expand the analyses to higher-order BBs,
and seek for more explanations for genetic behaviors
in future study. Meanwhile, we will also explore the
possibility of extending the analysis to Gray-coded
and real-coded GAs in the future.
ACKNOWLEDGEMENTS
This work was supported by “National Natural
Science Foundation of China, 61105062 and
61305038”, “Fundamental Research Funds for the
Central Universities, SCUT, 2012ZZ0106 and
2014ZZ0045”.
REFERENCES
Chen, X., Ong, Y.-S., Lim, M.-H., and Tan, K. C. (2011).
A multi-facet survey on memetic computation.
IEEE Transactions on Evolutionary Computation,
15(5):591–607.
Chen, Y. and Lim., M. (2009). Linkage in Evolutionary
Computation (Studies in Computational Intelligence).
Springer Verlag.
Deb, K. and Goldberg, D. E. (1994). Sufficient conditions
for deceptive and easy binary functions. Annals of
Mathematics and Artificial Intelligence, 10(4):385–
408.
Deb, K., Horn, J., and Goldberg, D. (1993). Multimodal
deceptive functions. Complex Systems, 7(2):131–153.
Fogel, D. B. and Ghozeil, A. (1997). A note on represen-
tations and variation operators. IEEE Transactions on
Evolutionary Computation, 1(2):159–161.
Forrest, S. and Mitchell, M. (1993). What makes a problem
hard for a genetic algorithm? some anomalous results
and their explanation. Machine Learning, 13(2-
3):285–319.
Goldberg, D. E. (1989). Genetic Algorithms in Search,
Optimization and Machine Learning. Addison-
Wesley.
Goldberg, D. E. (1992). Construction of high-order de-
ceptive functions using low-order walsh coefficients.
Annals of Mathematics and Artificial Intelligence,
5(1):35–48.
Goldberg, D. E. (2002). Design of Innovation: Lessons
from and for Competent Genetic Algorithms. Kluwer
Academic Publishers.
Goldberg, D. E., Sastry, K., and Latoza, T. (2001). On the
supply of building blocks. In Spector, L., editor, Proc.
of the 2001 Genetic and Evolutionary Computation
Conference, pages 336–342. San Francisco, Calif.,
Kaufmann.
Goldberg, D. E., Sastry, K., and Ohsawa, Y. (2003). Dis-
covering deep building blocks for competent genetic
algorithms using chance discovery via keygraphs.
In Ohsawa, Y. and Mcburney, P., editors, Chance
Discovery, pages 276–301. Springer-Verlag.
Holland, J. H. (1975). Adaptation in natural and artificial
systems: An introductory analysis with applications
to biology, control, and artificial intelligence. U
Michigan Press.
Hoyweghen, C. V., Goldberg, D. E., and Naudts, B.
(2001). Building block superiority, multimodality
and synchronization problems. In Proceedings of the
Genetic and Evolutionary Computation Conference
(GECCO-2001), pages 694–701. Morgan Kaufmann.
Krasnogor, N. and Smith, J. (2005). A tutorial for
competent memetic algorithms: Model, taxonomy,
and design issues. IEEE Transactions on Evolutionary
Computation, 9(5):474–488.
Mo, H. Q., Li, Z., Park, J. B., and Joo, Y. H. (2009). On the
supply of superior order-1 building blocks for a class
of periodical fitness functions. International Journal
of Computational Intelligence Systems, 2(1):91–98.
Mo, H. Q., Li, Z., Tian, L. F., and Tian, X. (2015).
Selection of encoding cardinality for a class of
fitness functions to obtain order-1 building blocks.
International Journal of Computational Intelligence
Systems, 8(1):62–74.
Rothlauf, F. (2006). Representations for genetic and
evolutionary algorithms. Springer.
Sastry, K., O’Reilly, U.-M., Goldberg, D. E., and Hill,
D. (2003). Building block supply in genetic
programming. In Riolo, R. L. and Worzel, B., editors,
Genetic Programming Theory and Practice, chapter 9,
pages 137–154. Kluwer.
Whitley, D. (1999). A free lunch proof for gray versus
binary encodings. In Proceedings of the Genetic
and Evolutionary Computation Conference, volume 1,
pages 726–733. Citeseer.
Whitley, D. (2001). An overview of evolutionary
algorithms: Practical issues and common pitfalls.
Information and software technology, 43(14):817–
831.
Whitley, D., Heckendorn, R. B., and Stevens, S. (2003).
Hyperplane ranking, nonlinearity and the simple
genetic algorithm. Information Sciences, 156(3):123–
145.
Whitley, D., Mathias, K. E., and Pyeatt, L. D. (1995).
Hyperplane ranking in simple genetic algorithms.
In Eshelman, L., editor, Proceedings of the 6th
International Conference on Genetic Algorithms,
pages 231–238.
Wolpert, D. H. and Macready, W. G. (1997). No free
lunch theorems for optimization. IEEE Transactions
on Evolutionary Computation, 1(1):67–82.
Supply of Order-1 Building Blocks for Functions Linearly Combined of Sinusoidal Bases with Integral Frequencies
137