tion function of the system must be known. To learn
the behaviour of the control function, two general approaches are possible: to derive it analytically from the available explicit knowledge about the process, or to learn it by observation. In the latter case, the monitored control system is observed during an interval of valid operation to produce a learning set L consisting of n instances s, each instance representing a separate set of input signals to the MC. Each instance is thus a point in a D-dimensional space, where D is the number of signals in s.
The learning dataset L is a set of instances of (correlated) signals, and the MC's task is to determine whether a previously unseen instance s belongs to the space of known valid instances or not (a classification/clustering problem). The MC's limited processing abilities generally do not support the use of complex rules for that purpose. Instead, the monitoring function can be created off-line, and a sequence of simple comparisons, sufficient to determine whether a point lies within a faulty region or not, can then be executed by the MC during system operation.
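Such a comparison sequence can be very small. The following C sketch, for example, checks whether an instance lies within an axis-aligned region given by per-signal lower and upper bounds (the hypercubes discussed below). The type and function names, the dimension D and the float signal type are assumptions made for the illustration, not details of any particular MC implementation.

#include <stdbool.h>

#define D 4  /* number of monitored signals (illustrative value) */

/* Hypothetical representation of one region (hypercube): per-dimension
   lower and upper bounds, computed off-line. */
typedef struct {
    float lo[D];
    float hi[D];
} hypercube_t;

/* Returns true if instance s lies inside hypercube h.
   At most 2*D simple comparisons are needed. */
static bool point_in_hypercube(const float s[D], const hypercube_t *h)
{
    for (int d = 0; d < D; ++d) {
        if (s[d] < h->lo[d] || s[d] > h->hi[d])
            return false;
    }
    return true;
}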
Assuming the learning set includes all relevant signals, the simplest general clustering solution is a hypercube, a generalization of the cube to D dimensions. The smallest hypercube that includes all learning instances from L is actually the space S of all possible states of the control system. Besides L's (valid) learning instances, S contains plenty of other points, most of which describe faulty states of the control system. The learning algorithm has to group (cluster) the valid points into smaller hypercubes containing mostly valid points. The learning process can result in (1) one, (2) too many, or (3) a few hypercubes. In case (1) no optimization was performed and all instances are always proclaimed valid; this is possible only if L ∼ S. In case (2) the hypercubes are too small and fail to group related instances. This is a non-general partitioning of the space S that over-fits the learning set; such hypercubes fail to classify correctly most of the valid points not included in L, mainly because no relations between instances exist or because the collected dataset does not include all signals relevant for fault detection. In case (3) a few hypercubes partition the space S into valid and invalid regions, which is the desired outcome.
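The smallest enclosing hypercube itself (the space S, which is also the degenerate outcome (1)) can be computed off-line as the per-dimension minimum and maximum over the learning set. A minimal sketch, reusing the illustrative hypercube_t type and dimension D from the previous example; the array layout of L is likewise an assumption.

#include <float.h>

/* Computes the smallest hypercube enclosing all n learning
   instances; with a single cube this covers the whole space S. */
static hypercube_t bounding_hypercube(const float instances[][D], int n)
{
    hypercube_t h;
    for (int d = 0; d < D; ++d) {
        h.lo[d] = FLT_MAX;
        h.hi[d] = -FLT_MAX;
    }
    for (int i = 0; i < n; ++i) {
        for (int d = 0; d < D; ++d) {
            if (instances[i][d] < h.lo[d]) h.lo[d] = instances[i][d];
            if (instances[i][d] > h.hi[d]) h.hi[d] = instances[i][d];
        }
    }
    return h;
}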
In general this is a multi-objective optimization problem with conflicting goals: many small hypercubes have a low classification error, yet they fail to generalize and require more (scarce) computing resources. Depending on the hardware MC implementation, the MC can perform only a limited number of point-in-hypercube tests. Some hypercubes must therefore remain relatively large and will include noise (instances of the invalid type).
2.1 Establishing the Monitoring
Function
From the learning set L the monitoring evaluation
function E must be constructed using a combination
of optimization, unsupervised learning and classifica-
tion techniques. The final solution must be verifiable in order to be allowed for use in an embedded control system.
All machine learning techniques can extract information from the learning data. Most of them (e.g. K-means or minimal-spanning-tree clustering algorithms, support vector machines, neural networks), however, tend to produce overly complex solutions unsuitable for MC implementation. The hypercube paradigm suits the MC well, and the discovery of hypercubes can be entrusted to any evolutionary computation technique. Although several evolutionary techniques (e.g. genetic algorithms or evolution strategies (Banzhaf et al., 1998)) can be used, the focus here is on GP, as it is the only evolutionary paradigm capable of evolving complete fault-detection logic.
3 USING EC TO DISCOVER
FAULT-DETECTION RULES
The basic difference between GA and GP is that GAs
directly produce solutions according to some pre-
defined structure, while GP produces a self-structured
computer program that effectively calculates a desired
output for particular inputs. A GP program is therefore similar in concept to the MC's monitoring function.
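To make the analogy concrete, a GP individual evolved for fault detection could, in principle, translate into a small program built from comparisons and Boolean operators over the monitored signals. The C fragment below is purely illustrative; its signal indices and threshold constants are invented and do not come from any evolved solution.

#include <stdbool.h>

/* Illustrative shape of a GP-evolved monitoring function:
   a nested combination of comparisons over the monitored signals.
   Signal indices and constants are invented for the example. */
static bool evolved_monitor(const float s[4])
{
    /* e.g. (s0 < 0.8 AND s1 > 0.1) OR (s2 - s3 < 0.05) */
    return (s[0] < 0.8f && s[1] > 0.1f) || (s[2] - s[3] < 0.05f);
}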
In order to use EC to discover the monitoring cell's evaluation function, a description of the controller's behaviour is needed. Normal operation of the controller is described by the empirical learning set L. Genetic programming could be used as a symbolic regression tool to discover a symbolic expression that fits the data points in the learning set. However, the hidden symbolic expression is probably a complex structure consisting of numerous mathematical operations and functions (e.g. trigonometric functions), and is most likely too complex to be implemented in the MC.
For monitoring cells the partitioning of the space of all possible signals into two disjoint spaces S− and S+ is required. Space S+ includes all points (instances) from the learning set and thus describes the “normal” states of the controller; S− includes no such points and therefore represents a space of possible faults (even if the system produces signals belonging to S−, the system itself is not necessarily experiencing a fault).
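With the hypercube representation, membership in S+ can be decided by testing an instance against the small, fixed set of valid hypercubes stored in the MC; an instance outside all of them is assigned to S−. A minimal sketch, reusing the illustrative definitions from the earlier examples, with MAX_CUBES standing in for the hardware-dependent limit on point-in-hypercube tests:

#define MAX_CUBES 8  /* assumed hardware limit on hypercube tests */

/* Classifies an instance: +1 if it lies inside any stored valid
   hypercube (space S+), -1 otherwise (space S-, a possible fault). */
static int classify_instance(const float s[D],
                             const hypercube_t cubes[], int n_cubes)
{
    if (n_cubes > MAX_CUBES)
        n_cubes = MAX_CUBES;   /* respect the test budget */
    for (int c = 0; c < n_cubes; ++c) {
        if (point_in_hypercube(s, &cubes[c]))
            return +1;         /* consistent with the learned valid region */
    }
    return -1;                 /* outside all valid hypercubes */
}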
The example used throughout this section is based
on some unknown calculation C using a single scalar