– Reproduction of the individuals by applying ge-
netic operators.
• Until an acceptable solution is found or a stopping
condition is met.
• Return the best individual.
In particular, the fitness function determines the
quality of the individuals (equations), that is, how
closely they follow the behavior (description) of
the system.
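As an illustration of such a fitness function, the following Python sketch scores a candidate equation by its Root Mean Square Error (RMSE) against measured data, RMSE being the error measure this paper uses for selection. The function names, the candidate equations and the data are hypothetical and are not taken from the toolbox used in this work.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between a candidate's output and the plant data."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def fitness(candidate, inputs, measured):
    """Lower is better: evaluate a candidate equation on every input sample."""
    return rmse([candidate(u) for u in inputs], measured)


# Hypothetical data generated by y = 2u + 1 (an assumption for this example).
inputs = [0.0, 1.0, 2.0, 3.0]
measured = [1.0, 3.0, 5.0, 7.0]


def exact(u):
    return 2.0 * u + 1.0


def rough(u):
    return 2.0 * u  # misses the constant term, so every sample is off by 1


print(fitness(exact, inputs, measured))  # 0.0
print(fitness(rough, inputs, measured))  # 1.0
```

The candidate with the lower value would be preferred when selecting parents.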
3 APPROACH OF SYSTEM
IDENTIFICATION BASED ON
GENETIC PROGRAMMING
Based on Section 2.2, it can be deduced that the
common approach to system identification is to assume
an initial structure and then find the values of its
constants using the curve of the output data. However,
there are cases where no previous information about
the type of system is available, and it is therefore
difficult to know its structure and order. As an
alternative, we propose the use of artificial
intelligence to determine the model of the system,
with no need for previous information about its
characteristics. The use of artificial intelligence
for system identification is not new; it has been
applied before by different research groups (Chen et
al., 1990), (Cerrada and Aguilar, 2001), (Patelli,
2011), (Madar et al., 2004), (Samy et al., ), among
others.
The decision to use genetic programming is based
on the fact that this technique does not provide a
unique response; instead, it generates a population
of solutions that can fit the model of the system.
This feature gives the researcher the possibility of
choosing a particular structure according to the level
of complexity or the precision of the model required
by the context of the application.
In this work, we propose the implementation of
two different methods for system identification.
The first one uses information about the inputs and
the outputs of the system to establish the
relationships between them in the time domain,
which can be associated with a nonlinear system.
The second method determines the relationship
between inputs and outputs through the transfer
function of the system in the frequency domain,
which is necessarily associated with a linear
system.
The genetic programming tool used is the GPTIPS
V2 toolbox for MATLAB, developed by Dominic
Searson. It carries out multigene symbolic
regression on input-output data. Each symbolic
model obtained (a member of the GP population) is a
weighted combination of functions of the system
inputs (Searson, 2015), plus a bias term. The
weights are calculated with an ordinary least squares
technique.
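The least squares step can be illustrated with a small Python sketch. It is not GPTIPS code: the two genes, the data and the function names are hypothetical, and the normal equations are solved with a plain Gaussian elimination only to keep the example self-contained.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: genes are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_gene_weights(genes, inputs, outputs):
    """Return [bias, w1, ..., wk] minimizing the squared output error."""
    # Each row is [1, g1(u), ..., gk(u)]; the leading 1 carries the bias term.
    rows = [[1.0] + [g(u) for g in genes] for u in inputs]
    k = len(genes) + 1
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, outputs)) for i in range(k)]
    return solve_linear(ata, aty)


# Hypothetical genes and data generated by y = 0.5 + 2u - 3u^2.
genes = [lambda u: u, lambda u: u * u]
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
outputs = [0.5 + 2.0 * u - 3.0 * u * u for u in inputs]

weights = fit_gene_weights(genes, inputs, outputs)
print([round(w, 6) for w in weights])  # bias first, then one weight per gene
```

Because the weights enter the model linearly, the evolutionary search only has to find the structure of each gene; the coefficients follow from the regression.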
The tool uses an elitist selection mode, in which
the best individuals of the population, that is, those
with the lowest Root Mean Square Error (RMSE), are
chosen to become the parents of new ones or to be
the final models. However, this criterion has a
potential disadvantage: the models obtained might be
too complex. In other words, the mathematical
functions within the models could be very intricate
(e.g., multiple nested functions), which makes it
difficult to give a physical interpretation of the
system and, subsequently, to design a controller. To
solve this, the depth of the trees created by the GP
and the number of genes (trees) have to be limited,
and the chosen mathematical functions must define
simple model structures.
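A minimal Python sketch of such a complexity check is given below. It assumes that each gene is stored as a nested tuple of the form (function name, child, ...), which is a representation chosen for this example and not the one used internally by the toolbox; the limits and the gene contents are also hypothetical.

```python
def tree_depth(node):
    """Depth of an expression tree; a leaf (variable or constant) has depth 1."""
    if not isinstance(node, tuple):
        return 1
    _, *children = node
    return 1 + max((tree_depth(child) for child in children), default=0)


def within_limits(individual, max_depth, max_genes):
    """True if the individual respects both the gene count and depth limits."""
    if len(individual) > max_genes:
        return False
    return all(tree_depth(gene) <= max_depth for gene in individual)


# Two shallow genes: u1 * u2 and u1 + 1.
simple = [("times", "u1", "u2"), ("plus", "u1", 1.0)]

# One deeply nested gene: sin(exp(u1 * tanh(u2))).
nested = [("sin", ("exp", ("times", "u1", ("tanh", "u2"))))]

print(within_limits(simple, max_depth=4, max_genes=3))  # True
print(within_limits(nested, max_depth=4, max_genes=3))  # False
```

Individuals that fail the check would be discarded or never generated, which keeps the final models easier to interpret.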
One of the most critical parts of the identification
is the acquisition of data from the plant, which will
be used to train and test the models. Since we want
information about both its transient response (due to
sudden changes in the input) and its steady-state
behavior, the number of samples taken must be chosen
appropriately to reflect these events. If the amount
of data obtained from the transient part is similar
to or greater than the amount of steady-state data,
then the obtained model will present a steady-state
error, which is undesirable. This happens because the
program chooses the individuals that produce the
lowest RMSE; thus, the individuals whose transient
responses are most similar to that of the real plant
are chosen (since most of the data comes from that
period). To avoid this, the chosen proportion of data
was 0.1, that is, 1 transient sample for every 10
steady-state samples. The final population of
individuals is the union of the best individuals
from 10 different runs. The program terminates when
one of the models reaches an RMSE of 0.01.
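The 1-to-10 proportion can be illustrated with the following Python sketch. How the samples were actually selected is not stated here, so the evenly spaced subsampling of the transient segment, the function name and the data sizes are all assumptions made for this example.

```python
def subsample_transient(transient, steady, ratio=0.1):
    """Keep all steady-state samples and about ratio * len(steady) transient ones."""
    target = max(1, round(ratio * len(steady)))
    if target >= len(transient):
        # Not enough transient samples to thin out; keep them all.
        return list(transient), list(steady)
    step = len(transient) / target
    # Evenly spaced picks across the transient segment (an assumed strategy).
    picked = [transient[int(i * step)] for i in range(target)]
    return picked, list(steady)


transient = list(range(50))     # 50 hypothetical transient samples
steady = list(range(100, 300))  # 200 hypothetical steady-state samples

picked, kept = subsample_transient(transient, steady)
print(len(picked), len(kept))  # 20 200
```

With 200 steady-state samples, the sketch keeps 20 transient samples, which matches the proportion of 0.1.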
3.1 Method 1: Obtaining the Equation
that Describes the System Behavior
To determine the equation that describes the
relationship between the inputs and the output of the
system, the following steps were performed:
1. Get the data from the system, including all of its
inputs and outputs.
2. Use GP to determine a set of individuals that de-
scribe accurately the relationship between the in-
Inverse Response Systems Identification using Genetic Programming