parameters to be learned in an ELM are the
connections (weights) between the hidden layer and
the output layer, which are determined with a one-step regression-type approach using the Moore-Penrose (MP) generalized inverse matrix. Thus, "ELM is formulated as a linear-in-the-parameter model and then solved in form of linear system of equations" (Huang et al., 2015).
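For illustration, a minimal NumPy sketch of this one-step solve is given below; H and T are a hypothetical hidden-layer output matrix and target matrix, filled here with random values only to make the snippet runnable:

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((100, 20))   # hidden-layer outputs: 100 samples, 20 hidden nodes
T = rng.standard_normal((100, 1))    # corresponding targets
beta = np.linalg.pinv(H) @ T         # one-step least-squares solution of H beta = T
```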
Compared to traditional feed-forward neural network learning methods, ELM is highly efficient, fast in training, generalizes well, and reaches a global optimum with minimal human intervention. Previous studies have shown that ELM maintains its universal approximation capability with arbitrarily generated hidden layer weights if "the activation function in the hidden layer are infinitely differentiable" (Huang et al., 2006), and its learning algorithm can be used to train SLFNs with both non-differentiable and differentiable activation functions (Wang et al., 2011; Li et al., 2017).
Notwithstanding the established facts mentioned above, the speed and generalization ability of ELM depend on the randomly generated hidden layer weights and on the selected number of hidden neurons. Because the weights are drawn by chance, this sometimes leads to model-process mismatch.
Furthermore, unknown disturbances commonly exist in batch processes due to variations in raw materials, and process equipment degradation due to wear and reactor fouling is common in industrial batch processes (Zhang et al., 1999). These also lead to model-process mismatches. To address this problem, the recursive least squares (RLS) technique is adopted to correct the ELM model predictions under model-plant mismatch.
The RLS technique is an adaptive filtering algorithm for online parameter estimation: it estimates a plant model by recursively searching for the coefficients that minimize the weighted linear least squares cost function of that model. At each iteration, the parameters are updated according to the error between the measured output and the model prediction, until the desired model is realized.
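As an illustration, a minimal sketch of one such RLS step in NumPy follows; the variable names (theta, P, phi) are generic choices for this sketch, not taken from a specific reference implementation:

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive least squares step.

    theta : (n, 1) current parameter estimate
    P     : (n, n) covariance matrix
    phi   : (n,)   regressor vector
    y     : scalar measured output
    lam   : forgetting factor (1.0 = ordinary RLS)
    """
    phi = phi.reshape(-1, 1)
    k = P @ phi / (lam + float(phi.T @ P @ phi))  # gain vector
    e = y - float(phi.T @ theta)                  # prediction error
    theta = theta + k * e                         # coefficient update
    P = (P - k @ phi.T @ P) / lam                 # covariance update
    return theta, P
```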
The estimated parameters are usually time varying in many process systems, and two cases can be distinguished: the parameters may remain constant over a long period and then change abruptly, or they may change slowly with time as the process operation progresses. In either case, a tracking solution is sought. For the former case, covariance resetting handles the abrupt changes, while for the latter a forgetting factor needs to be included to track the slow variation of the parameter estimates over time (Wigren, 1993).
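Continuing the sketch above, both remedies are small modifications of the basic recursion: the forgetting factor enters rls_update through lam, and covariance resetting re-inflates P when an abrupt change is detected. The threshold and reset magnitude below are assumed tuning choices, not values from a reference:

```python
reset_threshold = 3.0                              # hypothetical detection threshold
theta, P = rls_update(theta, P, phi, y, lam=0.98)  # lam < 1 discounts old data, tracks slow drift
if abs(y - float(phi.reshape(-1, 1).T @ theta)) > reset_threshold:
    P = 1e4 * np.eye(P.shape[0])                   # covariance resetting after an abrupt change
```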
This paper proposes integrating ELM with RLS for the batch-to-batch adaptive modelling of fed-batch processes. After the completion of each batch, the ELM output layer weights are updated with data from the newly completed batch through the RLS algorithm. In this way, the ELM model can keep track of any variations in the process from batch to batch.
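The following is a minimal sketch of this batch-to-batch scheme, with assumed dimensions, an assumed forgetting factor, and synthetic data only to make it runnable; the exact formulation is developed in the sections below:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, L, lam = 3, 20, 0.98                   # input size, hidden nodes, forgetting factor (assumed)
W = rng.uniform(-1, 1, (n_in, L))            # random hidden layer weights (fixed after initialization)
b = rng.uniform(-1, 1, (1, L))               # random hidden layer biases

def hidden(X):
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))     # sigmoid hidden layer outputs

# Initial ELM fit on data from the first batch(es)
X0, y0 = rng.standard_normal((50, n_in)), rng.standard_normal((50, 1))
H0 = hidden(X0)
beta = np.linalg.pinv(H0) @ y0
P = np.linalg.inv(H0.T @ H0 + 1e-6 * np.eye(L))   # initial RLS covariance

def update_after_batch(Xb, yb, beta, P):
    """RLS update of the ELM output weights from one completed batch."""
    for h_row, y in zip(hidden(Xb), yb.ravel()):
        h = h_row.reshape(-1, 1)
        k = P @ h / (lam + float(h.T @ P @ h))
        beta = beta + k * (y - float(h.T @ beta))
        P = (P - k @ h.T @ P) / lam
    return beta, P

# e.g. once batch k has finished: beta, P = update_after_batch(Xk, yk, beta, P)
```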
The rest of this article is organized as follows:
Section 2 introduces the theories of ELM and RLS.
An isothermal fed-batch reactor case study is given in
Section 3. Section 4 presents the proposed ELM and
RLS algorithm method in modelling a fed-batch
reactor. Results and discussions are detailed in
Section 5. Finally, Section 6 gives the concluding
remarks and future works.
2 EXTREME LEARNING
MACHINE
2.1 ELM
The ELM was proposed by Huang et al. (2004) and is a type of single-hidden-layer feed-forward neural network (SLFN). Different from traditional SLFN training algorithms, the ELM randomly selects the weights and biases in the hidden layer, and the output layer weights are calculated by a regression type of method. The ELM output is given by:
$$f_l(x) = \sum_{i=1}^{L} \beta_{il}\, h_i(x) = h(x)\beta_l \qquad (1)$$

where $\beta_l = [\beta_{1l}, \ldots, \beta_{Ll}]^T$ is the output layer weight vector between the $L$ hidden layer nodes and the $l$th output node, and $h(x) = [h_1(x), \ldots, h_L(x)]$ is the hidden layer output vector with its $i$th element represented as:

$$h_i(x) = G(w_i \cdot x + b_i) \qquad (2)$$

where $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ is a vector of weights connecting the $i$th hidden node to the inputs, $b_i$ is the bias of the $i$th hidden node, $x$ is an input sample, and $\beta_{il} \in \mathbb{R}$ is the weight linking the $i$th hidden node with the $l$th output node. The hidden neuron activation function, $G$, is chosen as the sigmoid activation function in this work.
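For concreteness, a direct NumPy transcription of Eqs. (1) and (2) follows; the array shapes (x a length-n input vector, W the n-by-L matrix whose columns are the $w_i$, b the bias vector) are assumptions of this sketch:

```python
import numpy as np

def G(z):
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid activation

def h(x, W, b):
    return G(x @ W + b)               # Eq. (2): h_i(x) = G(w_i . x + b_i)

def f(x, W, b, beta):
    return h(x, W, b) @ beta          # Eq. (1): f(x) = h(x) beta
```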
Basically, there are two main stages in the ELM training process: (i) random feature mapping, where the hidden layer is randomly initialized to map the input data into a feature space by some nonlinear mapping function such as the sigmoidal function or