A STATE-SPACE NEURAL NETWORK
FOR MODELING DYNAMICAL NONLINEAR SYSTEMS
Karima Amoura¹, Patrice Wira² and Said Djennoune¹

¹ Laboratoire CCSP, Université Mouloud Mammeri, Tizi Ouzou, Algeria
² Laboratoire MIPS, Université de Haute Alsace, 4 Rue des Frères Lumière, 68093 Mulhouse, France
Keywords:
Artificial neural networks, Recurrent network, State space, State estimation, System identification, System
dynamics.
Abstract:
In this paper, a specific neural-based model for the identification of dynamical nonlinear systems is proposed. This
artificial neural network, called the State-Space Neural Network (SSNN), differs from other existing neural
networks: it uses a state-space representation while being able to adapt and learn its parameters. These
parameters are the neural weights, which remain intelligible. After learning, the SSNN is therefore
able to provide a state-space model of the dynamical nonlinear system. Examples are presented which show
the capability of the SSNN for the identification of multivariate dynamical nonlinear systems.
1 INTRODUCTION
The state-space representation is a very powerful tool
for modeling systems (Gauthier and Kupka, 2001). It
allows the modeling of linear and nonlinear dynami-
cal systems while keeping a temporal representation
of the events. It contains useful information directly
related to physical systems and thus offers very good
possibilities for analyzing systems and plants.
An Artificial Neural Network (ANN) is an assem-
bly of connected neurons where each neuron com-
putes its output as a nonlinear weighted sum of its
inputs. If the parameters of this type of architecture,
i.e., the weights of the neurons, are appropri-
ately tuned, then the whole architecture is able to esti-
mate the relationship between input and output spaces
or to mimic the behavior of a plant without consider-
ing any model (Haykin, 1994), (Principe et al., 2000).
Learning is one of the most interesting properties of
the ANNs in the sense that calculating and adjusting
the weights is achieved without modeling the plant
and without any knowledge about it, but only from
examples. Examples are sampled signals measured
from the plant and representative of its behavior. In
this way, ANNs are considered as relevant modeling
and approximation tools.
A Multi-Layer Perceptron (MLP) is a neural ar-
chitecture where neurons are organized in layers. Besides,
a Recurrent Neural Network (RNN) can be considered
as an MLP enhanced by feedback connections. RNNs are
considered a cornerstone of learning theory because of
their ability to reproduce dynamical behaviors by means
of feedback connections and delays in the propagation of
their signals (Elman, 1990). After learning, an RNN with
a sufficient number of neurons is able to estimate any
relationship and therefore to reproduce the behavior of
any multivariate nonlinear dynamical system (Werbos,
1974). Therefore, RNNs have received considerable
attention from the modern control community, to such
an extent that they have been formalized in Model
Reference Adaptive Control (MRAC) schemes (Narendra
and Parthasarathy, 1990), (Chen and Khalil, 1992).
Their model stability remains one of the most critical aspects.
The deterministic state-space representation and
the learning RNN can both be employed for modeling
dynamical systems. However, they are characterized
by very different ways of storing information. If the
first approach directly relies on physical parameters
of the system, the second approach uses the weights
of the neurons. These weights are inherent to a neural
architecture and can generally not be interpreted.
Combining these two approaches would combine the
advantages of both. This is the case of the
State-Space Neural Network (SSNN), a very specific
RNN based on a state-space representation (Zamar-
reno and Vega, 1998).
In this paper, the SSNN is proposed for the identi-
fication of multivariate nonlinear dynamical systems.
The architecture of the SSNN differs from conven-
tional neural architectures in that it complies with a
state-space representation. Indeed, the SSNN is devoted
to approximating the nonlinear functions between
the input, state, and output spaces. It is therefore a
state-space formalism enhanced by learning capabilities
for adjusting its parameters. Its state representation
is accessible and can moreover be used for the
design of adaptive control schemes. Previous RNN-
design of adaptive control schemes. Previous RNN-
based approaches (Kim et al., 1997), (Mirikitani and
Nikolaev, 2010) can be good candidates for yielding
adaptive observers. However, they are restricted to
some classes of nonlinear systems. On the other hand,
the SSNN is able to describe virtually any nonlinear
system dynamics with a state-space representation. It
is therefore of considerable interest for identification
and control purposes.
The paper is organized as follows. In Section 2,
the SSNN equations are developed. Two simulation
examples are provided in Section 3 to illustrate and
to compare the performance of the SSNN used for the
system identification of nonlinear plants. Some con-
cluding remarks are provided at the end of the paper.
2 THE STATE-SPACE NEURAL
NETWORK (SSNN)
2.1 Architecture
The general formulation of a discrete-time process
governed by a nonlinear difference equation can be
written as
x(k+1) = F(x(k), u(k))
y(k) = G(x(k)) + v(k).                                   (1)
The evolution of the process is represented by its
internal state x ∈ R^s. The process takes the control
signals u ∈ R^n as inputs and delivers the output
measurements y ∈ R^m. F and G are nonlinear multivariate
functions representing the process nonlinearities.
The SSNN is a special RNN whose architecture
exactly mirrors a nonlinear state-space representation
of a dynamical system (Zamarreno and Vega, 1998;
Zamarreno and Vega, 1999; Zamarreno et al., 2000;
Gonzalez and Zamarreno, 2002). The SSNN is com-
posed of five successive layers: an input layer, a hid-
den layer S, a state layer, a hidden layer O, and an
output layer. The input layer takes the input signals
and delivers them to every neuron of hidden layer S.
This layer describes the state behavior by introduc-
ing a form of nonlinearity. The state layer is com-
posed of neurons receiving the signals from hidden
layer S as inputs.

Figure 1: Architecture of the SSNN, a) the two-stage neural block representation, b) the SSNN signal flow graph.

Each neuron in this layer represents
one state whose output value is an estimated value of
the state. The estimated states are used by the next
hidden layer O which relates the states to the output
layer via a nonlinear function. The output layer is
composed of neurons taking the hidden layer signals
as inputs. The outputs of the neurons composing the
output layer finally represent the outputs of the SSNN.
The SSNN topology can also be considered as a two-
stage architecture with two ANN blocks, ANN1 and
ANN2, separated by an internal estimated state space.
This architecture is equivalent to a deterministic non-
linear system in a state-space form whose mathemat-
ical representation is a particular case of (1):
x̂(k+1) = W_h F_1( W_r x̂(k) + W_i u(k) + B_h ) + B_l
ŷ(k) = W_0 F_2( W_h2 x̂(k) + B_h2 ) + B_l2                (2)
where ŷ ∈ R^m and x̂ ∈ R^s represent the estimations
of y and x respectively. The other parameters are:
W_i ∈ R^(h×n), W_h ∈ R^(s×h), W_r ∈ R^(h×s), W_h2 ∈ R^(h2×s) and W_0 ∈ R^(m×h2), which are weighting matrices;
B_h ∈ R^h, B_h2 ∈ R^h2, B_l ∈ R^s and B_l2 ∈ R^m, which are bias vectors;
F_1 : R^h → R^h and F_2 : R^h2 → R^h2, which are two static
and nonlinear functions.
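To make the recursion (2) concrete, the following minimal numpy sketch implements one SSNN time step. It assumes, as in the examples of Section 3, that F_1 is a logistic sigmoid and F_2 a hyperbolic tangent; the dictionary keys are illustrative names for the weights and biases defined above, not names used in the paper.

```python
import numpy as np

def logsig(v):
    """Logistic sigmoid, the choice made for F_1 in the examples of Section 3."""
    return 1.0 / (1.0 + np.exp(-v))

def ssnn_step(x_hat, u, p):
    """One SSNN iteration following (2).

    x_hat : current state estimate x_hat(k), shape (s,)
    u     : current control input u(k), shape (n,)
    p     : dict of weights and biases with the dimensions listed above
    Returns x_hat(k+1) and the current output estimate y_hat(k).
    """
    # Output branch: hidden layer O (F_2 = tanh) followed by the linear output layer.
    y_hat = p["W0"] @ np.tanh(p["Wh2"] @ x_hat + p["Bh2"]) + p["Bl2"]
    # State branch: hidden layer S (F_1 = logsig) followed by the linear state layer.
    x_next = p["Wh"] @ logsig(p["Wr"] @ x_hat + p["Wi"] @ u + p["Bh"]) + p["Bl"]
    return x_next, y_hat
```

Note that, following (2), the output ŷ(k) is computed from the current state estimate x̂(k), while u(k) only enters the update of the state.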
The architecture of the SSNN is represented by
the schematic diagram of Fig. 1 a). Its five layers
are represented by the signal flow graph of Fig. 1 b),
where black (or full) nodes are variables depending
linearly on the previous ones and white (or empty)
nodes are variables depending nonlinearly on the
previous ones. Compared to its initial formulation
(Zamarreno and Vega, 1998; Zamarreno and Vega, 1999;
Zamarreno et al., 2000), the form of (2) has been
enhanced with biases B_l and B_l2 in
order to allow more flexibility to the neural architecture.
These additional degrees of freedom allow better
performance in learning and estimating a process.
The learning consists in finding the optimal values
of the weights and biases. The parameters fixed by
the designer are the functions F_1 and F_2 and the
initial value x̂(0). It is important to notice that: 1)
the SSNN needs an initial state value, 2) the numbers
of inputs and outputs of the SSNN (therefore n and
m respectively) are fixed by those of the plant to be
modeled, 3) h and h2, the numbers of neurons in the
hidden layers, are left to the designer's discretion.
Once its weights and biases are correctly adjusted,
the SSNN is able to reproduce the same behavior as
the process. This means that the SSNN is able to yield
a signal ŷ(k) very close to the output y(k) of the
process for the same control signal u ∈ R^n. Furthermore,
and contrary to other ANNs, the SSNN is also able
to provide an estimation x̂(k) of the state x(k) at any
instant k thanks to its specific architecture. Obviously,
the performance depends on the learning, i.e., on whether
the parameters have been well adjusted.
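As a usage note on the previous sketch, iterating the step function over a whole control sequence from the initial state x̂(0) yields both the estimated output and the estimated state trajectory, which is precisely the property emphasized above (ssnn_step and its parameter dictionary are the ones assumed in the earlier sketch).

```python
import numpy as np

def ssnn_rollout(params, u_seq, x0):
    """Run the SSNN over a control sequence and collect the state and output estimates.

    ssnn_step is the step function from the previous sketch; params is its
    weight/bias dictionary; x0 is the initial state estimate."""
    x_hat = np.array(x0, dtype=float)
    states, outputs = [], []
    for u in u_seq:
        states.append(x_hat.copy())                 # estimated state at time k
        x_hat, y_hat = ssnn_step(x_hat, u, params)
        outputs.append(y_hat)                       # estimated output at time k
    return np.array(states), np.array(outputs)
```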
2.2 Parameter Estimation

Learning or training addresses the parameter estimation
of the neural technique. The problem of interest
consists in using data sets from the system in order
to find the best possible weights so that the ANN
reproduces the system behavior. The learning of the
SSNN is based on the general learning theory for
a feed-forward network with n input and m output
units (Haykin, 1994). Such a network can consist of any
number of hidden units and can exhibit any desired
feed-forward connection pattern.
It is therefore a nonlinear optimization problem
based on a cost function which must be defined to
evaluate the fitness or the error of a particular weight
vector. The Mean Squared Error (MSE) of the net-
work is generally used as the performance index and
must be minimized:
E = (1/2) Σ_{k=1}^{N} ||e(k)||² = (1/2) Σ_{k=1}^{N} ||ŷ(k) − d(k)||²,          (3)

with a given training set {(x(1), d(1)), ..., (x(N), d(N))} consisting of
N ordered pairs of n- and m-dimensional vectors, which are called the
input and output patterns.
The weights of the ANN are adjusted via gradient
descent methods to minimize the MSE between
the desired response d(k) and the actual output ŷ(k)
of the network. Several learning algorithms have
been proposed in the literature (Werbos, 1974), (Elman, 1990),
(Chen and Khalil, 1992), (Principe et al., 2000).

Figure 2: MSE values obtained during the training phase for plant example 1.

The backpropagation algorithm with the
Levenberg-Marquardt method is used to find a min-
imum for the MSE. The network is initialized with
weights randomly chosen between -1 and 1.
In the SSNN, we assume that ANN1 and ANN2
are trained with this principle by using respectively
{..., ([u(k) x̂(k−1)]^T, s(k)), ...} and
{..., (s(k), y(k)), ...} as their training sets.
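One possible way to realize this optimization, sketched below under loose assumptions, is to stack the output errors over the whole training sequence and hand them to scipy's Levenberg-Marquardt solver. This is only an illustration: the paper trains ANN1 and ANN2 separately on the two sets above, and that exact procedure is not reproduced here. The names step_fn, shapes, u_seq, d_seq and x0 are placeholders to be supplied by the user; step_fn can be the ssnn_step function from the earlier sketch.

```python
import numpy as np
from scipy.optimize import least_squares

def pack(params):
    """Flatten a dict of weight/bias arrays into a single parameter vector."""
    return np.concatenate([params[k].ravel() for k in sorted(params)])

def unpack(theta, shapes):
    """Rebuild the dict of arrays from a flat parameter vector."""
    params, i = {}, 0
    for k in sorted(shapes):
        n = int(np.prod(shapes[k]))
        params[k] = theta[i:i + n].reshape(shapes[k])
        i += n
    return params

def residuals(theta, shapes, step_fn, u_seq, d_seq, x0):
    """Stacked output errors y_hat(k) - d(k) over the whole training sequence."""
    params = unpack(theta, shapes)
    x_hat, err = np.array(x0, dtype=float), []
    for u, d in zip(u_seq, d_seq):
        x_hat, y_hat = step_fn(x_hat, u, params)
        err.append(y_hat - d)
    return np.concatenate(err)

# theta0 = np.random.uniform(-1.0, 1.0, size=n_params)   # random init in [-1, 1], as in the text
# fit = least_squares(residuals, theta0, method="lm",
#                     args=(shapes, ssnn_step, u_seq, d_seq, x0))
# trained = unpack(fit.x, shapes)
```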
3 SIMULATION TESTS
AND RESULTS
In this study, the SSNN is applied to system identi-
fication of two multivariate nonlinear dynamical sys-
tems in order to evaluate and to compare its perfor-
mance. Indeed, it is used to learn a state-space repre-
sentation of these systems.
3.1 Example 1
We consider the following process, governed by a de-
terministic second-order nonlinear state-space repre-
sentation:
x1(k+1) = 1.145 x1(k) + 0.549 x2(k) + 0.584 u(k)
x2(k+1) = x1(k) / (1 + 0.01 x2²(k)) + (0.181/0.549) u(k)
y(k) = 4 tanh(0.250 x1(k))                                        (4)

with the state vector x(k) = [x1(k) x2(k)]^T and a
one-dimensional control signal u(k).
This process is estimated by a SSNN with activation
functions F_1 and F_2 of sigmoid and hyperbolic tangent
type respectively. The other parameters are
n = 1, s = 2, and m = 1. The number of neurons in
the hidden layers of ANN1 and ANN2 is fixed by a
trial-and-error process, and the best performance is
obtained with h = h2 = 2 neurons in each hidden layer.
The initial conditions of the SSNN are the following:
the weights are randomly chosen between -1 and 1,
and x̂(0) is chosen as zero. In order to train the SSNN,
1800 training inputs are generated with a sinusoidal
control signal with different values of the amplitude
and frequency. The training error values vs. the num-
ber of epochs are shown in Fig. 2.
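As an illustration of this data-generation step, the sketch below drives a user-supplied plant model (for this example, a function implementing (4)) with piecewise sinusoids of varying amplitude and frequency and records the input, state and output samples. The segment length and the amplitude and frequency ranges are illustrative choices, not values taken from the paper.

```python
import numpy as np

def make_training_sequence(plant_step, x0, n_samples=1800, segment=200, rng=None):
    """Drive a plant with piecewise sinusoids of varying amplitude and frequency
    and record the (u(k), x(k), y(k)) samples used as training examples."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    u_seq, x_seq, y_seq = [], [], []
    k = 0
    while k < n_samples:
        amp = rng.uniform(0.5, 2.0)       # illustrative amplitude range
        freq = rng.uniform(0.01, 0.1)     # illustrative frequency (cycles per sample)
        for _ in range(min(segment, n_samples - k)):
            u = amp * np.sin(2.0 * np.pi * freq * k)
            x_next, y = plant_step(x, u)  # plant_step would implement (4) for this example
            u_seq.append(u)
            x_seq.append(x.copy())
            y_seq.append(y)
            x, k = x_next, k + 1
    return np.array(u_seq), np.array(x_seq), np.array(y_seq)
```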
After learning with the Levenberg-Marquardt al-
gorithm, the plant described by (4) is estimated by
the SSNN according to
x̂(k+1) = W_h logsig( W_r x̂(k) + W_i u(k) + B_h ) + B_l
ŷ(k) = W_0 tansig( W_h2 x̂(k) + B_h2 ) + b_l2                      (5)
with the following optimum weights:
W_r = [0.0340  0.0162
       0.2375  0.2529],

W_h = [134.8114  0.0261
       81.8352   5.2239],

W_i = [0.0173  0.0211]^T,    B_h = [0.0023  0.0001]^T,    B_l = [67.3160  43.4829]^T,

W_h2 = [5.6543  2.3457],

W_0 = [0.500  0.0000
       0.500  0.0000],

B_h2 = [0.1812  0.4369]^T · 10^3,    b_l2 = 1.6543.                (6)
The SSNN with the previous parameters is evaluated
on a test sequence. This makes it possible to compare
the behavior of the SSNN with that of the plant by using
the same control signal composed of steps with various
amplitudes. The results are presented in Fig. 3, which
shows the control input, the two states, the output and
the estimated states and output. This figure also shows
the difference between the output and its estimation and
the difference between the states and their estimations.
The maximum value of the MSE on the output of the
SSNN is 25 · 10⁻⁶; this demonstrates the ability of the
SSNN to model this nonlinear dynamical plant.
Figure 3: Performances of the SSNN in identifying plant example 1.

3.2 Example 2

In this example, the plant to be identified is a fourth-order
nonlinear system (s = 4) with m = 4 outputs:

[x1(k+1)]   [x2(k)                   ]   [0]
[x2(k+1)] = [p sin(x1(k)) + p + x3(k)] + [0] u(k)
[x3(k+1)]   [x4(k)                   ]   [0]
[x4(k+1)]   [p x3(k)                 ]   [1]

y(k) = tanh(x(k))                                                  (7)
where parameter p = 0.85. The plant is linear with
p = 0, nonlinear with p > 0, and unstable with p ≥ 1.
The plant is controlled by one input signal u (therefore
n = 1) which is a sinusoidal signal with different
values of the period, mean (offset) and amplitude.
The training set of the SSNN is composed of
1000 data samples of u, y, and x. The plant nonlinearities
are introduced in the SSNN with functions
F_1(.) = logsig(.) and F_2(.) = tansig(.). For simplicity,
the following initial conditions are considered:
x(0) = 0 and x̂(0) = 0. The parameters of the SSNN
are randomly initialized between -1 and 1 and are
adjusted with the training data set according to the
Levenberg-Marquardt algorithm.
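For reference, a direct transcription of (7) as a one-step simulator reads as follows; combined with a sinusoidal excitation such as the generator sketched for example 1, it can produce the 1000-sample training set described above.

```python
import numpy as np

def plant7_step(x, u, p=0.85):
    """One step of plant example 2, transcribed directly from (7)."""
    x1, x2, x3, x4 = x
    y = np.tanh(x)                                   # y(k) = tanh(x(k))
    x_next = np.array([x2,                           # x1(k+1)
                       p * np.sin(x1) + p + x3,      # x2(k+1)
                       x4,                           # x3(k+1)
                       p * x3 + u])                  # x4(k+1), driven by u(k)
    return x_next, y
```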
After learning ex nihilo, the plant of (7) is identi-
fied by
x̂(k+1) = W_h logsig( W_r x̂(k) + W_i u(k) + B_h ) + B_l
ŷ(k) = W_0 tansig( W_h2 x̂(k) + B_h2 ) + B_l2                      (8)
Figure 4: Test sequence of the SSNN with an oscillating control signal.
The best results for the SSNN are obtained with
h = 8 and h2 = 4 with the following values:
W_r = [0.0001  0.0211  0.0209  0.0312
       0.7841  0.0345  0.0740  0.1457
       0.5381  0.0135  0.0937  0.0806
       0.2618  0.0035  0.0632  0.0544
       0.0001  0.0198  0.0150  0.0128
       0.4081  0.0045  0.0763  0.0678
       0.0000  0.0195  0.0075  0.0199
       0.6801  0.0027  0.0052  0.0534],

W_h = [91.3394   27.7993   34.2220   127.4106
       0.0043    20.4352   0.0179    0.0017
       0.0563    230.9712  0.2334    0.1122
       0.1380    321.3377  0.2097    0.0990
       151.3135  49.3097   131.2520  37.2734
       0.0636    501.6847  0.4946    0.2431
       46.5102   106.5639  169.5512  175.2242
       0.0046    23.6573   0.0277    0.0094]^T,

W_i = [0.0247  0.1018  0.0217  0.0102  0.0176  0.0179  0.0087  0.0639]^T,

B_h = [0.0409  10.0577  1.6909  0.7563  0.0391  1.2601  0.0029  4.5821]^T,

B_l = [8.9622  143.6209  35.0017  131.2700]^T,

W_h2 = [31.3282  16.9499  39.0042  25.6931
        0.0323   0.0403   1.0431   0.0221
        0.0284   0.0387   0.0360   0.9379
        0.5062   0.4215   0.0543   0.0515],

W_0 = [0.0651  0.0354  0.1039  1.0478
       0.0697  0.0862  0.1236  0.9888
       0.0029  0.9360  0.0519  0.0130
       0.0016  0.0042  1.0206  0.0171]^T,

B_h2 = [9.3989  0.0074  0.0083  0.0159]^T,

B_l2 = [0.0583  0.0601  0.0036  0.0031]^T.                         (9)
Figure 5: Estimating errors of the SSNN during the test sequence.

First, we report results for a test sequence based
on an oscillating control signal u ∈ [−3, 2] composed
of 1800 data points. Fig. 4 shows the control signal
u, as well as x, y, x̂, and ŷ. The estimation errors
x − x̂ and y − ŷ are presented in Fig. 5. The performance
in estimating the plant with the SSNN can also be
evaluated by the MSE and the maximum error on the
states and the outputs reported in the left part of
Table 1 (test sequence). It can be seen from this table
that the maximum error is less than 1.4% in estimating
the states and less than 22.5% in estimating the output.
The greatest errors are recorded on the transients and
the static error is negligible compared to the range
of the output.
In order to go further into the SSNN estimation ca-
pabilities, we evaluate its response to a step and com-
pare it to the original plant. Fig. 6 shows the states
of the SSNN and of the plant. Fig. 7 shows the out-
puts of the SSNN and of the plant. It can be seen that
the behavior of the SSNN is very close to the one of
the plant. This is confirmed by the errors, i.e., MSE
and static errors reported in the right part of Table 1
(step response). In this test, the behavior of the SSNN
is nearly the same as that of the plant, all the more
so since the training sequence of the SSNN was not
composed of steps but only of sinusoidal waves. This
demonstrates that the SSNN is successful in identification
and has good generalization capabilities.
If system identification includes estimating the re-
lation between the input and the output of the system,
this can be achieved by a MLP and can be used as a
comparison to the SSNN. Fig. 8 shows the principle
of identifying a plant with a MLP using delayed in-
put and output signals. The inputs of the MLP are de-
layed values of the control signal u(k) and of the out-
put of the process y(k) in order to capture the process
dynamics. We chose to update the MLP weights ac-
cording to the Levenberg-Marquardt algorithm from
a random initialization between -1 and 1 and with the
same training set as for the SSNN. After the training
period, the MLP has been evaluated with the same test
sequence as for the SSNN and with a step response.

Table 1: SSNN errors in identifying plant example 2.

        test sequence                    step response
        MSE              max. error      MSE              static error
x1      0.0007 · 10⁻³    0.0157          0.2177 · 10⁻³    -0.1246
x2      0.5228 · 10⁻³    0.3474          0.3753 · 10⁻³    -0.1155
x3      0.0014 · 10⁻³    0.0141          0.0297 · 10⁻³    -0.2084
x4      0.0012 · 10⁻³    0.0088          0.0237 · 10⁻³    -0.1995
y1      0.0031 · 10⁻³    0.4145          0.4535 · 10⁻³    -0.1105
y2      0.0034 · 10⁻³    0.4495          0.5477 · 10⁻³    -0.1027
y3      0.0000 · 10⁻³    0.0259          0.0078 · 10⁻³    -0.1998
y4      0.0000 · 10⁻³    0.0386          0.0113 · 10⁻³    -0.1998

Figure 6: Ideal states and states estimated by the SSNN for an input step (plant example 2).

Figure 7: Ideal outputs and outputs estimated by the SSNN for an input step (plant example 2).
Results obtained with a MLP that uses 10 scalar in-
puts (i.e., u(k), u(k−1), y(k) and y(k−1)), 8 neurons
in one hidden layer and 4 outputs are presented in Ta-
ble 2. They can be compared to the ones obtained
with the SSNN in Table 1. The errors of the MLP and
of the SSNN in yielding the output are of the same
order of magnitude. However, the MLP is a nonlinear
regression structure that represents input-output map-
pings by weighted sums of nonlinear functions. The
MLP is therefore not able to estimate the state signals
of the plant.
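A sketch of how the tapped-delay regressor of such an MLP identifier can be assembled is given below; it assumes one delay on u and one on y, giving the 10 scalar inputs mentioned above for plant example 2, and it pairs each regressor with the next plant output as the target, which is one common convention since the exact timing is not spelled out in the text. The function names are illustrative.

```python
import numpy as np

def mlp_regressor(u_hist, y_hist, k):
    """Regressor [u(k), u(k-1), y(k), y(k-1)] for the MLP identifier.

    u_hist : sequence of scalar control samples
    y_hist : sequence of plant output vectors (length 4 for plant example 2)
    The result has 2 + 2*4 = 10 scalar entries."""
    return np.concatenate(([u_hist[k], u_hist[k - 1]],
                           y_hist[k],
                           y_hist[k - 1]))

# Training pairs, here with the next plant output as the target:
# X = np.array([mlp_regressor(u_hist, y_hist, k) for k in range(1, N - 1)])
# D = np.array([y_hist[k + 1] for k in range(1, N - 1)])
```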
3.3 Discussion
Computational cost is generally considered as a ma-
jor shortcoming of ANNs when identifying and con-
trolling dynamical systems in real-time. Their main
interest is to use on-line learning, i.e., to adjust
the weights while controlling the system at the
same time. This means that both learning and con-
trolling are achieved within each iteration. This makes
it possible to instantaneously take into account the
variations and fluctuations of the system's parameters
and possible external disturbances affecting the sys-
tem. The computational costs for calculating the out-
put and updating the weights have to be compatible
with the sampling time.
The number of neurons is generally provided to
give an idea about the size of the ANN. The num-
ber of neurons is h + s + h2 + m for a SSNN noted
down by SSNN(n, h, s, h2, m) with n, h, s, h2 and
m, the dimensions of the SSNN's five successive lay-
ers. The number of neurons is m2 + m3 for a MLP
with m1 inputs, m2 neurons in one hidden layer, and
m3 outputs, i.e., MLP(m1, m2, m3). In example 2,
the following dimensions were imposed on the SSNN,
n = 1, s = m = 4 and m1 = 10 and m3 = 4 on the
MLP. Very close performances were obtained with
SSNN(1,8,4,4,4) and MLP(10,8,4), i.e., respectively
with 20 and 12 neurons. However, this number is not
really representative of the memory required for the
implementation. The total number of scalar parameters
is much more significant and can be calculated for the
MLP and for the SSNN respectively as general functions
of the number of hidden neurons by

p_MLP = f(m2) = (1 + m1 + m3) m2 + m3,
p_SSNN = f(h, h2) = (1 + n + 2s) h + (1 + s + m) h2 + s + m.

Table 2: MLP errors in identifying plant example 2.

        test sequence                    step response
        MSE              max. error      MSE              static error
y1      0.0045 · 10⁻³    0.0199          0.3995 · 10⁻³    -0.1087
y2      0.0047 · 10⁻³    0.0175          0.5021 · 10⁻³    -0.1012
y3      0.0038 · 10⁻³    0.0166          0.0066 · 10⁻³    -0.1988
y4      0.0039 · 10⁻³    0.0169          0.0830 · 10⁻³    -0.1907

Figure 8: Typical identification of a plant with an ANN.
More specifically, for plant example 2, p_SSNN =
10 h + 9 h2 + 8 and p_MLP = 15 m2 + 4. With h = 8
and h2 = 4 for the SSNN and with m2 = 8 for the
MLP, both approaches present the same error in
identification with the same number of parameters to
be adjusted (124), but respectively with 20 and 12
neurons. This means that a SSNN which uses 20 neurons
is equivalent, in terms of performance and of memory
size, to a MLP with 12 neurons.
Finally, we evaluate the computational cost of
both neural approaches by specifying the number of
scalar operations involved in calculating the estimated
output. Calculating the output of a single neuron with
m1 inputs and 1 bias requires m1 scalar multiplications,
m1 scalar additions and the evaluation of 1 scalar
nonlinear function, which can be noted m1 ⊗ + m1 ⊕ + 1 f.
Calculating the output of MLP(m1, m2, m3) means
"propagating" the input signals through every neuron
of all the successive layers. The number of scalar
operations is therefore m2(m1 + m3) ⊗ + m2(m1 + m3) ⊕
+ (m2 + m3) f. Likewise, the number of scalar operations
for calculating the output of the SSNN is
(nh + 2sh + sh2 + mh2) ⊗ + (nh + 2sh + sh2 + mh2) ⊕
+ (h + s + h2 + m) f.
For the specific case of plant example 2,
MLP(10, 8, 4) needs 112 ⊗ + 112 ⊕ + 12 f while
SSNN(1, 8, 4, 4, 4) requires only 104 ⊗ + 104 ⊕ + 20 f.
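These counts are easy to verify by evaluating the formulas directly; the short sketch below reproduces the numbers quoted for plant example 2 (124 parameters for both networks, 112 multiplications per output evaluation for the MLP against 104 for the SSNN).

```python
def mlp_counts(m1, m2, m3):
    """Parameter count and per-evaluation operation counts of MLP(m1, m2, m3)."""
    params = (1 + m1 + m3) * m2 + m3
    mults = adds = m2 * (m1 + m3)
    nonlin = m2 + m3
    return params, mults, adds, nonlin

def ssnn_counts(n, h, s, h2, m):
    """Parameter count and per-evaluation operation counts of SSNN(n, h, s, h2, m)."""
    params = (1 + n + 2 * s) * h + (1 + s + m) * h2 + s + m
    mults = adds = n * h + 2 * s * h + s * h2 + m * h2
    nonlin = h + s + h2 + m
    return params, mults, adds, nonlin

print(mlp_counts(10, 8, 4))        # -> (124, 112, 112, 12)
print(ssnn_counts(1, 8, 4, 4, 4))  # -> (124, 104, 104, 20)
```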
The number of scalar operations for the training is
not detailed here because it depends on the gradient
method that is used. Adapting the weights of the SSNN
generally requires less effort than for the MLP.
Learning with MLPs amounts to estimating the
input-output relationship of a system. This is a way
to describe the system dynamics when conventional
tools are not efficient in modeling. However, MLPs
generally need a substantial number of time-delayed
signals as additional inputs to efficiently capture the
system dynamics. This means a large number of
weights and therefore introduces some difficulties in
the learning convergence (Haykin, 1994). On the other
hand, the SSNN goes beyond the simple input-output
relationship estimation problem. It captures the system
dynamics with an architecture that inherently
mirrors a state-space representation. The SSNN allows
the reconstruction of a state space and gives
access to the state values. The numerical values
presented here show that, compared to a MLP, the
SSNN presents the best compromise between computational
resources and performance. The SSNN is
more compliant with real-time constraints than the MLP.
Accordingly, the SSNN is well suited to adaptive control
schemes based on state-space representations.
4 CONCLUSIONS
In this paper, a State-Space Neural Network (SSNN)
is evaluated for modeling dynamical nonlinear sys-
tems. This neural approach is a particular type of
recurrent neural network based on a state-space rep-
resentation. After learning from example data, the
SSNN allows the reconstruction of a state-space rep-
resentation of any nonlinear system. Furthermore, the
SSNN is also able to follow the evolution of the states.
The effectiveness of the SSNN has been illustrated by
simulation examples, and the results demonstrate the
capabilities of this adaptive observer. These examples
verify the accuracy of the SSNN in modeling multi-
variate dynamical and nonlinear plants. Finally, the
SSNN is compared to a rough implementation of a
Multi-Layer Perceptron and a thorough study of all
the scalar operations and memory sizes of the two ap-
proaches shows that the SSNN uses reduced compu-
tational costs while allowing the same estimation per-
formance or better parameter tracking capability for
the same computational costs.
REFERENCES
Chen, F. and Khalil, H. (1992). Adaptive control of non-
linear systems using neural networks. International
Journal of Control, 55(6):1299–1317.
Elman, J. (1990). Finding structure in time. Cognitive Sci-
ence, 14(2):179–211.
Gauthier, J.-P. and Kupka, I. (2001). Deterministic obser-
vation theory and applications. Cambridge University
Press, Cambridge, UK.
Gonzalez, P. A. and Zamarreno, J. M. (2002). A short-term
temperature forecaster based on a state space neural
network. Engineering Applications of Artificial Intel-
ligence, 15(5):459–464.
Haykin, S. (1994). Neural Networks : A comprehensive
Foundation. Macmillan College Publishing Company,
Inc., New York.
Kim, Y. H., Lewis, F. L., and Abdallah, C. T. (1997). A
dynamic recurrent neural-network-based adaptive ob-
server for a class of nonlinear systems. Automatica,
33(8):1539–1543.
Mirikitani, D. and Nikolaev, N. (2010). Recursive bayesian
recurrent neural networks for time-series modeling.
IEEE Transactions on Neural Networks, 21(1):262–274.
Narendra, K. and Parthasarathy, K. (1990). Identifica-
tion and control of dynamical systems using neural
networks. IEEE Transactions on Neural Networks,
1(1):4–27.
Principe, J. C., Euliano, N. R., and Lefebvre, W. C. (2000).
Neural and Adaptive Systems: Fundamentals Through
Simulations. John Wiley and Sons.
Werbos, P. (1974). Beyond Regression: New tools for pre-
diction and analysis in the behavioral sciences. PhD
thesis, Harvard University.
Zamarreno, J. and Vega, P. (1998). State space neural net-
work. Properties and application. Neural Networks,
11(6):1099–1112.
Zamarreno, J., Vega, P., García, L., and Francisco, M.
(2000). State-space neural network for modelling,
prediction and control. Control Engineering Practice,
8(9):1063–1075.
Zamarreno, J. M. and Vega, P. (1999). Neural predic-
tive control. Application to a highly non-linear sys-
tem. Engineering Applications of Artificial Intelli-
gence, 12(2):149–158.