ROBUST CONTROL FOR AN ARTIFICIAL MUSCLES
ROBOT ARM
S. Boudoua, M. Chettouh and M. Hamerlain
The Advanced Technologies Development Centre (CDTA), Baba Hassen, Algiers, Algeria
Keywords: Neural Network, Reinforcement Learning, Variable Structure System, Pneumatic Artificial Muscle,
Manipulator Robot Arm.
Abstract: We are concerned with the control of a 3-DOF robot arm actuated by pneumatic rubber muscles. The
system is highly nonlinear and difficult to model; robust control is therefore required. This paper
addresses the problem by presenting two types of robust control. The first is a neural network controller,
which offers powerful learning and adaptation capabilities and copes with nonlinearities; learning is
performed on-line from a binary reinforcement signal, without knowledge of the nonlinearities appearing
in the system, and no preliminary off-line learning phase is required. The second is a classical variable
structure control law, which is robust against parameter variations and external disturbances.
Experimental results together with a comparative study are presented and discussed.
1 INTRODUCTION
For most robotic applications, the common actuator
technology is electric, with very limited use of
hydraulics or pneumatics. Electrical systems, however,
suffer from a relatively low power-to-weight ratio,
which is a particular drawback for human-friendly robots
and for systems that coexist and collaborate with
humans, such as those in the medical and welfare
fields. Sharing the robot working space with its
environment is therefore problematic. Conversely, the
human arm is not very accurate, but its lightness and
joint flexibility, due to the human musculature, give
it a natural capability for working in contact. The
pneumatic artificial muscle (PAM) actuator (Caldwell
et al., 1993; Bowler et al.), which offers these
advantages and has grown in popularity, has been
regarded in recent decades as an interesting
alternative to hydraulic and electric actuators and
has been applied to construct therapy robots, where a
high level of safety for humans is required. However,
the complex nonlinear dynamics of the PAM manipulator
make it a challenging and appealing system for
modeling and control design. As a result, a
considerable amount of research has been devoted to
the development of various position control systems
for the PAM manipulator. Good control performance can
be obtained with control strategies such as sliding
mode control (Cai and Yamaura, 1997; Carbonell et al.,
2001; Tondu and Lopez, 2000; Hamerlain, 1995),
adaptive control and so on. However, these designs
were based on the assumption that the process to be
controlled is linear, and some of the reported results
consider only step reference inputs.
Furthermore, intelligent control techniques have
emerged in recent years to overcome some deficiencies
of conventional control methods in dealing with
complex real-world systems. Fuzzy controllers
(Balasubramanian and Rattan, 2003) have been
successfully implemented for many linear and nonlinear
processes. However, a steady-state error clearly
remained, and such controllers are hard to implement
in practice because of the difficulty of constructing
the control rule bases. In addition, neural network
control has been successfully used in many commercial
and industrial applications in recent years. An
adaptive controller based on a neural network was
applied to an artificial hand actuated by PAMs
(Folgheraiter et al., 2003). Nonlinear PID control
using a neural network (NN) to improve the control
performance of a 2-axis pneumatic artificial muscle
manipulator has also been proposed (Thanh and Kwan,
2006).
The work in this paper addresses this problem by
showing the ability of the NN to learn unmodeled
nonlinear dynamics through reinforcement learning.
Boudoua S., Chettouh M. and Hamerlain M. (2009).
ROBUST CONTROL FOR AN ARTIFICIAL MUSCLES ROBOT ARM.
In Proceedings of the 6th International Conference on Informatics in Control, Automation and Robotics - Robotics and Automation, pages 256-261
DOI: 10.5220/0002210602560261
Copyright © SciTePress
In this paper, we explore a reinforcement learning
algorithm (Kim and Lewis, 1997) in which the learning
signal is merely a binary "+1" or "-1" from a critic
rather than an instructive correction signal. In
contrast to existing NN learning methods, where
learning is performed in a trial-and-error manner, the
NN weights in our scheme are tuned on-line, with no
off-line learning phase required, in such a fashion
that closed-loop performance is guaranteed.
Experiments were carried out on a 3-axis PAM
manipulator. The proposed control algorithm is
demonstrated and compared with sliding mode control,
and the results suggest better performance and
disturbance rejection.
2 ACTUATOR AND
MECHANICAL STRUCTURE
We consider the three-degree-of-freedom (DOF) robot
manipulator prototype illustrated in Figure 1. It
consists of a base joint, a shoulder joint and an
elbow joint, all of which are revolute.
Figure 1: Experimental robot arm.
Since pneumatic artificial rubber muscles (PAMs) are
contractile devices, a bidirectionally actuated
revolute joint requires two PAMs arranged in what is
generally called an antagonistic setup. This is
illustrated in Figure 2.
Figure 2: Working principle of a joint.
The muscles in this application were designed to
function as biceps. As the internal air pressure
increases, the actuator expands radially and contracts
in length. The PAM selected as the actuator for this
robot arm is the MAS-40 fluidic muscle manufactured by
FESTO (Pomiers, 2003).
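As a rough illustration of the antagonistic arrangement, the sketch below splits a single differential pressure command into two muscle pressures. The symmetric split around a nominal preload, the function name and the numeric values (p_nominal, p_max) are assumptions made for illustration only; the paper does not state how the two pressures are generated.

```python
def antagonistic_pressures(delta_p, p_nominal=2.0, p_max=6.0):
    """Split a differential command (bar) into two muscle pressures (bar).

    Assumption: both muscles are preloaded at p_nominal and the command is
    applied symmetrically; p_nominal and p_max are illustrative values.
    """
    # One muscle is inflated while the other is deflated by the same amount,
    # and each pressure is clamped to the admissible range [0, p_max].
    p_agonist = min(max(p_nominal + delta_p, 0.0), p_max)
    p_antagonist = min(max(p_nominal - delta_p, 0.0), p_max)
    return p_agonist, p_antagonist


pressures = antagonistic_pressures(0.5)
```

A positive command contracts one muscle and relaxes the other, which produces torque in one direction; a negative command reverses the roles.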
3 DIRECT REINFORCEMENT
ADAPTIVE LEARNING
NEURAL NETWORK
CONTROL
3.1 Neural Networks
Here we employ a simple "two-layer" feedforward
neural network (NN) to approximate a general
smooth nonlinear function on a compact set of
$\mathbb{R}^n$ (Sadegh, 1993). According to the NN
approximation property:

$$f(x) = W^T \sigma(V^T x) + \varepsilon(x) \qquad (1)$$

where $x = [1\ x_1\ x_2\ \dots\ x_n]$ is the input to the NN,
$\sigma(\cdot)$ is an activation function, $W$ and $V$ collect
the NN weights of the output and hidden layers,
respectively, and $\varepsilon(x)$ is the NN approximation
error.
The NN in the remainder of the paper is
considered with the first-layer weights $V$ fixed. This
makes the NN linear in the parameters. Selecting a
constant $V$ results in the NN output
$y = W^T \sigma(\chi)$.
There exist constant weights $W$ such that the
nonlinear function to be approximated can be
represented as:

$$f(x) = W^T \sigma(\chi) + \varepsilon(x) \qquad (2)$$

with $\|\varepsilon(x)\| < \varepsilon_N$, where $\varepsilon_N$ is a known value.
Then, the functional estimate can be given by
$\hat f(x) = \hat W^T \sigma(\chi)$, where $\hat W$ is provided by a
certain tuning algorithm. In particular, Barron
(Barron, 1993) showed that neural networks can serve
as universal approximators for continuous functions
more efficiently than traditional functional
approximators, even though there exists a fundamental
lower bound on the functional reconstruction error of
order $(1/N_k)^{2/n}$, where $N_k$ is the number of neurons
in the hidden layer.
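The structure described above, a network whose first-layer weights are fixed so that the output is linear in the tunable weights, can be sketched as follows. This is a minimal illustration, assuming that the hidden-layer argument is V^T x with a bias entry prepended to x and that V is drawn at random once; the function names, sizes and random initialisation are hypothetical and not taken from the paper.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer(V, x):
    """sigma(V^T x_aug), with x_aug = [1, x1, ..., xn] and V of shape (n+1) x N_k."""
    x_aug = [1.0] + list(x)
    n_hidden = len(V[0])
    return [sigmoid(sum(V[i][j] * x_aug[i] for i in range(len(x_aug))))
            for j in range(n_hidden)]


def nn_output(W, V, x):
    """y = W^T sigma(V^T x) for W of shape N_k x m (linear in W once V is fixed)."""
    s = hidden_layer(V, x)
    n_outputs = len(W[0])
    return [sum(W[j][k] * s[j] for j in range(len(s))) for k in range(n_outputs)]


rng = random.Random(0)
n_inputs, n_hidden, n_outputs = 2, 20, 1
# First-layer weights: chosen once and never tuned (assumed random here).
V = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)] for _ in range(n_inputs + 1)]
# Output weights: the only tunable parameters, initialised to zero.
W = [[0.0] * n_outputs for _ in range(n_hidden)]
y = nn_output(W, V, [0.3, -0.1])
```

With zero output weights the network output is zero, which is consistent with starting the controller without any off-line training.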
3.2 Controller Design
In this paper, the detailed system dynamics and the
nonlinearities in the controlled system are assumed
to be unknown. It is only supposed that the system
belongs to a general class having the canonical
structure:

$$\begin{aligned}
\dot x_1 &= x_2 \\
\dot x_2 &= x_3 \\
&\ \,\vdots \\
\dot x_n &= g(x) + d(t) + u(t) \\
y &= x_1
\end{aligned} \qquad (3)$$

with state $X = [x_1\ x_2\ \dots\ x_n]^T$, $u(t)$ the control
input to the plant, $d(t)$ the unknown disturbance with
a known upper bound $b_d$, $g(x)$ an unknown smooth
function and $y$ the output.
Define the reference signal as
$X_d = [x_d\ \dot x_d\ \dots\ x_d^{(n-1)}]^T$. A standard tool in
robotics is the filtered tracking error

$$r(t) = \Lambda^T e(t)$$

where $\Lambda^T = [\lambda_1\ \lambda_2\ \dots\ \lambda_n]$ is an
appropriately chosen coefficient vector such that
$s^{n-1} + \lambda_{n-1} s^{n-2} + \dots + \lambda_1$ is Hurwitz
($e(t) \to 0$ exponentially as $r(t) \to 0$).
The tracking error vector is defined as
$e(t) = X - X_d$. The full filtered tracking error $r(t)$
is not allowed to be used for tuning the
action-generating NN weights. Only a reduced
reinforcement signal $R$ is allowed:

$$R = \mathrm{sgn}(r), \qquad \mathrm{sgn}(x) = \begin{cases} +1 & \text{if } x \ge 0 \\ -1 & \text{otherwise} \end{cases}$$
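The filtered tracking error and the binary reinforcement signal can be illustrated for a second-order joint model. The sketch below assumes n = 2 with coefficient vector [lambda_1, 1], the form used by Kim and Lewis (1997); the numeric values and function names are hypothetical.

```python
def sgn(x):
    """Sign convention of the text: +1 for x >= 0, -1 otherwise."""
    return 1.0 if x >= 0 else -1.0


def filtered_error(e, lam):
    """r = Lambda^T e for e = [e, e_dot, ...] and lam = [lambda_1, ..., lambda_n]."""
    return sum(l * ei for l, ei in zip(lam, e))


# Second-order example: X = [q, q_dot], X_d = [q_d, q_d_dot] (illustrative values).
X = [0.10, 0.00]
X_d = [0.25, 0.05]
e = [x - xd for x, xd in zip(X, X_d)]   # e(t) = X - X_d
r = filtered_error(e, [2.0, 1.0])       # r = 2*e + e_dot
R = sgn(r)                              # the only signal used for learning
```

Only the sign of r reaches the weight-tuning rule, so the learning signal carries one bit of information per sample and per joint.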
The time derivative of the measured performance
signal can be written as:

$$\dot r = g(X, X_d) + u(t) + d(t) \qquad (4)$$

where $g(X, X_d)$ is a fairly complex nonlinear
function of $X$ and $X_d$. The control input $u(t)$ used
to control the plant is given by (Kim and Lewis,
1997):

$$u(t) = -K_v r - \hat g(X, X_d) + v(t) \qquad (5)$$

where $\hat g(X, X_d)$ is provided by the NN. The
performance measurement gain matrix is $K_v = K_v^T$,
and $v(t)$ is a robustifying vector that will be
determined later to offset the NN functional
reconstruction error $\varepsilon(x)$ and the disturbances $d(t)$.
From (4) and (5), the time derivative of the
performance measure signal $r(t)$ can be rewritten as:

$$\dot r = -K_v r + \tilde g(X, X_d) + d(t) + v(t) \qquad (6)$$

where $\tilde g(X, X_d) = g(X, X_d) - \hat g(X, X_d)$.
The continuous nonlinear function $g(X, X_d)$
can be represented by an NN with some constant
"ideal" weights $W$ and a sufficient number of
input basis functions $\sigma(\cdot)$ as:

$$g(X, X_d) = W^T \sigma(\chi) + \varepsilon(x) \qquad (7)$$

with $\|\varepsilon(x)\| < \varepsilon_N$.
We assume that the ideal weights $W$ are bounded
by known positive values (Lewis et al., 1995;
Kosmatopoulos, 1990) so that $\|W\|_F \le W_M$, where
$W_M$ is a known value.
Let the NN functional estimate for the
continuous nonlinear function $g(X, X_d)$ be
given by:

$$\hat g(X, X_d) = \hat W^T \sigma(\chi) \qquad (8)$$

where the current value $\hat W$ is provided by the
weight tuning algorithm. Substituting (7) and (8)
into (6) gives the following performance measure
dynamics:

$$\dot r = -K_v r + \tilde W^T \sigma(\chi) + \varepsilon(x) + d(t) + v(t) \qquad (9)$$

with the weight estimation error $\tilde W = W - \hat W$.
The robustifying term is given by (Kim and
Lewis, 1997):

$$v(t) = -K_z \frac{R}{\|R\|} \qquad (10)$$

with $K_z \ge b_d$, and the reinforcement learning rule
for tuning the action-generating NN weights is given
by (Kim and Lewis, 1997):

$$\dot{\hat W} = F \sigma(\chi) R^T - k F \hat W \qquad (11)$$

with $F = F^T$ the learning rate and $k > 0$ setting
the speed of convergence. Then the errors $r$ and
$\tilde W$ are Uniformly Ultimately Bounded (UUB) (Kim and
Lewis, 1997). Moreover, the performance measure
$r(t)$ can be made arbitrarily small by increasing the
fixed control gain $K_v$.
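To make the interplay of the control law, the robustifying term and the weight tuning rule concrete, the following sketch runs them on a toy single-joint plant with Euler integration. It assumes the laws read u = -K_v r - g_hat + v, v = -K_z R/|R| and W_hat_dot = F sigma R - k F W_hat (scalar case). The plant, the reference, the NN inputs and every numeric gain are hypothetical choices for illustration; they are not the experimental settings of the paper.

```python
import math
import random


def sgn(x):
    return 1.0 if x >= 0 else -1.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_dral(t_end=10.0, dt=0.002, lam=5.0, K_v=20.0, K_z=0.2,
                  F=5.0, k=0.1, n_hidden=20, seed=0):
    """Track 0.5*sin(t) on a toy plant; returns the position error history."""
    rng = random.Random(seed)
    n_in = 6  # bias + [x1, x2, xd, xd_dot, xd_ddot]
    V = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    W_hat = [0.0] * n_hidden          # no off-line training: start from zero
    x1, x2 = 0.0, 0.0
    errors = []
    for step in range(round(t_end / dt)):
        t = step * dt
        xd, xd_dot, xd_ddot = 0.5 * math.sin(t), 0.5 * math.cos(t), -0.5 * math.sin(t)
        e1, e2 = x1 - xd, x2 - xd_dot
        r = lam * e1 + e2             # filtered tracking error (n = 2)
        R = sgn(r)                    # binary reinforcement signal
        inputs = [1.0, x1, x2, xd, xd_dot, xd_ddot]
        sigma = [sigmoid(sum(v * z for v, z in zip(row, inputs))) for row in V]
        g_hat = sum(w * s for w, s in zip(W_hat, sigma))
        v_rob = -K_z * R              # scalar case: R / |R| = R
        u = -K_v * r - g_hat + v_rob
        # Weight tuning rule, one Euler step (uses R only, never r itself).
        W_hat = [w + dt * (F * s * R - k * F * w) for w, s in zip(W_hat, sigma)]
        # Toy plant, unknown to the controller: x2_dot = g(x) + d(t) + u.
        d = 0.1 * math.sin(3.0 * t)   # bounded disturbance, |d| <= 0.1 <= K_z
        x2_dot = -2.0 * x2 - 3.0 * math.sin(x1) + d + u
        x1 += dt * x2
        x2 += dt * x2_dot
        errors.append(e1)
    return errors


errors = simulate_dral()
final_error = max(abs(e) for e in errors[-500:])
```

The plant nonlinearity never appears in the controller; the network output is driven only by the sign of r, and the leakage term k F W_hat keeps the weights bounded.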
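Proof (Kim and Lewis, 1997). Define the
Lyapunov function candidate:

$$L = \sum_{i=1}^{m} |r_i| + \frac{1}{2}\,\mathrm{tr}(\tilde W^T F^{-1} \tilde W)$$

Differentiation yields:

$$\dot L = \mathrm{sgn}(r)^T \dot r + \mathrm{tr}(\tilde W^T F^{-1} \dot{\tilde W})$$

Substituting now from the error system (9) and
using (10) and (11) gives:

$$\dot L \le -R^T K_v r + R^T \varepsilon(x) + k\,\mathrm{tr}(\tilde W^T \hat W)$$

Using the inequality:

$$\mathrm{tr}(\tilde W^T \hat W) = \mathrm{tr}\{\tilde W^T (W - \tilde W)\} \le \|\tilde W\|_F \,(W_M - \|\tilde W\|_F)$$

and $\|\mathrm{sgn}(r)^T\| \le n$ results in:

$$\dot L \le -n\,\lambda_{\min}(K_v)\,\|r\| - k\left(\|\tilde W\|_F - \frac{W_M}{2}\right)^2 + k\,\frac{W_M^2}{4} + n\,\varepsilon_N$$

which is guaranteed negative as long as either:

$$\|r\| > \frac{k\,W_M^2/4 + n\,\varepsilon_N}{n\,\lambda_{\min}(K_v)}$$

or

$$\|\tilde W\|_F > \frac{W_M}{2} + \sqrt{\frac{W_M^2}{4} + \frac{n\,\varepsilon_N}{k}}$$

According to a standard Lyapunov theory
extension (Lewis et al., 1993; Narendra and
Annaswamy, 1987), this demonstrates the UUB of
both $\|r\|$ and $\|\tilde W\|_F$.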
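The two thresholds that close the proof can be evaluated numerically to see how the design parameters shape the ultimate bounds. The sketch below assumes the thresholds read ||r|| > (k W_M^2/4 + n eps_N) / (n lambda_min(K_v)) and ||W~||_F > W_M/2 + sqrt(W_M^2/4 + n eps_N/k); the function name and all numeric values are illustrative.

```python
import math


def uub_bounds(k, W_M, eps_N, n, lambda_min_Kv):
    """Return (bound on ||r||, bound on ||W_tilde||_F) outside which L_dot < 0."""
    c = k * W_M ** 2 / 4.0 + n * eps_N
    r_bound = c / (n * lambda_min_Kv)
    w_bound = W_M / 2.0 + math.sqrt(W_M ** 2 / 4.0 + n * eps_N / k)
    return r_bound, w_bound


# Illustrative numbers only.
r_b, w_b = uub_bounds(k=0.5, W_M=2.0, eps_N=0.1, n=3, lambda_min_Kv=2.0)
r_b_high_gain, _ = uub_bounds(k=0.5, W_M=2.0, eps_N=0.1, n=3, lambda_min_Kv=20.0)
```

Raising the smallest eigenvalue of K_v by a factor of ten shrinks the bound on the filtered error by the same factor, which is the sense in which r(t) can be made arbitrarily small; the weight-error bound does not depend on K_v.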
4 VARIABLE STRUCTURE
CONTROL
Sliding mode control (SMC) is a type of variable
structure control in which the dynamics of a nonlinear
system are changed by a high-speed, discontinuous
nonlinear feedback that switches on a predetermined
sliding surface (Young et al., 1999). Sliding mode
controller design has two steps: the first is to
obtain a sliding surface that gives the desired stable
dynamics, and the second is to provide a control law
that reaches this sliding surface. The system
trajectories are sensitive to parameter variations and
disturbances during the reaching mode, whereas they
are insensitive to them in the sliding mode (Hung et
al., 1993). Although CVS (Classical Variable
Structure) control is robust against modelling errors,
it requires an approximate model. Knowledge of the
bounds on the assumed model parameter variations is
also required.
The identification of each joint's dynamics is
based on estimating the coefficients of a presumed
linear model. This is achieved by fitting the best
linear model to the input-output data using an ARX
(autoregressive with exogenous input) model in
MATLAB. Joint dynamic parameters are identified using
various step input signals. The measured response of
the joint angle θ (radians) to various pressure steps
applied between the two muscles is shown in Figure 3.
[Figure 3 plot: joint angle versus time (time axis 20 to 200) for pressure steps of 0.4, 0.6, 0.8 and 1 bar.]
Figure 3: Step response of robot arm (joint 1).
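The ARX fit mentioned above was done in MATLAB; the sketch below shows the same idea in plain Python, as a least-squares fit of a second-order ARX structure solved through the normal equations. The model order, the synthetic step data and all names are assumptions for illustration and do not reproduce the identification carried out on the robot.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(n):
            if i != col:
                f = M[i][col] / M[col][col]
                M[i] = [mi - f * mc for mi, mc in zip(M[i], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_arx(u, y):
    """Least-squares fit of y[t] = -a1*y[t-1] - a2*y[t-2] + b1*u[t-1]."""
    rows = [[-y[t - 1], -y[t - 2], u[t - 1]] for t in range(2, len(y))]
    targets = y[2:]
    # Normal equations: (Phi^T Phi) theta = Phi^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * yt for r, yt in zip(rows, targets)) for i in range(3)]
    return solve_linear(A, b)


# Synthetic, noise-free data from a known stable model driven by pressure steps.
a1_true, a2_true, b1_true = -1.5, 0.7, 0.1
u = [0.0] * 5 + [0.4] * 60 + [0.8] * 60 + [0.2] * 60
y = [0.0, 0.0]
for t in range(2, len(u)):
    y.append(-a1_true * y[t - 1] - a2_true * y[t - 2] + b1_true * u[t - 1])

a1, a2, b1 = fit_arx(u, y)
```

Several step amplitudes are used so that the regressors are excited by more than one operating point; with measured data the fit would be approximate and its residual would indicate how far the joint departs from the presumed linear model.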
In a linear approximation, the decoupled model
for the system dynamics is given in the following
form:

$$\ddot q + A_1 \dot q + A_2 q = B u \qquad (12)$$

where $q = [q_1, q_2, q_3]^T$ is the displacement
vector and $A_1$, $A_2$ and $B$ are the estimated gain
matrices of velocity, position and control. For a
decoupled system these are:
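$$A_1 = \mathrm{diag}[\,0.16 \quad 0.17 \quad 1.25\,]$$
$$A_2 = \mathrm{diag}[\,1.19 \times 10^{-2} \quad 0.94 \times 10^{-2} \quad 0.55\,]$$
$$B = \mathrm{diag}[\,0.57 \times 10^{-2} \quad 0.27 \times 10^{-2} \quad 2.23 \times 10^{-3}\,]$$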
The sliding mode occurs on a switching
surface $S(x) = 0$, which forces the original system
to behave as a linear time-invariant system that
can be designed to be stable. The switching surfaces
are chosen as:

$$S_i(e_i, \dot e_i) = \lambda_i e_i + \dot e_i, \qquad i = 1, 2, 3 \qquad (13)$$
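where $\lambda_i > 0$ and $e_i = q_i - q_{id}$, with $q_{id}$ the
desired position. For ideal sliding to occur, the
invariance conditions $S_i(e_i, \dot e_i) = 0$ and
$\dot S_i(e_i, \dot e_i) = 0$ must be satisfied. This yields
the equivalent control:

$$U_{ieq} = b_i^{-1}\left[(a_{i1} - \lambda_i)\,\dot e_i + a_{i2}\, e_i + a_{i2}\, q_{id} + a_{i1}\, \dot q_{id} + \ddot q_{id}\right] \qquad (14)$$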
Now, due to modelling errors, the estimated
equivalent control is given by

$$U_{ieq}^{*} = b_i^{*-1}\left[(a_{i1}^{*} - \lambda_i)\,\dot e_i + a_{i2}^{*}\, e_i + a_{i2}^{*}\, q_{id} + a_{i1}^{*}\, \dot q_{id} + \ddot q_{id}\right] \qquad (15)$$

where $a_{ij}^{*}$, $b_i^{*}$ are estimated mean parameters.
The control $U_i$ is then fixed as
$U_i = U_{ieq}^{*} + \Delta U_i$, where $\Delta U_i$ is the
high-frequency component that ensures the sliding mode
and consequently the insensitivity of the system to
parameter variations, modelling errors and
perturbations.
The control $U_i$ is discontinuous across the
switching surfaces $S_i(e_i, \dot e_i) = 0$:

$$U_i = \begin{cases} U_i^{+} = U_{ieq}^{*} + \Delta U_i^{+} & \text{if } S_i(e_i, \dot e_i) > 0 \\ U_i^{-} = U_{ieq}^{*} + \Delta U_i^{-} & \text{if } S_i(e_i, \dot e_i) < 0 \end{cases}$$
The discontinuous component can take several
forms in the literature; the form retained here is
the one established by Harashima et al. (Harashima
et al., 1986):

$$\Delta U_i = -(\alpha_i |e_i| + \beta_i |\dot e_i| + \gamma_i)\,\mathrm{sgn}(S_i) \qquad (16)$$
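A single-joint version of this variable structure law can be sketched as follows. It assumes the joint model q_ddot + a1 q_dot + a2 q = b u with b > 0, the surface S = lambda e + e_dot with e = q - q_d, positive switching gains, and a switching term that opposes the sign of S. The "true" and estimated parameters, the gains and the function names are hypothetical values chosen only to show the mechanism under model mismatch.

```python
def sgn(x):
    return 1.0 if x >= 0 else -1.0


def smc_control(q, q_dot, q_d, q_d_dot, q_d_ddot,
                a1, a2, b, lam, alpha, beta, gamma):
    """Return (U, S): estimated equivalent control plus switching term."""
    e = q - q_d
    e_dot = q_dot - q_d_dot
    S = lam * e + e_dot
    # Equivalent control: the input that gives S_dot = 0 for the assumed model.
    u_eq = ((a1 - lam) * e_dot + a2 * e + a2 * q_d + a1 * q_d_dot + q_d_ddot) / b
    # Discontinuous component: pushes the state toward S = 0 from either side.
    delta_u = -(alpha * abs(e) + beta * abs(e_dot) + gamma) * sgn(S)
    return u_eq + delta_u, S


def simulate_smc(t_end=5.0, dt=0.001):
    """Step to q_d = 0.3 with deliberately mismatched model parameters."""
    a1, a2, b = 1.0, 2.0, 1.5            # "true" plant (hypothetical)
    a1_s, a2_s, b_s = 0.8, 1.8, 1.2      # estimated mean parameters (hypothetical)
    q, q_dot, q_d = 0.0, 0.0, 0.3
    for _ in range(round(t_end / dt)):
        u, _S = smc_control(q, q_dot, q_d, 0.0, 0.0, a1_s, a2_s, b_s,
                            lam=4.0, alpha=1.0, beta=1.0, gamma=0.5)
        q_ddot = -a1 * q_dot - a2 * q + b * u
        q += dt * q_dot
        q_dot += dt * q_ddot
    return q - q_d


final_position_error = simulate_smc()
```

The switching gains are sized to dominate the mismatch between the true and estimated parameters, so the state reaches S = 0 and the error then decays at the rate set by lambda; the price is a control signal that switches at every sample once the surface is reached.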
5 EXPERIMENTAL RESULTS
Experimental results of both DRAL and CVS
control laws applied to a 3-DOF robot arm driven by
pneumatic artificial muscles are presented.
5.1 Tracking Trajectory
We present a simultaneous control of all three robot
axes for tracking a sinusoidal reference trajectory;
joint coupling is significant.
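The number of hidden neurons is 20 and the activation
functions are sigmoids. The experimental parameters
are the design parameters of Section 3.2 ($\lambda$, $K_v$, $F$,
$k$ and $K_z$), each given as a set of three values. The
value sets recoverable from the source are
(0.5, 2, 0.5), (0.02, 0.08, 0.12), (0.2, 0.3, 0.1),
(0.5, 0.5, 0.5) and (0.1, 0.1, 0.1); the pairing
between symbols and value sets is not recoverable from
the source layout.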
Figure 4: Position and control signal of joint 1.
Figure 5: Position and control signal of joint 2.
Figure 6: Position and control signal of joint 3.
The performance of the DRAL controller shows
that the trajectory-following ability is fairly good.
Because of its position in the robot arm, the second
joint is more difficult to control owing to
interactions between axes (see Figure 1). Moreover,
the tracking errors converge to small values, as
expected from the stability analysis. Although the
robot nonlinearity and system dynamics are completely
unknown to the DRAL, the algorithm copes well with the
nonlinearities in the robot system. It could be
further improved by supplying the NN with more input
signals (in this work we have considered that the NN
has to approximate unknown second-order dynamics).
5.2 Comparative Study
In order to show the ability of the DRAL to control
unknown, highly nonlinear systems, our experimental
results are compared with those obtained using sliding
mode control. Both step-reference and
trajectory-tracking experiments are considered.
We summarize our concluding remarks in the
tables below.
[Plot axis labels: position (deg), control signal (bar), time (s).]
Figure 7: Position and SMC control signal, joint 1.
Figure 8: Position and DRAL control signal, joint 1.
Table 1.
                 DRAL                              VSS
Response time    1.5 s                             4 s
Chattering       Insignificant                     Exists in the transient part
Control          Not energetic, Umax = 0.53 bar    Energetic, Umax = 2 bar
Static error     0.02 degree                       0.26 degree
Figure 9: Position and SMC control signal, joint 1.
Figure 10: Position and DRAL control signal, joint 1.
Table 2.
        Control                       Trajectory
DRAL    Umax = 0.59 bar               Smooth
VSS     Energetic, Umax = 0.9 bar     Incremental
Among the disadvantages of pneumatic artificial
muscles, we can underline the friction between the
rubber tube and the synthetic braid, which results in
incremental (stepwise) trajectory tracking, as shown
with VSS control (Fig. 9). With DRAL, by contrast,
this drawback is attenuated, since the trajectory
following is fairly smooth (Fig. 10), which supports
the ability of the neural network to learn unmodeled
nonlinear dynamics.
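The quantities compared in the tables (response time, static error and peak control effort) can be computed from logged signals along the following lines. The paper does not state how these figures were measured, so the 5% settling band, the averaging window and the synthetic first-order data below are assumptions for illustration.

```python
import math


def response_time(t, y, target, tol=0.05):
    """First time after which |y - target| stays within tol * |target|."""
    band = tol * abs(target)
    settled_at = None
    for ti, yi in zip(t, y):
        if abs(yi - target) <= band:
            if settled_at is None:
                settled_at = ti
        else:
            settled_at = None      # left the band: restart the search
    return settled_at


def static_error(y, target, n_last=50):
    """Absolute error between the target and the mean of the last samples."""
    tail = y[-n_last:]
    return abs(target - sum(tail) / len(tail))


def peak_control(u):
    """Largest control magnitude, the Umax of the tables."""
    return max(abs(ui) for ui in u)


# Synthetic first-order step response (degrees) and control signal (bar).
dt, tau, target = 0.01, 0.5, 10.0
t = [i * dt for i in range(601)]
y = [target * (1.0 - math.exp(-ti / tau)) for ti in t]
u = [0.5 * math.exp(-ti / tau) for ti in t]

metrics = (response_time(t, y, target), static_error(y, target), peak_control(u))
```

Applying the same three functions to the logged position and pressure signals of both controllers gives a like-for-like comparison, provided the same settling band is used for each.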
6 CONCLUSIONS
Due to nonlinearities and uncertainties, the exact
dynamic characteristics of a PAM robot manipulator
are very difficult to obtain; robust control is
therefore required. Neural networks have a powerful
capability for learning, adaptation and tackling
nonlinearity. The proposed neural network controller,
which uses reinforcement learning for on-line
identification of the plant dynamics, is simple to
apply to a control system in order to minimize the
position error without knowledge of the plant to be
controlled. The algorithm does not require any
off-line training or learning phase, and it has shown
its performance through experiments and a comparative
study with sliding mode control. Since traditional SMC
design is a model-based approach, having only partial
knowledge of the model dynamics deteriorates the
control performance. On the other hand, this work has
shown experimentally that an NN can approximate
unknown, complicated nonlinear dynamics. Consequently,
our future investigation will focus on the
implementation of a hybrid control law combining these
two methods.
REFERENCES
D. G. Caldwell, G. A. Medrano-Cerda, M. J. Goodwin,
“Braided pneumatic actuator control of a
multi-jointed manipulator,” in Proc. IEEE Int. Conf.
Systems, Man and Cybernetics, Le Touquet, France,
1993, pp. 423–428.
C. J. Bowler, D. G. Caldwell, G. A. Medrano-Cerda,
“Pneumatic muscle actuators: Musculature for an
anthropomorphic robot arm,” in Proc. IEE Colloquium
on Actuator Technology: Current Practice and New
Developments, London, pp. 8/1–8/5.
D. Cai, H. Yamaura, “A VSS control method for a
manipulator driven by an McKibben artificial muscle
actuator,” Electron. Commun. Japan, vol. 80, no. 3,
pp. 55-63, 1997.
P. Carbonell, Z. P. Jiang, D. W. Repperger, “Nonlinear
control of a pneumatic muscle actuator: backstepping
vs. sliding-mode,” in Proc. IEEE Int. Conf. Control
Applications, Mexico City, Mexico, 2001, pp. 167-172.
B. Tondu, P. Lopez, “Modeling and control of McKibben
artificial muscle robot actuators,” in Proc. of the
IEEE Int. Conf. Control Syst. Mag., 2000, vol. 20,
no. 1, pp. 15-38.
M. Hamerlain, “An anthropomorphic robot arm driven by
artificial muscles using a variable structure
control,” in Proc. IEEE/RSJ Int. Conf. Intelligent
Robots and Systems, 1995, vol. 1, pp. 550-555.
V. Balasubramanian, K. S. Rattan, “Feedforward control
of a non-linear pneumatic muscle system using fuzzy
logic,” in IEEE Int. Conf. Fuzzy Systems, 2003,
vol. 1, pp. 272–277.
M. Folgheraiter, G. Gini, M. Perkowski, M. Pivtoraiko,
“Adaptive reflex control for an artificial hand,” in
Proc. SYROCO 2003, Symposium on Robot Control,
Holiday Inn, Wroclaw, Poland, 2003.
T. D. C. Thanh, A. K. Kwan, “Nonlinear PID control to
improve the control performance of 2 axes pneumatic
artificial muscle manipulator using neural network,”
Mechatronics, vol. 16, pp. 577-587, 2006.
Y. H. Kim, F. L. Lewis, “Direct-Reinforcement-Adaptive-
Learning Neural Network Control for Nonlinear
Systems,” in Proceedings of the American Control
Conference, Albuquerque, New Mexico, June 1997.
P. Pomiers, “Modular robot arm based on pneumatic
artificial rubber muscles (PARM),” in CLAWAR 2003,
Catania, Italy, 17-19 Sept. 2003.
N. Sadegh, “A perceptron network for functional
identification and control of nonlinear systems,”
IEEE Trans. Neural Networks, vol. 4, no. 6,
pp. 982-988, 1993.
A. R. Barron, “Universal approximation bounds for
superposition of a sigmoidal function,” IEEE Trans.
Inform. Theory, vol. 39, no. 3, pp. 930-945, 1993.
F. L. Lewis, A. Yesildirek, and K. Liu, “Neural net
robot controller with guaranteed tracking
performance,” IEEE Trans. Neural Networks, vol. 6,
no. 3, pp. 703-715, 1995.
E. B. Kosmatopoulos, M. M. Polycarpou, M. A.
Christodoulou, P. A. Ioannou, “High-order neural
network structures for identification of dynamical
systems,” IEEE Trans. Neural Networks, vol. 6, no. 2,
pp. 422-431, 1990.
F. L. Lewis, C. T. Abdallah, and D. M. Dawson, Control
of Robot Manipulators. MacMillan, New York, 1993.
K. S. Narendra and A. M. Annaswamy, “A new adaptive
law for robust adaptation without persistent
excitation,” IEEE Trans. Automat. Control, vol. 32,
no. 2, pp. 134-145, 1987.
K. D. Young, V. I. Utkin, and U. Ozguner, “A Control
Engineer’s Guide to Sliding Mode Control,” IEEE
Trans. Control Systems Technology, vol. 7, no. 3,
pp. 328-342, May 1999.
J. Y. Hung, W. Gao, and J. C. Hung, “Variable structure
control: A survey,” IEEE Trans. Industrial
Electronics, vol. 40, no. 1, pp. 2-22, 1993.
F. Harashima, H. Hashimoto, K. Maruyama, “Practical
robust control of robot arm using variable structure
system,” in Proc. of the IEEE Int. Conf. on Robotics
and Automation, San Francisco, 1986, pp. 532-538.