ROBUST CONTROL FOR AN ARTIFICIAL MUSCLES ROBOT ARM

S. Boudoua, M. Chettouh and M. Hamerlain

The Advanced Technologies Development Centre (CDTA), Baba Hassen, Algiers, Algeria

Keywords: Neural Network, Reinforcement Learning, Variable Structure System, Pneumatic Artificial Muscle, Manipulator Robot Arm.

Abstract: We are concerned with the control of a 3-DOF robot arm actuated by pneumatic rubber muscles. The system is highly nonlinear and difficult to model, so robust control is required. This paper addresses the problem by presenting two types of robust control. The first is a neural network controller, which offers powerful learning and adaptation capabilities and can handle nonlinearities; in our work the learning is performed on-line from a binary reinforcement signal, without knowledge of the nonlinearities appearing in the system, and no preliminary off-line learning phase is required. The second is a classical variable structure control law, which is robust against parameter variations and external disturbances. Experimental results together with a comparative study are presented and discussed.

1 INTRODUCTION

For most robotic applications the common actuator technology is electric, with very limited use of hydraulics or pneumatics. Electrical systems, however, suffer from a relatively low power-to-weight ratio, which is a particular drawback in human-friendly robots and in systems that coexist and collaborate with humans, such as those in the medical and welfare fields. Sharing the robot working space with its environment is therefore problematic. Conversely, the human arm is not very accurate, but its lightness and the joint flexibility provided by the human musculature give it a natural capability for working in contact. The pneumatic artificial muscle (PAM) actuator (Caldwell et al., 1993; Bowler et al.), which has become increasingly popular because it provides these advantages, has been regarded in recent decades as an interesting alternative to hydraulic and electric actuators and has been applied to build therapy robots, where a high level of safety for humans is required. However, the complex nonlinear dynamics of the PAM manipulator make it a challenging and appealing system for modeling and control design. As a result, a considerable amount of research has been devoted to the development of various position control systems for the PAM manipulator. Fine control performance can be obtained with control strategies such as sliding mode control (Cai and Yamaura, 1997; Carbonell et al., 2001; Tondu and Lopez, 2000; Hamerlain, 1995), adaptive control and so on. However, these systems were based on the assumption that the process to be controlled is linear, and most of the reported results consider only a step reference input.

In more recent years, intelligent control techniques have emerged to overcome some deficiencies of conventional control methods in dealing with complex real-world systems. Fuzzy controllers (Balasubramanian and Rattan, 2003) have been successfully implemented for many linear and nonlinear processes. However, they exhibited a noticeable steady-state error, and they were hard to implement in practice because of the difficulty of constructing the rule bases. Neural network control has also been successfully used in many commercial and industrial applications in recent years. An adaptive controller based on a neural network was applied to an artificial hand actuated by PAMs (Folgheraiter et al., 2003). A nonlinear PID controller using a neural network (NN) has been proposed to improve the control performance of a 2-axis pneumatic artificial muscle manipulator (Thanh and Kwan, 2006).

The work in this paper addresses this problem by showing the ability of the NN to learn unmodeled nonlinear dynamics through reinforcement learning.


Boudoua S., Chettouh M. and Hamerlain M. (2009).

ROBUST CONTROL FOR AN ARTIFICIAL MUSCLES ROBOT ARM.

In Proceedings of the 6th International Conference on Informatics in Control, Automation and Robotics - Robotics and Automation, pages 256-261

DOI: 10.5220/0002210602560261

 SciTePress

In this paper, we explore a new type of reinforcement learning algorithm (Kim and Lewis, 1997) in which the learning signal is merely a binary "+1" or "-1" from a critic rather than an instructive correction signal. In contrast to existing NN learning methods, where learning is performed in a trial-and-error manner, the NN weights in our scheme are tuned on-line, with no off-line learning phase required, in such a fashion that closed-loop performance is guaranteed. The experiments were carried out on a real 3-axis PAM manipulator. The proposed control algorithm proved effective and, compared with sliding mode control, showed better tracking performance and disturbance rejection.

2 ACTUATOR AND MECHANICAL STRUCTURE

The three-degree-of-freedom (DOF) robot manipulator prototype illustrated in Figure 1 is considered. It consists of a base joint, a shoulder joint and an elbow joint, all of which are revolute.

Figure 1: Experimental robot arm.

Since pneumatic artificial rubber muscles (PAMs) are contractile devices, two PAMs have to be used in what is generally called an antagonistic setup in order to obtain a bidirectionally actuated revolute joint. This is illustrated in Figure 2.

Figure 2: Working principle of a joint.

The muscles in this application were designed to function as biceps. As the internal air pressure increases, the actuator expands in the radial direction and contracts in length. The PAM selected as the actuator for this robot arm is the MAS-40 fluidic muscle manufactured by FESTO (Pomiers, 2003).

3 DIRECT REINFORCEMENT ADAPTIVE LEARNING NEURAL NETWORK CONTROL

3.1 Neural Networks

Here we employ a simple "two-layer" feedforward neural network (NN) to approximate a general smooth nonlinear function on a compact set (Sadegh, 1993). According to the NN approximation property:

$$f(x) = W^T \sigma(V^T x) + \varepsilon(x) \qquad (1)$$

where $x = [1\ x_1\ x_2\ \dots\ x_n]^T$ is the input to the NN, $\sigma(\cdot)$ is the activation function, $W$ and $V$ are the collections of NN weights for the output and hidden layers respectively, and $\varepsilon(x)$ is the NN approximation error.

In the remainder of the paper the first-layer weights $V$ are held fixed, which makes the NN linear in the parameters. Selecting a constant $V$ and writing $\chi = V^T x$ results in the NN output

$$y = W^T \sigma(\chi)$$

There exist constant weights $W$ such that the nonlinear function to be approximated can be represented as:

$$f(x) = W^T \sigma(\chi) + \varepsilon(x) \qquad (2)$$

with $\|\varepsilon(x)\| \le \varepsilon_N$, where $\varepsilon_N$ is a known value. The functional estimate can then be given by

$$\hat f(x) = \hat W^T \sigma(\chi)$$

where $\hat W$ is provided by a certain tuning algorithm. In particular, Barron (1993) showed that neural networks can serve as universal approximators for continuous functions more efficiently than traditional functional approximators, even though there exists a fundamental lower bound on the functional reconstruction error of order $(1/N_h)^{2/n}$, where $N_h$ is the number of neurons in the hidden layer.
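The linear-in-the-parameters structure described above can be illustrated with a minimal Python sketch. The network size, the random choice of the fixed first-layer weights and the helper names (`hidden_layer`, `nn_output`) are assumptions made for illustration only; they are not taken from the experimental setup.

```python
import math
import random


def sigmoid(z):
    """Sigmoid activation function sigma(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer(x, V):
    """Return sigma(chi) with chi = V^T [1, x1, ..., xn].

    V is stored as one weight row per hidden neuron and is never tuned.
    """
    xa = [1.0] + list(x)  # augmented input, the leading 1 carries the bias
    return [sigmoid(sum(v * z for v, z in zip(row, xa))) for row in V]


def nn_output(x, V, W):
    """Scalar NN output y = W^T sigma(V^T x)."""
    return sum(w * s for w, s in zip(W, hidden_layer(x, V)))


# Hypothetical sizes: 2 inputs, 5 hidden neurons, fixed random first layer.
rng = random.Random(0)
V = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
W_a = [rng.uniform(-1.0, 1.0) for _ in range(5)]
W_b = [rng.uniform(-1.0, 1.0) for _ in range(5)]
x = [0.3, -0.7]

# With V fixed, the output is linear in the output weights W.
y_sum = nn_output(x, V, [a + b for a, b in zip(W_a, W_b)])
y_a, y_b = nn_output(x, V, W_a), nn_output(x, V, W_b)
```

Because only the output weights are tuned, superposition holds with respect to them (`y_sum` equals `y_a + y_b` up to rounding), which is the property the tuning law relies on.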




3.2 Controller Design

In this paper, the detailed system dynamics and the nonlinearities in the controlled system are assumed to be unknown. It is only supposed that the system belongs to a general class having a canonical structure:

$$x^{(n)} = g(x) + d(t) + u(t) \qquad (3)$$

with state $X = [x\ \dot x\ \cdots\ x^{(n-1)}]^T$, where $u(t)$ is the control input to the plant, $d(t)$ is the unknown disturbance with a known upper bound $b_d$, $g(x)$ is an unknown smooth function, and $y$ is the output.

Define the reference signal as $X_d = [x_d\ \dot x_d\ \cdots\ x_d^{(n-1)}]^T$ and the tracking error vector as $e(t) = X - X_d$. A standard choice in robotics is the filtered tracking error

$$r(t) = \Lambda^T e(t)$$

where $\Lambda = [\lambda_1\ \lambda_2\ \cdots\ \lambda_{n-1}\ 1]^T$ is an appropriately chosen coefficient vector such that $s^{n-1} + \lambda_{n-1} s^{n-2} + \cdots + \lambda_1$ is Hurwitz (i.e. $e(t) \to 0$ exponentially as $r(t) \to 0$). The full filtered tracking error $r(t)$ is not allowed to be used for tuning the action-generating NN weights. Only a reduced reinforcement signal $R$ is allowed:

$$R = \mathrm{sgn}(r), \qquad \mathrm{sgn}(x) = \begin{cases} +1 & \text{if } x \ge 0 \\ -1 & \text{otherwise} \end{cases}$$
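As a small worked example of the filtered tracking error $r = \Lambda^T e$ with $\Lambda = [\lambda_1, \dots, \lambda_{n-1}, 1]$ and of the binary reinforcement signal $R = \mathrm{sgn}(r)$, the following Python sketch may help. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def filtered_error(e, lam):
    """r = Lambda^T e with Lambda = [lam_1, ..., lam_{n-1}, 1].

    e   : tracking error vector [e, e_dot, ..., e^(n-1)]
    lam : coefficients [lam_1, ..., lam_{n-1}]
    """
    coeffs = list(lam) + [1.0]
    return sum(c * ei for c, ei in zip(coeffs, e))


def reinforcement(r):
    """Binary critic signal: +1 if r >= 0, -1 otherwise."""
    return 1 if r >= 0 else -1


# Second-order example (n = 2): r = lam_1 * e + e_dot.
r = filtered_error([0.1, -0.5], [2.0])  # 2.0 * 0.1 + (-0.5), about -0.3
R = reinforcement(r)                    # the critic only reports the sign
```

Only `R`, not the magnitude of `r`, is passed to the weight tuning law, which is what distinguishes this scheme from tuning driven by the full error signal.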

The time derivative of the measured performance signal can be written as:

$$\dot r = g(X, X_d) + u(t) + d(t) \qquad (4)$$

where $g(X, X_d)$ is a fairly complex nonlinear function of $X$ and $X_d$. The control input $u(t)$ used to control the plant is given by (Kim and Lewis, 1997):

$$u(t) = -K_v r - \hat g(X, X_d) + v(t) \qquad (5)$$

where $\hat g(X, X_d)$ is provided by the NN, $K_v = K_v^T$ is the performance measurement gain matrix, and $v(t)$ is a robustifying vector that will be determined later to offset the NN functional reconstruction error $\varepsilon(x)$ and the disturbances $d(t)$.

Substituting (5) into (4), the time derivative of the performance measure signal $r(t)$ can be rewritten as:

$$\dot r = -K_v r + \tilde g(X, X_d) + d(t) + v(t) \qquad (6)$$

where $\tilde g(X, X_d) = g(X, X_d) - \hat g(X, X_d)$. The continuous nonlinear function $g(X, X_d)$ can be represented by an NN with some constant "ideal" weights $W$ and a sufficient number of input basis functions $\sigma(\cdot)$ as:

$$g(X, X_d) = W^T \sigma(\chi) + \varepsilon(x) \qquad (7)$$

with $\|\varepsilon(x)\| < \varepsilon_N$. We assume that the ideal weights $W$ are bounded by known positive values (Lewis et al., 1995; Kosmatopoulos et al., 1995) so that $\|W\| \le W_M$, where $W_M$ is a known value. Let the NN functional estimate of the continuous nonlinear function $g(X, X_d)$ be given by:

$$\hat g(X, X_d) = \hat W^T \sigma(\chi) \qquad (8)$$

where the current weights $\hat W$ are provided by the weight tuning algorithm. Substituting (7) and (8) into (6) gives the following performance measure dynamics:

$$\dot r = -K_v r + \tilde W^T \sigma(\chi) + \varepsilon(x) + d(t) + v(t) \qquad (9)$$

with the weight estimation error $\tilde W = W - \hat W$.

The robustifying term is given by (Kim and Lewis, 1997):

$$v(t) = -K_z \frac{R}{\|R\|} \qquad (10)$$

with $K_z \ge b_d$. The reinforcement learning rule for tuning the action-generating NN weights is given by (Kim and Lewis, 1997):

$$\dot{\hat W} = F \sigma(\chi) R^T - k F \hat W \qquad (11)$$

with $F = F^T > 0$ for the learning rate and $k > 0$ for the speed of convergence. Then the errors $r(t)$ and $\tilde W$ are Uniformly Ultimately Bounded (UUB) (Kim and Lewis, 1997). Moreover, the performance measure $r(t)$ can be made arbitrarily small by increasing the fixed control gain $K_v$.
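The control law and the on-line weight tuning rule of this section can be sketched in Python for a scalar second-order system. This is a minimal illustration under stated assumptions, not the experimental implementation: the plant `g`, the disturbance, the reference, all gain values, the network size and the Euler discretisation are hypothetical choices, and for a scalar $r$ the robustifying term reduces to $-K_z R$.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_dral(t_end=10.0, dt=1e-3, n_hidden=5, Kv=30.0, Kz=0.2,
                  F=5.0, k=1.0, lam=2.0, seed=0):
    """Track x_d = sin(t) with u = -Kv*r - g_hat + v and on-line tuned weights."""
    rng = random.Random(seed)
    # Fixed random first layer; NN inputs are [1, x, x_dot, x_d, x_d_dot].
    V = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(n_hidden)]
    W_hat = [0.0] * n_hidden       # no off-line training: weights start at zero
    x, x_dot = 0.0, 1.0            # initial state lies on the reference
    max_abs_e = 0.0
    for step in range(int(round(t_end / dt))):
        t = step * dt
        x_d, x_d_dot = math.sin(t), math.cos(t)
        e, e_dot = x - x_d, x_dot - x_d_dot
        max_abs_e = max(max_abs_e, abs(e))
        r = e_dot + lam * e                      # filtered tracking error
        R = 1.0 if r >= 0.0 else -1.0            # binary reinforcement signal
        inputs = [1.0, x, x_dot, x_d, x_d_dot]
        s = [sigmoid(sum(v * z for v, z in zip(row, inputs))) for row in V]
        g_hat = sum(w * sj for w, sj in zip(W_hat, s))   # NN estimate
        v_rob = -Kz * R                          # robustifying term, Kz >= b_d
        u = -Kv * r - g_hat + v_rob              # control input
        # Hypothetical plant, unknown to the controller, with |d| <= 0.1.
        d = 0.1 * math.sin(5.0 * t)
        x_ddot = -math.sin(x) - 0.5 * x_dot + u + d
        # Euler step of the tuning rule W_hat_dot = F*sigma*R - k*F*W_hat.
        W_hat = [w + dt * F * (sj * R - k * w) for w, sj in zip(W_hat, s)]
        # Semi-implicit Euler step of the plant state.
        x_dot += dt * x_ddot
        x += dt * x_dot
    return max_abs_e, W_hat


max_abs_e, W_final = simulate_dral()
```

In this sketch the leakage term `k * F * W_hat` keeps the weight estimates bounded even though the critic supplies only a sign, while the fixed gain `Kv` sets how small the filtered error remains.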

ICINCO 2009 - 6th International Conference on Informatics in Control, Automation and Robotics


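Proof (Kim and Lewis, 1997). Define the Lyapunov function candidate:

$$L = \sum_i |r_i| + \tfrac{1}{2}\,\mathrm{tr}\left(\tilde W^T F^{-1} \tilde W\right)$$

Differentiation yields:

$$\dot L = \sum_i \mathrm{sgn}(r_i)\,\dot r_i + \mathrm{tr}\left(\tilde W^T F^{-1} \dot{\tilde W}\right)$$

Substituting now from the error system (9) and using (11) gives:

$$\dot L \le -R^T K_v r + R^T \varepsilon(x) + k\,\mathrm{tr}\left(\tilde W^T \hat W\right)$$

Using the inequality:

$$\mathrm{tr}\{\tilde W^T \hat W\} = \mathrm{tr}\{\tilde W^T (W - \tilde W)\} \le \|\tilde W\| \left(W_M - \|\tilde W\|\right)$$

and $\|\mathrm{sgn}(r)\| \le \sqrt{n}$ results in:

$$\dot L \le -\lambda_{\min}(K_v)\,\|r\| - k\,\|\tilde W\| \left(\|\tilde W\| - W_M\right) + \sqrt{n}\,\varepsilon_N$$

which is guaranteed negative as long as either:

$$\|r\| \ge \frac{k W_M^2/4 + \sqrt{n}\,\varepsilon_N}{\lambda_{\min}(K_v)} \qquad \text{or} \qquad \|\tilde W\| \ge \frac{W_M}{2} + \sqrt{\frac{W_M^2}{4} + \frac{\sqrt{n}\,\varepsilon_N}{k}}$$

According to a standard Lyapunov theory extension (Lewis et al., 1993; Narendra and Annaswamy, 1987), this demonstrates the UUB of both $r(t)$ and $\tilde W$.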



4 VARIABLE STRUCTURE CONTROL

Sliding mode control (SMC) is a type of variable structure control in which the dynamics of a nonlinear system are changed by switching discontinuously in time on a predetermined sliding surface using a high-speed nonlinear feedback (Young et al., 1999). Sliding mode controller design has two steps: the first is to obtain a sliding surface that gives the desired stable dynamics, and the second is to provide a control law that reaches this sliding surface. The system trajectories are sensitive to parameter variations and disturbances during the reaching mode, whereas they are insensitive in the sliding mode (Hung et al., 1993). Although CVS (Classical Variable Structure) control is robust against modelling errors, it nevertheless requires an approximate model. Knowledge of the bounds on the assumed model parameter variations is also required.

The dynamics of each joint are identified by estimating the coefficients of a presumed linear model. This is achieved by fitting the best linear model to the input-output data using an ARX (autoregressive with exogenous input) model in MATLAB. The joint dynamic parameters are identified using various step input signals. The measured response of the joint angle θ (radians) to various steps of the pressure between the two muscles is shown in Figure 3.

[Figure 3 plot: joint angle versus time for pressure steps of 0.4 bar, 0.6 bar, 0.8 bar and 1 bar.]

Figure 3: Step response of robot arm (joint 1).
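The identification step described above, fitting a linear ARX model to step-response data, can be sketched as a least-squares problem. The sketch below is written in Python rather than MATLAB and uses synthetic data; the model orders (two output lags, one input lag), the coefficient values and the function names are assumptions for illustration, not the identified robot parameters.

```python
def solve_linear(A, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_arx(u, y):
    """Least-squares fit of y[k] + a1*y[k-1] + a2*y[k-2] = b1*u[k-1]."""
    rows = [[-y[k - 1], -y[k - 2], u[k - 1]] for k in range(2, len(y))]
    targets = [y[k] for k in range(2, len(y))]
    # Normal equations (Phi^T Phi) theta = Phi^T y for theta = [a1, a2, b1].
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve_linear(AtA, Atb)


# Synthetic "joint" with hypothetical coefficients (DC gain 0.5 rad/bar),
# excited by pressure steps of 0.4, 0.6, 0.8 and 1.0 bar, 50 samples each.
a1_true, a2_true, b1_true = -1.5, 0.7, 0.1
u = [level for level in (0.4, 0.6, 0.8, 1.0) for _ in range(50)]
y = [0.0, 0.0]
for k in range(2, len(u)):
    y.append(-a1_true * y[k - 1] - a2_true * y[k - 2] + b1_true * u[k - 1])

a1_est, a2_est, b1_est = fit_arx(u, y)
```

With noise-free synthetic data the estimates coincide with the generating coefficients up to rounding; with measured data the same regression gives the best linear fit in the least-squares sense.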

In a linear approximation, the decoupled model for the system dynamics is given in the following form:

$$\ddot q + A_1 \dot q + A_2 q = B u \qquad (12)$$

where $q = [q_1, q_2, q_3]^T$ is the displacement vector and $A_1$, $A_2$ and $B$ are the estimated gain matrices of velocity, position and control. For a decoupled system these are:

]25.117.016.0[

diagA

]55.0294.0219.1[

−

eediagA

]323.2227.0257.0[

−

eeediagB

The sliding mode occurs on a switching surface $S(x) = 0$, which forces the original system to behave as a linear time-invariant system that can be designed to be stable. The switching surfaces are chosen as:

$$S_i(e_i, \dot e_i) = \dot e_i + \lambda_i e_i, \qquad 1 \le i \le 3 \qquad (13)$$

where $e_i = q_i - q_{id}$, $q_{id}$ is the desired position and $\lambda_i > 0$. For ideal sliding to occur, the invariance conditions $S_i(e_i, \dot e_i) = 0$ and $\dot S_i(e_i, \dot e_i) = 0$ must be satisfied. This yields the equivalent control:

$$U_{ieq} = (b_i)^{-1}\left[(a_{i1} - \lambda_i)\,\dot e_i + a_{i2}\,e_i + a_{i1}\,\dot q_{id} + a_{i2}\,q_{id} + \ddot q_{id}\right] \qquad (14)$$

Now, due to modelling errors, the estimated equivalent control is given by

$$\hat U_{ieq} = (\hat b_i)^{-1}\left[(\hat a_{i1} - \lambda_i)\,\dot e_i + \hat a_{i2}\,e_i + \hat a_{i1}\,\dot q_{id} + \hat a_{i2}\,q_{id} + \ddot q_{id}\right] \qquad (15)$$

where $\hat a_{ij}$ and $\hat b_i$ are estimated mean parameters.

The control $U_i$ is then fixed as $U_i = U_{ieq} + \Delta U_i$, where $\Delta U_i$ is the high-frequency component which ensures the sliding mode and consequently the insensitivity of the system to parameter variations, modelling errors and perturbations.

The control $U_i$ is discontinuous across the switching surfaces $S_i(e_i, \dot e_i) = 0$:

$$U_i = \begin{cases} U_{ieq} + \Delta U_i^{+} & \text{if } S_i(e_i, \dot e_i) > 0 \\ U_{ieq} + \Delta U_i^{-} & \text{if } S_i(e_i, \dot e_i) < 0 \end{cases}$$

The discontinuous component can take several forms in the literature; the form retained is that established by Harashima et al. (1986):

$$\Delta U_i = \left(\alpha_i |e_i| + \beta_i |\dot e_i| + \gamma_i\right)\,\mathrm{sgn}(S_i) \qquad (16)$$
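The structure of the variable structure law, an equivalent control computed from a nominal model plus a discontinuous term switched on the sign of the surface, can be sketched in Python for a single joint. Everything numerical here is an assumption made for illustration: the joint model $\ddot q + a_1 \dot q + a_2 q = b u$, the surface $S = \dot e + \lambda e$, the true and nominal parameters, and the negative switching gains (chosen so that $S \dot S < 0$ when $b > 0$).

```python
def sgn(x):
    return 1.0 if x >= 0.0 else -1.0


def smc_control(q, q_dot, q_d, q_d_dot, q_d_ddot,
                a1_hat, a2_hat, b_hat, lam, alpha, beta, gamma):
    """Return (U, S) with U = U_eq + Delta_U for one joint.

    Assumes the nominal model q_ddot + a1*q_dot + a2*q = b*u
    and the switching surface S = e_dot + lam*e, e = q - q_d.
    """
    e, e_dot = q - q_d, q_dot - q_d_dot
    S = e_dot + lam * e
    # Equivalent control from S_dot = 0 evaluated on the nominal model.
    u_eq = ((a1_hat - lam) * e_dot + a2_hat * e
            + a1_hat * q_d_dot + a2_hat * q_d + q_d_ddot) / b_hat
    # Discontinuous component (alpha|e| + beta|e_dot| + gamma) * sgn(S).
    delta_u = (alpha * abs(e) + beta * abs(e_dot) + gamma) * sgn(S)
    return u_eq + delta_u, S


def simulate_smc(t_end=8.0, dt=1e-3):
    # Hypothetical true plant, deliberately different from the nominal model.
    a1, a2, b = 0.6, 2.0, 0.75
    q, q_dot = 0.0, 0.0
    q_d = 0.3                      # constant desired position (rad)
    S = 0.0
    for _ in range(int(round(t_end / dt))):
        u, S = smc_control(q, q_dot, q_d, 0.0, 0.0,
                           a1_hat=0.5, a2_hat=1.8, b_hat=0.75, lam=2.0,
                           alpha=-1.0, beta=-1.0, gamma=-0.5)
        q_ddot = -a1 * q_dot - a2 * q + b * u
        q_dot += dt * q_ddot       # Euler integration of the plant
        q += dt * q_dot
    return q - q_d, S


final_error, final_S = simulate_smc()
```

In this sketch the switching gains dominate the mismatch between the true and nominal parameters, so the state reaches the surface in finite time and then slides towards zero error; the residual chattering amplitude is set by the integration step.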

5 EXPERIMENTAL RESULTS

Experimental results of both the direct reinforcement adaptive learning (DRAL) and CVS control laws applied to a 3-DOF robot arm driven by pneumatic artificial muscles are presented.

5.1 Tracking Trajectory

All three robot axes are controlled simultaneously to track a sinusoidal reference trajectory; joint coupling is significant. The NN has 20 hidden neurons with sigmoid activation functions. The experimental parameters are as follows:

1.0

;

5.0

;

1.0

3.0

2.0

;

12.0

08.0

02.0

;

5.0

⎟

⎠

⎞

⎜

⎝

⎛

⎟

⎠

⎞

⎜

⎝

⎛

⎟

⎠

⎞

⎜

⎝

⎛

⎟

⎠

⎞

⎜

⎝

⎛

⎟

⎠

⎞

⎜

⎝

⎛

ziivi

KKFK

Figure 4: Position and control signal of joint 1.

Figure 5: Position and control signal of joint 2.

Figure 6: Position and control signal of joint 3.

The performance of the DRAL controller shows that its trajectory-following ability is fairly good. Because of its position in the robot arm, the second joint is more difficult to control owing to interactions between axes (see Figure 1). Moreover, the tracking errors converge to small values, as expected from the stability analysis. Although the robot nonlinearity and system dynamics are completely unknown to the DRAL, the algorithm is able to cancel the nonlinearities in the robot system. It could be improved further by supplying the NN with more input signals (in this work we have considered that the NN has to approximate unknown second-order dynamics).

5.2 Comparative Study

In order to show the ability of the DRAL to control unknown, highly nonlinear systems, our experimental results are compared with those obtained using sliding mode control. Both set-point (step reference) and trajectory-tracking experiments are considered. We summarize our concluding remarks in the tables below.

Figure 7: Position (deg) and SMC control signal (bar) versus time (s), joint 1.

Figure 8: Position (deg) and DRAL control signal (bar) versus time (s), joint 1.

Table 1.

Criterion        DRAL                               VSS
Response time    1.5 s                              4 s
Chattering       Insignificant                      Exists in the transient part
Control          Not energetic, Umax = 0.53 bar     Energetic, Umax = 2 bar
Static error     0.02 degree                        0.26 degree

Figure 9: Position and SMC control signal, joint 1.

Figure 10: Position and DRAL control signal, joint 1.

Table 2.

Controller    Control                       Trajectory
DRAL          Umax = 0.59 bar               Smooth
VSS           Energetic, Umax = 0.9 bar     Incremental

Among the disadvantages of pneumatic artificial muscles, we can underline the friction between the rubber tube and the synthetic braid, which results in incremental (stepwise) trajectory tracking, as shown with VSS control (Fig. 9). With DRAL, conversely, this drawback is attenuated, since the trajectory following is fairly smooth (Fig. 10), which illustrates the ability of the neural network to learn unmodeled nonlinear dynamics.

6 CONCLUSIONS

Because of nonlinearities and uncertainties, the exact dynamic characteristics of a PAM robot manipulator are very difficult to obtain; resorting to robust control is therefore required. Neural networks have a powerful capability for learning, adaptation and tackling nonlinearity. The proposed neural network controller, which uses reinforcement learning for on-line identification of the plant dynamics, is simple to apply to a control system in order to minimize the position error without knowledge of the plant to be controlled. The algorithm does not require any off-line training or learning phase, and it has demonstrated its performance through experiments and a comparative study with sliding mode control. Since the traditional SMC design is a model-based control approach, having only partial knowledge of the model dynamics deteriorates the control performance. On the other hand, this work has shown experimentally that the NN can approximate the unknown complicated nonlinear dynamics. Consequently, our future investigation will focus on the implementation of a hybrid control law combining these two methods.

REFERENCES

D. G. Caldwell, G. A. Medrano-Cerda, M. J. Goodwin, “Braided pneumatic actuator control of a multi-jointed manipulator,” in Proc. IEEE Int. Conf. Systems, Man and Cybernetics, Le Touquet, France, 1993, pp. 423-428.

C. J. Bowler, D. G. Caldwell, G. A. Medrano-Cerda, “Pneumatic muscle actuators: Musculature for an anthropomorphic robot arm,” in Proc. IEE Colloquium on Actuator Technology: Current Practice and New Developments, London, pp. 8/1-8/5.

D. Cai, H. Yamaura, “A VSS control method for a manipulator driven by a McKibben artificial muscle actuator,” Electron. Commun. Japan, vol. 80, no. 3, pp. 55-63, 1997.

P. Carbonell, Z. P. Jiang, D. W. Repperger, “Nonlinear control of a pneumatic muscle actuator: backstepping vs. sliding-mode,” in Proc. IEEE Int. Conf. Control Applications, Mexico City, Mexico, 2001, pp. 167-172.

B. Tondu, P. Lopez, “Modeling and control of McKibben artificial muscle robot actuators,” IEEE Control Syst. Mag., vol. 20, no. 1, pp. 15-38, 2000.

M. Hamerlain, “An anthropomorphic robot arm driven by artificial muscles using a variable structure control,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, 1995, vol. 1, pp. 550-555.

V. Balasubramanian, K. S. Rattan, “Feedforward control of a non-linear pneumatic muscle system using fuzzy logic,” in Proc. IEEE Int. Conf. Fuzzy Systems, 2003, vol. 1, pp. 272-277.

M. Folgheraiter, G. Gini, M. Perkowski, M. Pivtoraiko, “Adaptive reflex control for an artificial hand,” in Proc. SYROCO 2003, Symposium on Robot Control, Holiday Inn, Wroclaw, Poland, 2003.

T. D. C. Thanh, A. K. Kwan, “Nonlinear PID control to improve the control performance of 2 axes pneumatic artificial muscle manipulator using neural network,” Mechatronics, vol. 16, pp. 577-587, 2006.

Y. H. Kim, F. L. Lewis, “Direct-Reinforcement-Adaptive-Learning Neural Network Control for Nonlinear Systems,” in Proc. American Control Conference, Albuquerque, New Mexico, June 1997.

P. Pomiers, “Modular robot arm based on pneumatic artificial rubber muscles (PARM),” in CLAWAR 2003, Catania, Italy, 17-19 Sept. 2003.

N. Sadegh, “A perceptron network for functional identification and control of nonlinear systems,” IEEE Trans. Neural Networks, vol. 4, no. 6, pp. 982-988, 1993.

A. R. Barron, “Universal approximation bounds for superposition of a sigmoidal function,” IEEE Trans. Inform. Theory, vol. 39, no. 3, pp. 930-945, 1993.

F. L. Lewis, A. Yesildirek, K. Liu, “Neural net robot controller with guaranteed tracking performance,” IEEE Trans. Neural Networks, vol. 6, no. 3, pp. 703-715, 1995.

E. B. Kosmatopoulos, M. M. Polycarpou, M. A. Christodoulou, P. A. Ioannou, “High-order neural network structures for identification of dynamical systems,” IEEE Trans. Neural Networks, vol. 6, no. 2, pp. 422-431, 1995.

F. L. Lewis, C. T. Abdallah, D. M. Dawson, Control of Robot Manipulators. MacMillan, New York, 1993.

K. S. Narendra, A. M. Annaswamy, “A new adaptive law for robust adaptation without persistent excitation,” IEEE Trans. Automat. Control, vol. 32, no. 2, pp. 134-145, 1987.

K. D. Young, V. I. Utkin, U. Ozguner, “A Control Engineer’s Guide to Sliding Mode Control,” IEEE Trans. Control Systems Technology, vol. 7, no. 3, pp. 328-342, May 1999.

J. Y. Hung, W. Gao, J. C. Hung, “Variable structure control: A survey,” IEEE Trans. Industrial Electronics, vol. 40, no. 1, pp. 2-22, 1993.

F. Harashima, H. Hashimoto, K. Maruyama, “Practical robust control of robot arm using variable structure system,” in Proc. IEEE Int. Conf. Robotics and Automation, San Francisco, 1986, pp. 532-538.
