Genetic Programming Applied to Biped Locomotion Control with Sensory Information

César Ferreira, Pedro Silva, João André, Cristina P. Santos and Lino Costa

Industrial Electronic Department, University of Minho, Azurem Campus, Guimarães, Portugal

Production Systems Department, University of Minho, Gualtar Campus, Braga, Portugal

Keywords:

Biped Locomotion, CPG (Central Pattern Generator), Sensory Information.

Abstract:

Generating biped locomotion in robotic platforms is hard. It has to deal with the complexity of the task, which requires the synchronization of several joints while monitoring stability. Further, it is also expected to deal with the great heterogeneity of existing platforms. The generation of adaptable locomotion further increases the complexity of the task.

In this paper, Genetic Programming (GP) is used as an automatic search method for motion primitives of a biped robot that optimize a given criterion. It does so by exploring and exploiting the capabilities and particularities of the platform.

In order to increase the adaptability of the achieved solutions, feedback pathways were directly included into the evolutionary process through sensory inputs.

1 INTRODUCTION

There is an increasing interest in building autonomous systems to aid humans in performing tasks in a wide variety of situations, ranging from space and deep ocean exploration or rescue missions in hazardous environments to everyday tasks such as cleaning the house or taking care of the elderly. In most of these cases, legged locomotion may provide an advantage over wheeled or tracked robots. It offers the higher level of flexibility required in a wide variety of terrains and the ability to deal with harsher terrain features, e.g. stairs, obstacles, uneven or irregular terrain. In particular, biped locomotion provides the flexibility needed in a world shaped for humans. The control and generation of biped locomotion for the ever improving biped robots is a very demanding task, addressing complex problems such as the generation of movements and the coordination between many degrees of freedom, balancing, perception and planning, and disturbance rejection.

Typical solutions to the problem of biped locomotion make extensive use of knowledge of the robot and the environment. Generally, a plan of the path and foot placement sequence is determined; then the required motions are computed using the robot's kinematic model, respecting constraints established through some stability criterion, such as the popular Zero-Moment Point (Vukobratović and Borovac, 2004). However, such an approach requires a good perception of the environment, which may hamper its general application to different dynamic environments.

Alternatively to these typical solutions, bio-inspired approaches have been researched and proposed with quite successful results. One of these approaches uses the concept of Central Pattern Generators (CPGs), exploiting the interesting characteristics of intraspinal neural networks in vertebrates (Ijspeert, 2008). These generate rhythmic activation for walking motor patterns. The main characteristic that motivates the application of CPGs to the generation of robotic legged locomotion is the ability to adapt and correct the locomotion through the integration of sensory feedback pathways (Kim et al., 2011). This allows robots to tackle unexpected disturbances and environments that are not completely known. However, there is no established framework for designing CPG solutions and such sensory feedback.

Previously, we have proposed a CPG-based solution for biped locomotion (Matos and Santos, 2012). It combines a small set of motion primitives within CPGs driven by phase oscillators, producing basic but very capable biped walking for the DARwIn-OP humanoid robot. Despite the simplicity of the solution, expanding the repertoire of motion primitives to broaden the locomotor behaviors has proven complex, as has the design of feedback mechanisms for the adaptation and correction of locomotion. Some authors tackle this problem through imitation and learning from demonstration (Nakanishi et al., 2004), optimization of parameterized trajectories (Kim et al., 2009) or reinforcement learning (Sugimoto and Morimoto, 2011). In this work, we take a distinct approach, where the goal is to apply Genetic Programming to the automatic exploration of: 1) the motion primitives within the CPG, and 2) the integration of sensory inputs into feedback mechanisms for the adaptability to the environment.

Ferreira C., Silva P., André J., P. Santos C. and Costa L.

Genetic Programming Applied to Biped Locomotion Control with Sensory Information.

DOI: 10.5220/0005062700530062

In Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics (ICINCO-2014), pages 53-62

ISBN: 978-989-758-039-0

© 2014 SCITEPRESS (Science and Technology Publications, Lda.)

Evolutionary Computation (EC) algorithms, such as Genetic Algorithms (GA) and Genetic Programming (GP), rely on the concept of Darwin's evolution theory to find optimized solutions for a target problem. The former considers a control policy whose configuration is evolved as a string of chromosomes, i.e. configuration parameters for a target problem. The latter evolves a complete control program for the task at hand. These methods use a fitness function that evaluates the candidate solutions, or individuals, and whose value is used as a quality measure by a set of evolutionary operators (selection, crossover and mutation).

Candidate solutions in GP, or individuals, can

fully describe the solution to the target problem, not

requiring any a-priori structure. Therefore, although

the complexity of the search space is increased, it is

expected to generate more adequate solutions to a par-

ticular problem.

GP has proven to be useful in the generation of

locomotion for very different types of robotic plat-

forms, thus showing its efﬁciency in ﬁnding solutions

for problems with a high level of complexity. In (Gritz

and Hahn, 1997) a generic controller for an animated

physically plausible 3D character was created: an ar-

ticulated lamp. In (Tanev et al., 2005) a locomotion

controller for an articulated, snake like robot was cre-

ated. GP was also employed to generate a legged lo-

comotion controller for a quadruped robot in (Ander-

sson et al., 2000).

This method has also been applied in the genera-

tion of biped locomotion. In (Ok et al., 2001), GP was

applied in the automatic generation of feedback neu-

ral networks for the control of a simulated 3D biped

model with 32 muscles that controlled rigid segments

of the legs, body and arms. The model was able to

generate locomotion during only four steps. In (Ok

and Kim, 2005), these results were improved by ap-

plying an enhanced adaptive mutation operator that

reduced the search space and improved the evolution

results, increasing the generated steps to 10. Although

this work yielded interesting results, it is applied on a

very specific model, whose physical and mechanical

properties do not ﬁt common biped robotic platforms.

Other works address the generation of controllers

to robotic platforms through the use of GP. In (Wolff

and Wahde, 2007), Linear Genetic Programming

(LGP) was used to generate a locomotion controller

with feedback pathways, for a robust and anthropomorphic biped robot model. The model is simulated but physically plausible. There was no a priori knowledge about the mechanical or physical properties of

the body. Instead, the evolution uses feedback from

several sensor modalities (e.g. joints and several ac-

celerometers in the body and in the legs) to success-

fully achieve biped locomotion.

The work proposed in (Wolff and Nordin, 2003)

presents the generation of robot legged locomotion in

ﬂat ground using LGP. A primary solution generated

in simulation would be passed on to a physical robot.

However, the achieved solution could not be executed

in the physical platform.

We intend to use GP to automatically search the

solution landscape and ﬁnd solutions that rely on a

set of motion primitives. We also explore the use of

feedback pathways as a means to enable adaptation

to the environment features, particularly to adapt the

locomotion to walk up and down slopes in the en-

vironment. We are particularly interested in the im-

pact of sensory inclusion in the robot behavior herein

assessed considering Center of Mass (CoM) trajec-

tory. This provides for an understanding of how

feedback enhanced the locomotion skills of a biped

robot. Results demonstrate the smooth locomotion

achieved by the proposed GP mechanism and the

added adaptability to the environment, provided by

the inclusion of feedback pathways directly onto the

controller. Therefore, movement is generated in en-

trainment with the environment.

The paper is organized as follows. The following

section presents the locomotion model used to con-

trol the target platform. Then in section 3 the GP evo-

lution mechanism is presented, where the individu-

als for the current evolution process and the evolution

conﬁguration are deﬁned. Lastly, the results are pre-

sented in section 4, followed by a discussion in sec-

tion 5 and conclusions and future work in section 6.

2 BIPED LOCOMOTION MODEL

The basis of the locomotion controller used in this

work was previously presented in (Matos and Santos,

2012), where we proposed a Central Pattern Gener-

ator (CPG) integrated with local sensory feedback,

ICINCO2014-11thInternationalConferenceonInformaticsinControl,AutomationandRobotics

capable of generating bipedal locomotor behaviors,

such as walking forward/backwards and turning.

The proposed CPG controls a single leg and is divided into rhythmic and unit motion pattern generators. The use of a phase oscillator as a rhythmic generator allowed for a simple contralateral coupling between the left and right CPGs, maintaining the correct coordination between the generated locomotor trajectories in both legs, with each CPG producing a driving rhythmic signal $\phi_i$ in strict alternation ($i$ for left/right leg).

Motion pattern generators receive this rhythmic

input and produce the corresponding joint trajectory,

z, in a synergistic approach of modular motion prim-

itives encoded as a set of non-linear dynamical equa-

tions with well deﬁned attractor dynamics, similarly

to other works (McSharry et al., 2003; Nakanishi

et al., 2004; Ijspeert et al., 2002). Basic parameterized

motion primitives, e.g. sine and bell-shaped motions

(implemented as Gaussians), were considered.

The joint angle value $z_{i,j}$, for leg $i$ and joint $j$, is given by

$$\dot{z}_{i,j} = -\alpha\,(z_{i,j} - O_{i,j}) + \sum f\left(z_{i,j}, \phi_i, \dot{\phi}_i\right). \qquad (1)$$

The final motor program in a single joint results from the sum of rhythmic motion primitives $f$ around a center point $O_{i,j}$. $\alpha$ is a relaxation parameter for the offset. $j$ specifies the joint: hip roll (hRoll), hip yaw (hYaw), hip pitch (hPitch), knee (kPitch), ankle roll (aRoll) and ankle pitch (aPitch); and $i$ specifies the left or right leg.
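As an illustration of how a motor program of the form of eq. 1 produces a joint trajectory, the following minimal sketch integrates it with a forward Euler scheme. The constant-frequency phase, the integration step and the sine-like primitive `sine_primitive` are assumptions made for this example only; they are not the primitives or parameter values used on the robot.

```python
import math

def sine_primitive(amplitude):
    """Hypothetical primitive: time derivative of amplitude * sin(phi)."""
    return lambda z, phi, dphi: amplitude * dphi * math.cos(phi)

def integrate_joint(primitives, offset, alpha, omega,
                    dt=0.001, t_end=5.0, z0=0.0):
    """Forward Euler integration of dz/dt = -alpha*(z - offset) + sum(f(z, phi, dphi)).

    The phase advances at a constant rate omega (an assumption of this sketch).
    Returns the list of joint values, one per integration step.
    """
    z, phi = z0, 0.0
    trajectory = []
    for _ in range(int(round(t_end / dt))):
        dphi = omega
        dz = -alpha * (z - offset) + sum(f(z, phi, dphi) for f in primitives)
        z += dz * dt
        phi = (phi + dphi * dt) % (2.0 * math.pi)
        trajectory.append(z)
    return trajectory

# With no primitives the joint simply relaxes to its offset.
rest = integrate_joint([], offset=10.0, alpha=5.0, omega=2.0 * math.pi)

# With one sine-like primitive the joint oscillates around the offset.
walk = integrate_joint([sine_primitive(20.0)], offset=10.0, alpha=5.0,
                       omega=2.0 * math.pi)
```

In this sketch the relaxation term pulls the joint towards its offset, while the primitives superimpose the rhythmic motion on top of it.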

This approach, despite being simplistic, allowed a small humanoid DARwIn-OP robot to perform stable locomotion (Matos and Santos, 2012). The simplicity of the approach is also its weakness. For instance, if more complex motor programs are desired, we do not know which motion primitives $f$ should be employed. This problem motivates us to use automatic optimization in the present work, where the motor program is given as a result of the GP evolution

$$\dot{z}_{i,j} = E_{i,j}\left(z_{i,j}, \phi_i, \dot{\phi}_i\right). \qquad (2)$$

The biped robot DARwIn-OP has 6 joints per leg,

hip roll, pitch and yaw, knee pitch and ankle pitch

and roll. Since the motor programs are valid for both

legs, and because we use the same motor programs of

the hip and knee joints in the ankle joints to maintain

the feet parallel to the ground at all times, we only

perform the search in four motor programs for the hip

roll, pitch and yaw, and for the knee pitch. This is

depicted in Fig. 1.

[Diagram omitted: a CPG drives four motor programs (hip yaw, hip roll, hip pitch and knee pitch); the ankle roll and ankle pitch joints are also shown.]

Figure 1: Schema for the control of the DARwIn-OP using only 4 motor programs (dark grey circles).

3 EVOLUTION CONFIGURATION

The evolution process aims to optimize the locomo-

tion of a given platform, in this case the DarwinOP

biped robot. Different platforms could have been used

instead, under adequate conﬁguration. Additionally,

it is also desirable to provide adaptation to the loco-

motion through feedback pathways. Therefore, two

speciﬁc goals are deﬁned: 1) to improve the locomo-

tion efﬁciency for the target platform comparatively

to an initial hand-tuned solution by exploring differ-

ent motion primitives for each joint; and 2) to explore

the search and optimization of feedback pathways to

achieve adaptability to changing features of the envi-

ronment.

Two different controllers will be proposed in or-

der to address each of these goals: controller 1 and

controller 2. Controller 1 intends to generate biped

locomotion to a target platform, and evolves in open-

loop without including any sensory information from

the environment. Controller 2 intends to generate

biped locomotion to the same target platform but in a

closed-loop fashion. Therefore, sensory information

was directly included into the movement generation

in order to achieve adaptability to the environment.

This last controller should therefore provide for bet-

ter results when adaptability is required.

3.1 Individuals

Locomotion is generated according to four mathemat-

ical expressions E, eq. 2. Each individual, a candi-

date solution for the locomotion problem, is therefore

composed by 4 chromosomes that correspond to the

GeneticProgrammingAppliedtoBipedLocomotionControlwithSensoryInformation

four E equations, that will drive the robot joints, as

depicted in Fig. 1.

Each chromosome is a mathematical expression

under the form of a tree. The nodes of the tree are

deﬁned by functions whose branches correspond to

the inputs, and that can be either other nodes or just

leaves. The leaves are variables, constants and sen-

sory inputs, that will input the lower level tree nodes.

The function set is specified as:

$\{\sin, \cos, *, +, -, /, \exp, \mathrm{sigmoid}, \mathrm{step}\}$. (3)

It is assumed that this set is sufﬁcient to cre-

ate rhythmic motions for the achievement of the

biped locomotion pattern. These functions were used

in (Matos and Santos, 2012) to implement the ba-

sic parameterized motions primitives, sine and bell-

shaped motions, to achieve biped locomotion.

Further, the functions sigmoid and step were included as relevant functions for the control process. The sigmoid function was defined as follows:

$$\mathrm{sigmoid}(x) = \frac{2}{1 + \exp\left(-\frac{x}{5}\right)} - 1 \qquad (4)$$

The step function was defined according to:

$$\mathrm{step}(x) = \begin{cases} 1 & \text{if } x > 0, \\ -1 & \text{if } x < 0, \\ 0 & \text{otherwise.} \end{cases} \qquad (5)$$

Both functions were thought of as interesting functions to enable the interaction between functions and between functions and inputs (e.g. variables or sensory inputs).
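To make the tree representation concrete, the sketch below evaluates a chromosome encoded as nested tuples over the function set of eq. 3. Several details are assumptions of this example and not taken from the paper: the tuple encoding, the protected division returning 1 on a zero denominator, the clamping of the exponential argument, and the reading of the sigmoid exponent as $-x/5$ (the extracted formula is ambiguous on this point).

```python
import math

def sigmoid(x, scale=5.0):
    # Assumed reading of eq. 4: 2 / (1 + exp(-x / scale)) - 1, output in (-1, 1).
    exponent = max(min(-x / scale, 700.0), -700.0)  # clamp to avoid overflow
    return 2.0 / (1.0 + math.exp(exponent)) - 1.0

def step(x):
    # Eq. 5: sign-like function with step(0) = 0.
    return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

def protected_div(a, b):
    # Assumption: a common GP convention, not stated in the paper.
    return a / b if b != 0 else 1.0

# name -> (arity, implementation)
FUNCTIONS = {
    "sin": (1, math.sin),
    "cos": (1, math.cos),
    "exp": (1, lambda x: math.exp(min(x, 700.0))),
    "sigmoid": (1, sigmoid),
    "step": (1, step),
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "/": (2, protected_div),
}

def evaluate(tree, env):
    """Evaluate a tree: tuples are function nodes, strings are variables
    or sensory inputs looked up in env, numbers are constants."""
    if isinstance(tree, tuple):
        name, *children = tree
        arity, fn = FUNCTIONS[name]
        if len(children) != arity:
            raise ValueError(f"{name} expects {arity} argument(s)")
        return fn(*(evaluate(child, env) for child in children))
    if isinstance(tree, str):
        return env[tree]
    return float(tree)

# Example chromosome: 30 * sin(phi)
chromosome = ("*", 30.0, ("sin", "phi"))
value = evaluate(chromosome, {"phi": math.pi / 2})
```

Leaves that are strings stand for the variables or sensory inputs of the terminal set, so the same evaluator can serve both controllers by changing only the contents of `env`.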

The terminal set is deﬁned speciﬁcally for each of

the two proposed controllers.

3.1.1 Controller 1

For this controller the terminal set is defined as follows:

$\{\phi, \dot{\phi}, z, -[\ldots], -[\ldots], -\pi, [\ldots], \pi, [-60, 60], 0\}$ (6)

where $\phi$, $\dot{\phi}$ and $z$ are the controller inputs as defined in section 2. The constants were chosen as angles and real values thought to be relevant for the purpose. Although the angles are within the defined interval of real numbers, $[-60, 60]$, specifying them explicitly in the terminal set increases their probability of being chosen. This is important for locomotion generation because they are well suited to defining phase relations between the different joints. The real interval, $[-60, 60]$, was chosen to provide possible values for the amplitudes of the different movements.

3.1.2 Controller 2

In order to be able to seek for adaptation to the en-

vironment such that locomotion is generated accord-

ingly, sensory feedback can be directly included in the

search space.

In this work, sensory feedback is provided by a three-axis accelerometer, $a_x$, $a_y$ and $a_z$; a three-axis gyroscope, $g_x$, $g_y$ and $g_z$; and the touch sensors of each leg, $t_{left}$ and $t_{right}$. Both the accelerometer and the gyroscope values are normalized and then fed to the controllers. The touch sensors indicate whether a given leg is in contact with the ground or not, yielding a boolean value indicating the state.

The terminal set for evolutions is thus defined as follows, extending the terminals of eq. 6 with the sensory inputs above:

$[\ldots] = S \cup [\ldots]$ (7)

3.2 Evaluation

The criterion used in the evaluation of the individuals is the forward distance traveled by the robot during a certain amount of time. Besides this, the fitness takes into account different penalizable results, as follows:

$$f = \frac{1}{N}\sum_{i=1}^{N}\left(\Delta z^{(i)} - \left|\Delta x^{(i)}\right| - c_{fall}\, v_{fall}^{(i)}\right) - c_{nan}\, v_{nan}, \qquad (8)$$

where $\Delta z$ is the forward displacement and $\Delta x$ is the lateral displacement. The lateral displacement is subtracted from the forward displacement in order to compensate for possible asymmetric sliding of the platform. $v_{fall}$ and $v_{nan}$ are flags that indicate whether the controller caused the robot to fall, or produced impossible joint positions, respectively. The constants $c_{fall}$ and $c_{nan}$ are the penalizing coefficients for the corresponding situations. The undesirable situations of the robot falling or attempting an impossible joint position, as well as the lateral displacement, $\Delta x$, penalize the corresponding individual's fitness.

The evaluation process of each individual is divided into $N$ stages, where $i$ indicates the stage. Each stage yields different displacements and possible falls, which have to be averaged to count for the final fitness.
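The fitness computation can be sketched as below. This is a hedged reading of eq. 8, assuming that the per-stage terms are averaged over the $N$ stages (as the text states) and that the penalty for impossible joint positions is applied once per individual; the function and argument names are hypothetical.

```python
def fitness(stages, c_fall, c_nan=10.0, produced_nan=False):
    """Fitness of one individual.

    stages: list of (forward_dz, lateral_dx, fell) tuples, one per stage,
            with displacements in meters and fell a boolean flag.
    c_fall: fall penalty coefficient (scenario dependent).
    c_nan:  penalty for impossible joint positions (10 in this work).
    """
    total = 0.0
    for forward_dz, lateral_dx, fell in stages:
        total += forward_dz - abs(lateral_dx) - (c_fall if fell else 0.0)
    average = total / len(stages)
    return average - (c_nan if produced_nan else 0.0)

# Best Controller 2 solution of Table 2: up (0.75, -0.08), down (0.77, 0.1).
f_best = fitness([(0.75, -0.08, False), (0.77, 0.1, False)], c_fall=0.2)
```

Under these assumptions the example reproduces the value f = 0.67 reported for that solution in Table 2.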

3.3 Architecture

The evolution process was implemented using the OpenBeagle framework (Gagné and Parizeau, 2006), as shown in Fig. 2. The initial population is generated with individuals that are evaluated using the Webots simulator, to guarantee that they are feasible solutions, that is, that they do not generate impossible joint positions.

ICINCO2014-11thInternationalConferenceonInformaticsinControl,AutomationandRobotics

The scenario may vary and yields the ﬁtness values

accordingly. The ﬁtness of each individual is used by

the genetic operators, selection, crossover and muta-

tion, in the process of generating a new population.

The new population is then evaluated and the loop

goes on until the termination criterion is satisﬁed. In

the present work, the termination criterion is deﬁned

by a maximum number of generations.
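The loop described above can be sketched as a generic generational algorithm with tournament selection. This is not the OpenBeagle API: the function names, the elitism step (keeping the best individual of each generation) and the toy problem in the usage example are assumptions made for illustration. The default rates and tournament size follow Table 1.

```python
import random

def tournament(population, fitnesses, size, rng):
    """Return the fittest of `size` individuals drawn at random."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: fitnesses[i])]

def evolve(population, fitness, crossover, mutate, generations=100,
           tournament_size=10, crossover_rate=0.8, mutation_rate=0.3, seed=0):
    """Evolve until the maximum number of generations is reached."""
    rng = random.Random(seed)
    population = list(population)
    for _ in range(generations):
        fitnesses = [fitness(ind) for ind in population]
        best_index = max(range(len(population)), key=lambda i: fitnesses[i])
        next_generation = [population[best_index]]  # elitism (assumption)
        while len(next_generation) < len(population):
            child = tournament(population, fitnesses, tournament_size, rng)
            if rng.random() < crossover_rate:
                mate = tournament(population, fitnesses, tournament_size, rng)
                child = crossover(child, mate, rng)
            if rng.random() < mutation_rate:
                child = mutate(child, rng)
            next_generation.append(child)
        population = next_generation
    return max(population, key=fitness)

# Toy usage: individuals are plain floats and the optimum is x = 3.
init_rng = random.Random(1)
initial = [init_rng.uniform(-10.0, 10.0) for _ in range(50)]
toy_fitness = lambda x: -(x - 3.0) ** 2
best = evolve(initial, toy_fitness,
              crossover=lambda a, b, rng: 0.5 * (a + b),
              mutate=lambda a, rng: a + rng.gauss(0.0, 0.5),
              generations=20)
```

In the paper's setting, each call to `fitness` would correspond to one simulated evaluation in Webots, which is what dominates the reported run times.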

[Diagram omitted: the initial population is evaluated, each individual being tested in Webots to obtain its fitness value. While the stop criterion is not reached, selection, crossover and mutation produce a new generation; once it is reached, the optimized solution is returned. Each individual is drawn as a set of chromosome trees.]

Figure 2: Evolution architecture. On the left, a new population is generated. On the right, the evaluation process in which each individual is tested in the Webots simulator.

The initial population was specified using 10 seed individuals, as variations of a hand-tuned solution from our previous work (Matos and Santos, 2012). These seed solutions were defined by small variations in the parameters of eq. 1, whether in the motion primitives or in the offsets of the joints ($O_{i,j}$).

The remaining individuals were generated ran-

domly using the ramped Half-and-Half method -

where half the individuals are randomly generated

trees until a variable depth is reached, and the other

half until a variable size is reached. All share the

same structure so that their information will be spread

through the population during the evolution process,

and coupled with different structures to generate nov-

elty and diversity.

Other parameters of the evolution process conﬁg-

uration are listed in table 1. The crossover and muta-

tion rates, as well as the selection sizes, were selected

by trial and error, so that the evolution process yields

optimized solutions.

The penalizing coefficient is set to $c_{nan} = 10$, so that the genetic information that generates impossible joint positions is quickly discarded by the evolutionary process.

The number of stages, $N$, as well as the penalizing coefficient, $c_{fall}$, change for the different developed controllers. Therefore, they will be specified in the text when required.

Table 1: GP evolution parameters.

Parameter          Value
Tournament Size    10
Crossover Rate     0.8
Mutation Rate      0.3
Population Size    500
# Generations      100
Max Depth          25

4 RESULTS

This section intends to show the obtained results for

the two developed controllers.

Firstly, we demonstrate the adequacy of both con-

trollers to produce different walking locomotion with

improved performance over the initial hand-tuned

one, according to the speciﬁed criteria. Secondly, we

explore the inclusion of sensory feedback pathways

as a means to provide for adaptability and as such

achieve locomotion with better performance when

climbing and descending slopes.

The evaluation of both controllers is performed in

three different scenarios. Flat Ground experiment is

the simplest scenario. Slope Ground experiment is a

scenario in which the robot has to climb or to descend

a sloped ground. Up-Slope and Down-slope Ground

experiment is a scenario in which the same solution

has to cope with up and down slopes.

During the ﬁrst 10 s of an individual evaluation,

the robot movements are linearly increased from an

initial posture up to the speciﬁed values. This pro-

vides for a smooth and stable slow start of the robot’s

locomotion. At the end of each individual evaluation,

the robot is set to its initial position and rotation, such

that initial conditions are equal for the evaluation of

all individuals of all populations.
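The slow start can be illustrated with a linear ramp that blends the initial posture into the commanded joint value over the first 10 s. This is only a sketch of one possible implementation; the blending formula and the function names are assumptions.

```python
def ramp_gain(t, ramp_time=10.0):
    """Gain rising linearly from 0 at t = 0 to 1 at t = ramp_time (seconds)."""
    if t <= 0.0:
        return 0.0
    return min(t / ramp_time, 1.0)

def ramped_command(t, initial, target, ramp_time=10.0):
    """Joint command blended between the initial posture and the target value."""
    gain = ramp_gain(t, ramp_time)
    return initial + gain * (target - initial)

halfway = ramped_command(5.0, initial=0.0, target=20.0)
settled = ramped_command(30.0, initial=0.0, target=20.0)
```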

Results were obtained on an Intel i7-2600K 3.4 GHz Linux PC (8 GB of RAM).

4.1 Flat Ground Scenario

In this scenario, the goal of the evolved solutions is to generate a controller that enables the robot to go as far as it can, while drifting laterally as little as possible, during 30 s.

The robot always moves on the same flat ground. Thus only one stage is considered ($N = 1$). The fall penalty factor is set as $c_{fall} = 1$, so that the evolution process is forced to select individuals that do not cause the robot to fall over the ones that do.

Fig. 3 depicts the forward and lateral displacements achieved for the 10 seed individuals (star marker) and the best 4 solutions out of 4 independent evolutions for controller 1 (square marker) and controller 2 (triangle marker). Each evolution took approximately 30 hours to simulate.

[Plot omitted: forward and lateral displacements (m); legend: Controller 1, Controller 2, Seeds, Controller 1 Best Solution, Controller 2 Best Solution.]

Figure 3: Displacements achieved by 10 seed individuals (magenta stars) and 4 solutions of controller 1 (blue squares) and controller 2 (red triangles), in flat ground. Initial position is (0,0).

The forward displacement, $\Delta z$, achieved by the evolved solutions of both controllers was far better than the one achieved by the seed individuals. The lateral displacement, $\Delta x$, varied among the solutions. The best solution was the one labeled Best controller 1 and highlighted in Fig. 3, with $\Delta x \approx 2$ cm and $\Delta z \approx 1.75$ m. Compared with the best hand-tuned one, with $\Delta x \approx 3.5$ cm and $\Delta z \approx 38$ cm, it improved by about 360% in z and 43% in x.
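The two percentages follow directly from the displacements quoted above, as the short check below shows. The forward figure is a relative gain over the hand-tuned distance, and the lateral figure is a relative reduction of the drift; the helper names are hypothetical.

```python
def relative_gain(baseline, value):
    """Percentage increase of value over baseline (larger is better)."""
    return 100.0 * (value - baseline) / baseline

def relative_reduction(baseline, value):
    """Percentage decrease of value below baseline (smaller is better)."""
    return 100.0 * (baseline - value) / baseline

forward_gain = relative_gain(0.38, 1.75)             # meters, hand-tuned vs evolved
lateral_reduction = relative_reduction(0.035, 0.02)  # meters, hand-tuned vs evolved
```

With the rounded displacements reported in the text, the forward gain evaluates to slightly more than 360% and the lateral reduction to roughly 43%.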

Overall, both controllers presented similar displacements. This suggests that the inclusion of feedback in the evolution process further increases the complexity, and that if there is no explicit need for adaptation, that complexity could be discarded.

Fig. 4 shows the evolution of the fitness function values of the best solution for controller 1 and controller 2. This figure shows the variation and the degree of learning of the best solution for each controller. It is possible to observe that the fitness increases slightly faster for controller 2 than for controller 1. This seems to indicate that the inclusion of feedback can initially speed up the search. However, the fitness of the best solutions for both controllers tends to nearly identical values at the end of the search.

[Plot omitted: fitness versus generations (0 to 100); legend: Controller 1 Best, Controller 2 Best.]

Figure 4: Fitness values for best solutions of controller 1 (solid blue line) and controller 2 (dashed red line) in flat ground.

Fig. 5 shows the Center of Mass (CoM) trajectory (red solid line) alongside the feet positions (blue polygons) during the evaluation procedure, for the solutions highlighted in Fig. 3 as Best controller 1 (top) and Best controller 2 (bottom). We can observe that the lateral displacement is larger for the controller 2 solution, which makes the robot turn left. However, the inclusion of sensory information resulted in a larger excursion of the CoM trajectory. It is now projected within the bounds of the robot feet, thus generating a more stable locomotion. Further, the robot weight moves towards the center of the foot, which facilitates vertical clearance of the unloaded leg during the swing phase of the step. The result is a locomotion more entrained with the robot dynamics and the environment conditions.

[Plots omitted: two panels showing forward and lateral displacement (m).]

Figure 5: Trajectories of CoM (solid red line) and feet positions (blue polygons) in flat ground for controller 1 (top) and controller 2 (bottom).

4.2 Adaptation to Slopes

This experiment is intended to verify the adequacy of the proposed controllers to adjust the generated trajectories to the current environment, sensed by the described sensors. This adaptation to environmental features was evaluated considering slopes that the robot is expected to climb and descend.

In order to obtain a path with smooth up and down slopes, these were generated using a sinusoidal function. The slopes for each evaluation task are shown in Fig. 6, a and b, for the climbing and descending tasks, respectively. Each task is composed of a single stage, $N = 1$. During the evolution, the robot starts 10 cm away from the slope, both when it will climb and when it will descend. Then, it advances and needs to adapt to the changing inclination level, whose maximum value is 9.8 degrees, midway through the slope. This corresponds to a maximum height of 5.5 cm and a slope extension of 50 cm.
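The quoted maximum inclination is consistent with a half-cosine height profile, as the following check shows. The exact profile used in the simulator is not given in the text, so the form $h(x) = \frac{H}{2}\left(1 - \cos(\pi x / L)\right)$ is an assumption of this sketch; its gradient peaks midway through the slope at $\pi H / (2L)$.

```python
import math

def max_inclination_deg(height, length):
    """Maximum inclination (degrees) of h(x) = height/2 * (1 - cos(pi*x/length)).

    The gradient dh/dx = (pi*height)/(2*length) * sin(pi*x/length) is largest
    at x = length/2, i.e. midway through the slope.
    """
    max_gradient = math.pi * height / (2.0 * length)
    return math.degrees(math.atan(max_gradient))

angle = max_inclination_deg(height=0.055, length=0.50)  # 5.5 cm over 50 cm
```

Under this assumed profile, a height of 5.5 cm over an extension of 50 cm gives a maximum inclination of about 9.8 degrees, matching the value stated above.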

From the starting point to the end of the slope (in both cases of climbing and descending), the robot needs to walk over 60 cm. Since in preliminary evolution tests the velocities achieved in such conditions were lower than the ones on flat ground, the evaluation time was increased to 40 s, instead of the previous 30 s. The initial solutions to this task were bound to fall over, since the seeds had no a priori knowledge of how to adapt the locomotion to the slope. Therefore, the fall penalty factor was set as $c_{fall} = 0.2$. This way, the controllers that caused the robot to fall were still penalized, but those that were able to reach the slope and only fell while on the slope could persist in the populations and generate solutions that adapt the locomotion to the slope.

[Plots omitted: terrain height versus forward distance (m) for the two stages; annotations mark the 10 cm approach, the 50 cm slope extension and the 5.5 cm height.]

Figure 6: Up-Slope (a) and Down-slope (b) Ground stages.

In these scenarios, both controllers were evolved twice, and each evolution took approximately 35 hours to simulate.

The results are presented in Fig. 7, top and bottom, for the up and down slopes, respectively. The achieved $\Delta z$ displacements are similar for controller 1 (square markers) and controller 2 (triangle markers) in the up slope. However, in the down slope, controller 1 seemed to achieve a slightly larger $\Delta z$ displacement. Also, climbing seems to be harder than descending, since $\Delta z$ is much smaller.

[Plots omitted: two panels showing forward and lateral displacement (m); legend: Controller 1, Controller 2.]

Figure 7: Forward and lateral displacements when the robot climbs (top) and descends (bottom) the slopes, for controller 1 (square markers) and controller 2 (triangle markers).

All solutions were tested in the converse scenario, i.e., if they were evolved to climb the slope, they were tested descending it, and vice versa. In all those cases, the robot fell. This suggests that the achieved adaptation enabled the robot to adapt only to the specific environment conditions in which it was evolved. This denotes a static adaptation for a specific task. However, a solution that is able to climb (descend) a specific slope is also able to climb (descend) smaller slopes. This denotes a certain adaptability to the environment conditions.

4.3 Up-Slope and Down-slope Ground

In this scenario, the robot has both to climb and to descend smooth slopes. Thus, two stages, $N = 2$, are considered. First the robot attempts to perform the climbing stage; then it is repositioned and reset in order to attempt the descending stage. By forcing the robot to face both of these stages, the controller needs to adapt the locomotion to the slope, rather than statically adapt the robot's posture during locomotion. The goal is to verify whether sensory inclusion provides the required ability to adapt the generated locomotion to the environment, as well as to compare the results of both approaches. Further, we are interested in verifying the impact of sensory inclusion on the robot behavior.

In this scenario, the stop criterion for each stage is an elapsed time of 40 s and/or a maximum forward displacement of $\Delta z_{max} = 0.8$ m. Thus, both stages together take at most 80 s. This way, over-specialization in one of the stages is prevented. The fall penalty factor is set to $c_{fall} = 0.2$.

Three evolutions were performed for each of the proposed controllers. Each took approximately 65 hours to simulate.

The obtained results are listed in table 2. We can

observe that the solutions found without the inclusion

of sensory inputs (Controller 1) were not able to ful-

ﬁll the desired task. The robot fell during the climb-

ing stage. On the other hand, Controller 2 solutions

were all able to both climb and descend the slopes.

These results show that the inclusion of feedback en-

abled the robot to perform the task in both stages, thus

adapting the locomotion as required.

Figs. 8 and 9 show the CoM trajectory (red solid

line) alongside with the feet position (blue polygons)

during the climbing (top) and the descending (bottom)

stages for Controller 1 and Controller 2, respectively,

for the highlighted solutions in table 2. These are

the solutions with higher ﬁtness for both controllers.

This simple functional gait analysis shows that be-

sides ﬁnding solutions that do not fall, Controller 2

was able to ﬁnd solutions in which the CoM move-

ment smoothly oscillates between the center of both

feet with less abrupt oscillations. Again, this shows

up that the proposed framework managed to enhance

GeneticProgrammingAppliedtoBipedLocomotionControlwithSensoryInformation

Table 2: Relevant information for the up-slope and down-slope ground stages for both controllers. Distances are expressed in meters.

| Controller | Up | Down | f | Disabled feedback | Full slope |
|---|---|---|---|---|---|
| Controller 1 | ∆z: fall, ∆x: fall | ∆z: 0.8, ∆x: 0.03 | 0.29 | - | no |
| Controller 1 | ∆z: fall, ∆x: fall | ∆z: 0.8, ∆x: 0.11 | 0.27 | - | no |
| Controller 1 | ∆z: fall, ∆x: fall | ∆z: 0.61, ∆x: 0.24 | 0.1 | - | no |
| Controller 2 | ∆z: 0.75, ∆x: -0.08 | ∆z: 0.77, ∆x: 0.1 | 0.67 | no | yes |
| Controller 2 | ∆z: 0.6, ∆x: 0.05 | ∆z: 0.65, ∆x: 0.07 | 0.56 | no | yes |
| Controller 2 | ∆z: 0.61, ∆x: 0.09 | ∆z: 0.64, ∆x: 0.07 | 0.54 | no | no |

Figure 8: Trajectories of the CoM (solid red line) and feet positions (blue polygons) during the climbing (top) and descending (bottom) stages for the Controller 1 solution with the highest fitness.

Figure 9: Similar to Fig. 8 but for Controller 2.

Figure 10: Trajectory of the Controller 2 solution with the highest fitness in the full slope scenario.

the locomotion skills of the biped robot.

A relevant question emerges: what happens if the sensory feedback pathways are set to zero in the obtained Controller 2 solutions when performing the up-down slope? This would show whether feedback inclusion in the controller is needed when driving the robot through the defined task.

The obtained results are presented in Table 2 in the column labeled Disabled feedback. A yes indicates that the solution was still able to walk over the scenario. In all cases, feedback was mandatory for the Controller 2 solutions to perform both tasks.
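The disabled-feedback test described above can be sketched as a thin wrapper that feeds a null value to the controller in place of every sensory input. The controller below is a hypothetical stand-in (a callable taking the time and a list of sensor values); it is not the evolved GP expression obtained in this work.

```python
import math
from typing import Callable, Sequence

Controller = Callable[[float, Sequence[float]], float]


def toy_controller(t: float, sensors: Sequence[float]) -> float:
    """Hypothetical closed-loop primitive: a rhythmic term plus a sensor-driven offset."""
    return 0.3 * math.sin(2.0 * math.pi * t) + 0.5 * sensors[0]


def disable_feedback(controller: Controller) -> Controller:
    """Return a controller that sees zeros in place of every sensory input."""
    def open_loop(t: float, sensors: Sequence[float]) -> float:
        return controller(t, [0.0] * len(sensors))
    return open_loop


ablated = disable_feedback(toy_controller)
closed_loop_out = toy_controller(0.25, [0.2])  # rhythmic term plus sensor offset
open_loop_out = ablated(0.25, [0.2])           # rhythmic term only
```

Wrapping the controller, rather than editing it, keeps the evaluated solution unchanged, so any difference in behavior can be attributed to the missing feedback alone.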

The inclination level used in this work was out of reach of the Controller 1 solutions, but it is achievable through the use of feedback pathways. All solutions found were also able to perform locomotion on slopes with lower inclination values.

In order to verify the generality of the solutions, a different scenario was used. During 80 s, the robot is faced with a complete sine curve, that is, a down slope immediately followed by an up slope, in a continuous fashion. This scenario is intended to verify whether the generated solutions are able to cope with the overall path. It requires the controller to continuously adapt the locomotion as it progresses through the slope.
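To illustrate why this scenario demands continuous adaptation, the following sketch models a ground profile that descends and then climbs along one full sinusoidal period. The terrain length and depth are hypothetical values chosen for illustration, and the exact profile used in the simulation is not restated here.

```python
import math

LENGTH_M = 2.0  # hypothetical horizontal length of the terrain
DEPTH_M = 0.1   # hypothetical depth of the valley


def height(z: float) -> float:
    """Ground height over one sinusoidal period: descending first, then climbing."""
    return 0.5 * DEPTH_M * (math.cos(2.0 * math.pi * z / LENGTH_M) - 1.0)


def inclination_deg(z: float) -> float:
    """Local ground inclination in degrees (negative while descending)."""
    dh_dz = -math.pi * DEPTH_M / LENGTH_M * math.sin(2.0 * math.pi * z / LENGTH_M)
    return math.degrees(math.atan(dh_dz))


# The inclination changes at every point along the path.
profile = [(z, height(z), inclination_deg(z)) for z in (0.0, 0.5, 1.0, 1.5, 2.0)]
```

Unlike the two-stage scenario, where each stage has a fixed inclination, the local inclination here varies with the position, so a posture tuned for a single slope value is not sufficient.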

The achieved results are listed in the last column of Table 2, labeled Full slope, for each of the Controller 2 solutions. A yes indicates that the solution was able to walk over the scenario. Note that only one solution was not able to walk over the overall path.

Fig. 10 shows the achieved trajectory for the Controller 2 solution with the highest fitness (highlighted in Table 2). Despite a slight final lateral displacement, it successfully achieved a forward displacement of $\Delta z \approx 1.7$ m.

ICINCO2014-11thInternationalConferenceonInformaticsinControl,AutomationandRobotics

5 DISCUSSION

On flat ground, the use of feedback pathways showed no specific advantage, since the solutions were very similar. This was expected, since there was no explicit need for feedback and the inclusion of sensory feedback in the search space further increased the complexity of the optimization problem. However, considering other metrics for the performance of the generated movement, such as the movement of the CoM, the robot motion seems to be more entrained with the environment and the robot model when in closed loop. This needs more attention in future work.

Evolution results in the slope scenarios demonstrated the ability to generate solutions capable of climbing and descending slopes, both with and without the inclusion of feedback inputs. However, the achieved solutions could only perform the tasks for which they were evolved. Otherwise, the robot fell. For instance, the solutions that were evolved to climb the slope would fall if they tried to descend that same slope. These results suggest that evolution adapted the robot's posture to the slope, rather than adapting the locomotion generation as the robot goes through the slope. In order to prevent this, a different scenario composed of up and down stages was evaluated, in which each solution was evolved in both stages. The results showed that only the solutions with feedback inclusion achieved adaptation to the environment in both stages. When no feedback was considered, the robot fell during the climbing stage. Therefore, the use of feedback was required in order to enable adaptation to the ground's slope.

More important was to verify the impact that the inclusion of sensory information has on the robot's performance and how it enhances the locomotion skills of the biped robot. This was assessed through a simple functional gait analysis considering CoM trajectories. The resulting CoM trajectory was enlarged and oscillated smoothly between the centers of both feet. Thus, the generated locomotion was more stable and more entrained with the robot dynamics and the surrounding environment.

6 CONCLUSIONS

In this paper, we have proposed a gait optimization system for a biped robot using GP. Further, we explored the inclusion of feedback pathways to achieve locomotion adaptation to the environment. We based this work on (Matos and Santos, 2012), in which a CPG-based solution using a combination of a small set of motion primitives was able to generate biped walking for the DARwIn-OP humanoid robot.

Two controllers were developed. Controller 1 generates biped locomotion for a target platform and evolves in open loop, without including any sensory information from the environment. Controller 2 generates biped locomotion for the same target platform but in a closed-loop fashion: sensory information was directly included into the movement generation by the GP evolution process, in order to achieve adaptability to the environment.

The obtained results have shown the adequacy of both controllers to produce different walking gaits with improved performance over the initial hand-tuned one, according to the specified criteria. In fact, the achieved displacement was up to four times larger. Further, the inclusion of sensory feedback pathways provided the adaptability required to achieve locomotion with better performance when climbing and descending slopes. These solutions were tested with the feedback disabled (replaced by a null value) and in scenarios not used in the evolution stages, demonstrating the generalization of the solutions. The solutions were able to continuously adapt to the changing inclination of a complete sine-curve-like slope, and were also able to walk over slopes of lower inclination levels.

Overall, the obtained results emphasize that the inclusion of feedback enabled a smoother locomotion, with fewer undesired oscillations. The obtained locomotion was more entrained with the robot dynamics and the surrounding environment. Future work includes extending this functional gait analysis to quantify how the proposed framework managed to enhance the locomotion skills of the biped robot.

Additionally, experiments were performed on a physical platform. However, a great difference in the physical setup caused the robot to fall. This problem is often referred to as the reality gap. As future work, this problem will be addressed similarly to the approach proposed in (Koos et al., 2010), so as to provide an efficient locomotion controller for a real DARwIn-OP robot.

ACKNOWLEDGEMENTS

This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the Project Scope PEst OE/EEI/UI0319/2014.

GeneticProgrammingAppliedtoBipedLocomotionControlwithSensoryInformation

REFERENCES

Andersson, B., Svensson, P., Nordahl, M., and Nordin, P. (2000). On-line evolution of control for a four-legged robot using genetic programming. Real-World Applications of Evolutionary Computing, pages 322–329.

Gagné, C. and Parizeau, M. (2006). Open beagle: a c++ framework for your favorite evolutionary algorithm. SIGEVOlution, 1:12–15.

Gritz, L. and Hahn, J. (1997). Genetic programming evolution of controllers for 3-d character animation. Genetic Programming, 97.

Ijspeert, A. J. (2008). 2008 special issue: Central pattern generators for locomotion control in animals and robots: A review. Neural Networks, 21(4):642–653.

Ijspeert, A. J., Nakanishi, J., and Schaal, S. (2002). Learning Rhythmic Movements by Demonstration Using Nonlinear Oscillators. In Proceedings of the IEEE/RSJ Int. Conference on Intelligent Robots and Systems (IROS2002), volume 2002, pages 958–963.

Kim, J.-J., Lee, J.-W., and Lee, J.-J. (2009). Central pattern generator parameter search for a biped walking robot using nonparametric estimation based particle swarm optimization. International Journal of Control, Automation and Systems, 7(3):447–457.

Kim, Y., Tagawa, Y., Obinata, G., and Hase, K. (2011). Robust control of cpg-based 3d neuromusculoskeletal walking model. Biological cybernetics, pages 1–14.

Koos, S., Mouret, J.-B., and Doncieux, S. (2010). Crossing the reality gap in evolutionary robotics by promoting transferable controllers. In Proceedings of the 12th annual conference on Genetic and evolutionary computation, GECCO ’10, pages 119–126, New York, NY, USA. ACM.

Matos, V. and Santos, C. P. (2012). Central pattern generators with phase regulation for the control of humanoid locomotion. Business Innovation Center Osaka, Japan.

McSharry, P., Clifford, G., Tarassenko, L., and Smith, L. (2003). A dynamical model for generating synthetic electrocardiogram signals. Biomedical Engineering, IEEE Transactions on, 50(3):289–294.

Nakanishi, J., Morimoto, J., Endo, G., Cheng, G., Schaal, S., and Kawato, M. (2004). Learning from demonstration and adaptation of biped locomotion. Rob. and Aut. Systems, 47(2-3):79–91.

Ok, S. and Kim, D. (2005). Evolution of the cpg with sensory feedback for bipedal locomotion. Advances in Natural Computation, pages 428–428.

Ok, S., Miyashita, K., and Hase, K. (2001). Evolving bipedal locomotion with genetic programming - a preliminary report. In Evolutionary Computation, 2001. Proceedings of the 2001 Congress on, volume 2, pages 1025–1032. IEEE.

Sugimoto, N. and Morimoto, J. (2011). Phase-dependent trajectory optimization for cpg-based biped walking using path integral reinforcement learning. In Humanoid Robots (Humanoids), 2011 11th IEEE-RAS International Conference on, pages 255–260. IEEE.

Tanev, I., Ray, T., and Buller, A. (2005). Automated evolutionary design, robustness, and adaptation of sidewinding locomotion of a simulated snake-like robot. Robotics, IEEE Transactions on, 21(4):632–645.

Vukobratović, M. and Borovac, B. (2004). Zero-moment point - thirty five years of its life. International Journal of Humanoid Robotics, 1(01):157–173.

Wolff, K. and Nordin, P. (2003). Learning biped locomotion from first principles on a simulated humanoid robot using linear genetic programming. In Genetic and Evolutionary Computation - GECCO 2003, pages 199–199. Springer.

Wolff, K. and Wahde, M. (2007). Evolution of biped locomotion using linear genetic programming. Climbing and Walking Robots, Towards New Applications.

ICINCO2014-11thInternationalConferenceonInformaticsinControl,AutomationandRobotics