STRATEGY BASED ON MACHINE LEARNING FOR THE
CONTROL OF A RIGID FORMATION IN A MULTI-ROBOTS
FRAME
Ting Wang, Christophe Sabourin and Kurosh Madani
Signals, Images, and Intelligent Systems Laboratory (LISSI / EA 3956)
Paris Est University, Senart Institute of Technology, Avenue Pierre Point, 77127 Lieusaint, France
Keywords:
Multi-robot systems, Formation control, Learning and adaptive systems, Intelligent logistic applications.
Abstract:
Many applications, such as warehouse management, industrial assembly, military operations and daily tasks, can benefit from multi-robot systems. In this paper, we describe a new approach for the control of a formation of robots. In the proposed solution, we consider the formation as a single robot, and our work focuses on how to control the formation. We suppose that there are virtual rigid links between all robots and that all robots perform the same task in a synchronous manner.
1 INTRODUCTION
Today, and in the future, many applications, such as warehouse management, industrial assembly, military operations and daily tasks, could benefit from multi-robot systems (Parker, 2008), (Cao et al., 1997). However, the design of a control strategy for multi-robot systems requires cooperation and coordination between all robots. In this context, one of the goals of our research is to design control strategies for multi-robot systems, mainly for industrial applications. For example, multi-robot systems are used in logistic applications (Wurman et al., 2008), where a large number of small robots are used to transport objects. This approach seems very interesting but has some limitations: each robot has an individual behavior, and all robots are controlled by a supervisor. The goal of this paper is to present our first investigation in the domain of multi-robot systems for logistic applications, and more especially for collaboration between several robots carrying a load.
The proposed work is very close to studies on formation control of robots. However, previous publications on formation control have generally focused on controlling all robots in order to maintain the formation (for example (Mastellone et al., 2008), (Barfoot and Clark, 2004)). In this work, we focus on how to control the formation itself, and we consider the formation as a single robot. Furthermore, we suppose that there are virtual rigid links between all robots and that all robots can perform the same task in a synchronous manner. In addition, in order to use our approach in real time, we propose a solution based on image processing and machine learning. As a result, we show that it is possible to move a rigid formation of robots in a constrained environment.
The remainder of this paper is organized as follows. In section 2, we describe the proposed approach and, in particular, the solution used to control the formation. The learning process used to compute the path planning is detailed in section 3. Simulation results are presented in section 4. Section 5 gives conclusions and presents further work.
2 CONTROL STRATEGY FOR
THE ROBOTS’ FORMATION
The use of a multi-robot system to transport bulky objects is an elegant and very flexible solution for this kind of problem. Indeed, such a task otherwise requires the design of specific vehicles according to constraints imposed by the object to be carried.
In this paper, and without any loss of generality, we consider only a formation with three robots (see Fig.1). Furthermore, it must be pointed out that in this work we focus only on the high-level control, and we assume that there is a low-level control which is able to maintain the rigidity of the formation of robots.
Figure 1: Schematic description of a rigid formation with three wheeled robots. The relative distance between two robots is a constant value and all robots have the same orientation (ψ = 0° in figure (a), ψ = 45° in figure (b)).
This formation is composed of wheeled robots moving in a plane. Each robot may be a nonholonomic robot (e.g. a unicycle-type mobile robot) or of another kind (e.g. omni-directional).
The modeling used to describe the formation (represented in Fig.1) is based on the following concepts:
- There is a reference robot in the formation (e.g. robot 1 in Fig.1). Both the position and the orientation of this robot correspond to the position and orientation of the formation in an absolute frame. It must be pointed out that the reference robot is not necessarily a leader robot.
- The position of each other robot i is defined with respect to the position of robot i-1. Two parameters are used to specify this position: the relative distance between the two robots (e.g. l12 and l23 in Fig.1) and the absolute angle between the orientation of the robot and a normal direction (e.g. θ12 and θ23 in Fig.1); a geometric sketch of this parameterization is given after this list.
- The formation is rigid, which means that the relative distance between two robots is a constant value and all robots have the same orientation. It must be noticed that the orientation of the formation (ψ) is independent of the angle θ used to describe the relative position between two robots (ψ = 0° in Fig.1(a), ψ = 45° in Fig.1(b)).
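To make this parameterization concrete, the following minimal sketch (in Python) computes the position of robot i from the position of robot i-1 and the pair (l, θ) introduced above. The angle convention (θ measured from the x-axis of the absolute frame) and the function name are illustrative assumptions, not taken from the paper.

```python
import math

def follower_position(x_prev, y_prev, l, theta):
    """Position of robot i given the position of robot i-1, the rigid-link
    length l and the absolute angle theta (radians) of that link."""
    # Assumed convention: theta is measured from the x-axis of the absolute
    # frame; the paper's normal-direction convention may differ.
    return (x_prev + l * math.cos(theta), y_prev + l * math.sin(theta))

# Example: a line formation as in Fig.1(a), with equal link lengths and angles
x1, y1 = 0.0, 0.0                                      # reference robot 1
x2, y2 = follower_position(x1, y1, 0.3, math.pi / 2)   # robot 2
x3, y3 = follower_position(x2, y2, 0.3, math.pi / 2)   # robot 3
```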
The state of the formation is given by the position of the reference robot (e.g. robot number 1) and the orientation of the formation. We define 8 possible orientations (see Fig.2), depending directly on the angle θ: θ = 0° for formation 1, θ = 45° for formation 2, θ = 90° for formation 3, and so on. Consequently, this orientation is independent of the angle ψ. For each orientation, we define eight possible translation actions (from 1 to 8) (see Figure 3).
Figure 2: Schematic description of the 8 formations.
Figure 3: Actions used to control the formation.
These actions correspond to the directions (angle ψ) in which the formation of robots can be moved. Two other actions perform a rotation of the formation (actions 9 and 10 are the clockwise and anti-clockwise rotations, respectively). Because we consider nonholonomic robots, when the rigid formation needs to move in a desired direction (for example formation 7 and action 3), each robot in the formation simultaneously rotates towards the desired direction using an orientation control, and then moves forward in that direction.
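As an illustration of how these ten actions could act on the discrete state of the formation (grid cell of the reference robot plus formation index), here is a minimal sketch; the mapping between action indices, headings and rotation directions is an assumption made for illustration, since the paper does not spell it out.

```python
import math

def next_state(row, col, formation, action):
    """Next (row, col, formation) of the rigid formation after one action.
    Assumed convention: actions 1..8 translate the formation one cell in the
    direction psi = (action - 1) * 45 degrees; actions 9 and 10 rotate the
    formation clockwise / anti-clockwise by one 45-degree step."""
    if action in (9, 10):                          # pure rotation of the formation
        step = -1 if action == 9 else 1
        return row, col, (formation - 1 + step) % 8 + 1
    psi = math.radians((action - 1) * 45)          # translation direction
    d_col = round(math.cos(psi))
    d_row = -round(math.sin(psi))                  # grid rows grow downwards
    return row + d_row, col + d_col, formation
```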
3 PATH PLANNING FOR A RIGID
FORMATION OF ROBOTS
Based on the previous description of our control strategy, the problem is now how to compute the path planning (i.e. to find actions like "forwards", "left", and so on) to move the formation from an initial point to a goal point in a constrained environment (obstacle avoidance, narrow paths, etc.). In order to design an on-line approach for real-time applications, we have based our concept on two levels. The first one allows us to get the equivalent quantified environment, and the second one carries out a learning process in order to find the set of best actions.
3.1 Image Processing
In order to automatically obtain numerical information representing the environment, we have developed an approach based on image processing. This procedure uses the following steps:
Figure 4: Picture of the virtual environment given by a virtual camera located at the top of the environment.
- Take a photo of the environment. It should be noticed that, as a first simplification, we consider that a virtual camera is located at the top of the environment (see for example Fig.4).
- Convert the RGB image to a grayscale image, and convert the grayscale image into a binary image with a suitable threshold value.
- The last step describes the environment by a binary matrix, in which 1 represents an obstacle and 0 a free cell.
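A minimal sketch of this conversion is given below, assuming the overhead picture is available as a NumPy array and that each grid cell corresponds to an equal block of pixels; the grid size, threshold value and the assumption that obstacles appear dark are illustrative.

```python
import numpy as np

def image_to_grid(rgb, cells=10, threshold=128):
    """Convert an overhead RGB image (H x W x 3 array) into a cells x cells
    binary matrix: 1 = obstacle, 0 = free cell."""
    gray = rgb.mean(axis=2)                           # RGB -> grayscale
    binary = (gray < threshold).astype(np.uint8)      # dark pixels assumed to be obstacles
    h, w = binary.shape
    grid = np.zeros((cells, cells), dtype=np.uint8)
    for i in range(cells):
        for j in range(cells):
            block = binary[i * h // cells:(i + 1) * h // cells,
                           j * w // cells:(j + 1) * w // cells]
            grid[i, j] = 1 if block.any() else 0      # cell blocked if it contains any obstacle pixel
    return grid
```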
After the image processing, it is possible to get the equivalent quantified environment (see Fig.5). In this example, the environment is divided into 100 states, where each state is a square with 30 cm sides. In the quantified environment, the obstacles are marked with red stars. The formation is composed of a reference robot (robot 1, represented as a solid black dot in Fig.5) and the other robots (robots 2 and 3, represented as black circles). Initially, the robots stand at position A, which is in state [6;10]. The final position B is in state [4;10].
Figure 5: Equivalent quantified environment representation: the obstacles are marked with red stars, the reference robot is represented as a solid black dot, and the other robots are represented as black circles.
3.2 Q-Learning
Now, our aim is to find a solution allowing us to move the three robots from point A to point B while conserving a virtual rigid formation (a line in this case). As noted in section 3.1, the environment may be described by a matrix E(10 × 10). This matrix contains information about the position of the obstacles, but it is not sufficient to describe the full state S. Indeed, one state is composed of two pieces of information: the position of the reference robot and the kind of formation. Consequently, the size of the set S is equal to 10 × 10 × 8 = 800 states.
Based on this description of the state S, and given that we consider the formation as a single robot, it is possible to look for the succession of actions that moves the formation from point A to another point B. To solve this problem, one solution consists in using reinforcement learning. The goal of a reinforcement learning algorithm is to find the action which maximizes a reinforcement signal. The reinforcement signal provides an indication of the interest of the last chosen actions. Q-Learning, proposed by Watkins (Sutton and Barto, 1998), is a convenient way to apply such a reinforcement learning strategy.
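For illustration, the sketch below shows a minimal tabular Q-Learning update over the state set S = (row, col, formation) with the ten actions of section 2. The reward values, learning rate, discount factor, exploration rate and the env_step simulator interface are all illustrative assumptions, not the authors' actual settings.

```python
import random
import numpy as np

N, FORMATIONS, ACTIONS = 10, 8, 10                 # 10x10 grid, 8 formations, 10 actions
Q = np.zeros((N, N, FORMATIONS, ACTIONS))
alpha, gamma, epsilon = 0.1, 0.9, 0.1              # illustrative hyper-parameters

def learn_episode(env_step, start, goal, max_steps=200):
    """Run one Q-Learning episode. env_step(state, action) -> (next_state,
    collision) is assumed to be provided by the simulator; states are
    (row, col, formation) tuples and actions 0..9 stand for actions 1..10."""
    state = start
    for _ in range(max_steps):
        r, c, f = state
        if random.random() < epsilon:              # epsilon-greedy exploration
            action = random.randrange(ACTIONS)
        else:
            action = int(np.argmax(Q[r, c, f - 1]))
        next_state, collision = env_step(state, action)
        # Illustrative reinforcement signal: reach the goal, hit an obstacle,
        # or pay a small cost for each step.
        if next_state[:2] == goal:
            reward, done = 1.0, True
        elif collision:
            reward, done = -1.0, True
        else:
            reward, done = -0.01, False
        nr, nc, nf = next_state
        target = reward if done else reward + gamma * Q[nr, nc, nf - 1].max()
        Q[r, c, f - 1, action] += alpha * (target - Q[r, c, f - 1, action])
        if done:
            break
        state = next_state
```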
4 SIMULATION
In this section, we present simulation results for the example described in section 3. The simulated environment is composed of three KheperaIII robots (www.k-team.com/) and a top camera. Simulations have been performed using the Webots software (www.cyberbotics.com/) and the controllers have been designed with Matlab (www.mathworks.com/).
As described in the previous section, the world is a square with 3 m sides, divided into 100 small squares, where each square (0.3 x 0.3 m) represents a possible position of one robot. Concerning the KheperaIII robots, we suppose that they are always located at the center of a square, and that the reference robot 1 always moves from the center of one square to the center of another. The other two robots rotate around the reference robot so as to change the formation, with radii of 0.3 m and 0.6 m respectively, taking the reference robot as the reference frame. Throughout the whole process, the robots run the same operation synchronously. Once the path planning is computed, it is possible to compute the reference trajectories for the robots of the formation.
Figure 6: Snapshots of the simulation results. The virtual environment is composed of three KheperaIII robots which have to move while avoiding obstacles and maintaining a rigid formation.
It should be noticed that, as we use nonholonomic robots, the trajectories are decomposed into rotations and linear motions.
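To give an idea of how such a reference trajectory could be generated, the sketch below decomposes one move of a nonholonomic robot into a pure rotation followed by a straight-line motion; the interface and the in-place rotation assumption are illustrative, not the authors' controller.

```python
import math

def turn_then_drive(pose, target):
    """Decompose a move from pose = (x, y, heading) to target = (x, y) into
    an in-place rotation followed by a forward motion, returned as a list of
    (rotation_angle, forward_distance) segments."""
    x, y, heading = pose
    tx, ty = target
    desired = math.atan2(ty - y, tx - x)
    # Rotation needed to face the target, wrapped to [-pi, pi)
    rotation = (desired - heading + math.pi) % (2 * math.pi) - math.pi
    distance = math.hypot(tx - x, ty - y)
    return [(rotation, 0.0), (0.0, distance)]

# Each of the three robots executes its own segments synchronously, so that
# the rigid formation is preserved during the whole motion.
```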
Fig.6 shows snapshots of the simulation results. As depicted in the pictures, the three robots move from the initial point to the final point while keeping the rigid formation. The possible actions (from 1 to 10) are chosen according to the environment constraints.
Fig.6(a) shows the initial position of the robots and their formation (formation 7). After a series of translation actions (action 6), the robots arrive at the position shown in Fig.6(b). Then, the robots must change their formation in order to avoid obstacles; Fig.6(c) and Fig.6(d) show the rotation actions. Next, the robots in formation 5 (vertical) move towards the upper left by three translation actions; the beginning and final positions are shown in Fig.6(e) and Fig.6(f). After that, both translation and rotation actions are performed, and two snapshots of the motion at this stage are given in Fig.6(g) and Fig.6(h). At the end of the path, the robots rotate back to a horizontal orientation (formation 3), which is opposite to the initial formation 7 in that robot 1 is now on the right of the formation. The robots then translate again towards the upper left until they reach the top left of the environment. The final position, with formation 3, is shown in Fig.6(i).
5 CONCLUSIONS
In this paper, we have described a new approach for the control of a rigid formation of robots in the frame of logistic applications. In the proposed solution, we have considered the formation as a single robot, and our work has focused on how to control the formation. To use our approach in real time, we have proposed a solution based on image processing and machine learning. As a result, we have shown that it is possible to move a rigid formation of robots between two points in a constrained environment, and the simulation results demonstrate the efficiency of the proposed method.
Future work will focus on the improvement of the proposed method, namely on the collaboration between all agents, and we will investigate realistic problems in the domain of industrial applications.
ACKNOWLEDGEMENTS
This work was supported by a doctoral fellowship from the China Scholarship Council (CSC). The authors wish to express their gratitude to the CSC.
REFERENCES
Barfoot, T. D. and Clark, C. M. (2004). Motion planning for formations of mobile robots. Robotics and Autonomous Systems, 46(2):65-78.
Cao, Y. U., Fukunaga, A. S., and Kahng, A. (1997). Cooperative mobile robotics: Antecedents and directions. Autonomous Robots, 4:7-27.
Mastellone, S., Stipanovic, D. M., Graunke, C. R., Intlekofer, K. A., and Spong, M. W. (2008). Formation control and collision avoidance for multi-agent non-holonomic systems: Theory and experiments. I. J. Robotic Res., 27(1):107-126.
Parker, L. E. (2008). Multiple mobile robot systems. In Siciliano, B. and Khatib, O., editors, Springer Handbook of Robotics, pages 921-941. Springer Berlin Heidelberg.
Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. The MIT Press.
Wurman, P. R., D'Andrea, R., and Mountz, M. (2008). Coordinating hundreds of cooperative, autonomous vehicles in warehouses. AI Magazine, 29(1):9-20.