STRATEGY BASED ON MACHINE LEARNING FOR THE
CONTROL OF A RIGID FORMATION IN A MULTI-ROBOTS
FRAME
Ting Wang, Christophe Sabourin and Kurosh Madani
Signals, Images, and Intelligent Systems Laboratory (LISSI / EA 3956)
Paris Est University, Senart Institute of Technology, Avenue Pierre Point, 77127 Lieusaint, France
Keywords:
Multi-robot systems, Formation control, Learning and adaptive systems, Intelligent logistic applications.
Abstract:
Many applications, such as warehouse management, industrial assembly, military operations and daily tasks, can benefit from multi-robot systems. In this paper, we describe a new approach for the control of a formation of robots. In the proposed solution, we consider the formation as a single robot, and our work focuses on how to control the formation. We suppose that there are virtual rigid links between all robots and that all robots perform the same task in a synchronous manner.
1 INTRODUCTION
Today, and in the future, many applications, such as warehouse management, industrial assembly, military operations and daily tasks, could benefit from multi-robot systems (Parker, 2008), (Cao et al., 1997). However, the design of a control strategy for multi-robot systems requires cooperation and coordination between all robots. In this context, one of the goals of our research is to design control strategies for multi-robot systems, mainly for industrial applications. For example, multi-robot systems are used in logistic applications (Wurman et al., 2008), where a large number of small robots are used to transport objects. This approach seems very interesting but has some limitations: each robot has an individual behavior, and all robots are controlled by a supervisor. The goal of this paper is to present our first investigation in the domain of multi-robot systems for logistic applications, and more especially for collaboration between several robots carrying a load.
The proposed work is very close to studies on formation control of robots. However, previous publications on formation control have generally focused on controlling all robots in order to maintain the formation (for example (Mastellone et al., 2008), (Barfoot and Clark, 2004)). In this work, we focus on how to control the formation itself, and we consider the formation as a single robot. Furthermore, we suppose that there are virtual rigid links between all robots and that all robots can perform the same task in a synchronous manner. In addition, in order to use our approach in real time, we propose a solution based on image processing and machine learning. As a result, we show that it is possible to move a rigid formation of robots in a constrained environment.
The remainder of this paper is organized as follows. In section 2, we describe the proposed approach and, in particular, the solution used to control the formation. The learning process used to compute the path planning is detailed in section 3. Simulation results are presented in section 4. Section 5 gives conclusions and presents further work.
2 CONTROL STRATEGY FOR
THE ROBOTS’ FORMATION
The use of a multi-robot system to transport bulky objects is an elegant and very flexible solution for this kind of problem. Indeed, such a task otherwise requires the design of specific vehicles according to constraints imposed by the object to be carried.
In this paper, and without any loss of generality, we consider only a formation with three robots (see Fig.1). Furthermore, it must be pointed out that in this work we focus only on the high-level control, and we assume that there is a low-level control which is able to maintain the rigidity of the formation of robots.
Figure 1: Schematic description of a rigid formation with three wheeled robots. The relative distance between two robots is a constant value and all robots have the same orientation (ψ = 0° in figure (a), ψ = 45° in figure (b)).
This formation is composed of wheeled robots moving in a plane. Each robot may be a nonholonomic robot (e.g. a unicycle-type mobile robot) or of another kind (e.g. omni-directional).
The modeling used to describe the formation (represented in Fig.1) is based on the following concepts:
- There is a reference robot in the formation (e.g. robot 1 in Fig.1). Both the position and the orientation of this robot correspond to the position and orientation of the formation in an absolute frame. It must be pointed out that the reference robot is not necessarily a leader robot.
- The position of each other robot i is defined with respect to the position of robot i-1. Two parameters are used to specify this position: the relative distance between the two robots (e.g. l12 and l23 in Fig.1) and the absolute angle between the orientation of the robot and a normal direction (e.g. θ12 and θ23 in Fig.1); a geometric sketch of this parameterization is given after this list.
- The formation is rigid, which means that the relative distance between two robots is a constant value and all robots have the same orientation. It must be noticed that the orientation of the formation (ψ) is independent of the angle θ used to describe the relative position between two robots (ψ = 0° in Fig.1(a), ψ = 45° in Fig.1(b)).
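To make this parameterization concrete, the following minimal sketch (in Python) computes the position of robot i from the position of robot i-1 and the pair (l, θ) introduced above. The angle convention (θ measured from the x-axis of the absolute frame) and the function name are illustrative assumptions, not taken from the paper.

```python
import math

def follower_position(x_prev, y_prev, l, theta):
    """Position of robot i given the position of robot i-1, the rigid-link
    length l and the absolute angle theta (radians) of that link."""
    # Assumed convention: theta is measured from the x-axis of the absolute
    # frame; the paper's normal-direction convention may differ.
    return (x_prev + l * math.cos(theta), y_prev + l * math.sin(theta))

# Example: a line formation as in Fig.1(a), with equal link lengths and angles
x1, y1 = 0.0, 0.0                                      # reference robot 1
x2, y2 = follower_position(x1, y1, 0.3, math.pi / 2)   # robot 2
x3, y3 = follower_position(x2, y2, 0.3, math.pi / 2)   # robot 3
```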
The state of the formation is given by the position of the reference robot (e.g. robot number 1) and the orientation of the formation. We define 8 possible orientations (see Fig.2), depending directly on the angle θ: θ = 0° for formation 1, θ = 45° for formation 2, θ = 90° for formation 3, and so on. Consequently, this orientation is independent of the angle ψ. For each orientation, we define eight possible translation actions (from 1 to 8) (see Figure 3).
Figure 2: Schematic description of the 8 formations.
Figure 3: Actions used to control the formation.
These actions correspond to the directions (angle ψ) in which the formation of robots can be moved. Two other actions perform a rotation of the formation (actions 9 and 10 are the clockwise and anti-clockwise rotations, respectively). Because we consider nonholonomic robots, when the rigid formation needs to move in a desired direction (for example formation 7 and action 3), each robot in the formation simultaneously rotates towards the desired direction using an orientation control, and then moves forward in that direction.
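As an illustration of how these ten actions could act on the discrete state of the formation (grid cell of the reference robot plus formation index), here is a minimal sketch; the mapping between action indices, headings and rotation directions is an assumption made for illustration, since the paper does not spell it out.

```python
import math

def next_state(row, col, formation, action):
    """Next (row, col, formation) of the rigid formation after one action.
    Assumed convention: actions 1..8 translate the formation one cell in the
    direction psi = (action - 1) * 45 degrees; actions 9 and 10 rotate the
    formation clockwise / anti-clockwise by one 45-degree step."""
    if action in (9, 10):                          # pure rotation of the formation
        step = -1 if action == 9 else 1
        return row, col, (formation - 1 + step) % 8 + 1
    psi = math.radians((action - 1) * 45)          # translation direction
    d_col = round(math.cos(psi))
    d_row = -round(math.sin(psi))                  # grid rows grow downwards
    return row + d_row, col + d_col, formation
```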
3 PATH PLANNING FOR A RIGID
FORMATION OF ROBOTS
Based on the previous description of our control strategy, the problem is now how to compute the path planning (i.e. to find actions like "forwards", "left", and so on) to move the formation from an initial point to a goal point in a constrained environment (obstacle avoidance, narrow paths, etc.). In order to design an on-line approach for real-time applications, we have based our concept on two levels. The first one allows us to get the equivalent quantified environment, and the second one carries out a learning process in order to find the set of best actions.
3.1 Image Processing
In order to automatically obtain numerical information representing the environment, we have developed an approach based on image processing. This procedure uses the following steps:
Figure 4: Picture of the virtual environment given by a virtual camera located at the top of the environment.
- Take a photo of the environment. It should be noticed that, as a first simplification, we consider that a virtual camera is located at the top of the environment (see for example Fig.4).
- Convert the RGB image to a grayscale image, and convert the grayscale image into a binary image with a suitable threshold value.
- The last step describes the environment by a binary matrix, in which 1 represents an obstacle and 0 a free cell.
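A minimal sketch of this conversion is given below, assuming the overhead picture is available as a NumPy array and that each grid cell corresponds to an equal block of pixels; the grid size, threshold value and the assumption that obstacles appear dark are illustrative.

```python
import numpy as np

def image_to_grid(rgb, cells=10, threshold=128):
    """Convert an overhead RGB image (H x W x 3 array) into a cells x cells
    binary matrix: 1 = obstacle, 0 = free cell."""
    gray = rgb.mean(axis=2)                           # RGB -> grayscale
    binary = (gray < threshold).astype(np.uint8)      # dark pixels assumed to be obstacles
    h, w = binary.shape
    grid = np.zeros((cells, cells), dtype=np.uint8)
    for i in range(cells):
        for j in range(cells):
            block = binary[i * h // cells:(i + 1) * h // cells,
                           j * w // cells:(j + 1) * w // cells]
            grid[i, j] = 1 if block.any() else 0      # cell blocked if it contains any obstacle pixel
    return grid
```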
After the image processing, it is possible to get the equivalent quantified environment (see Fig.5). In this example, the environment is divided into 100 states, where each state is a square with 30 cm sides. In the quantified environment, the obstacles are marked with red stars. The formation is composed of a reference robot (robot 1, represented as a solid black dot in Fig.5) and the other robots (robots 2 and 3, represented as black circles). Initially, the robots stand at position A, which is in state [6;10]. The final position B is in state [4;10].
Figure 5: Equivalent quantified environment representation: the obstacles are marked with red stars, the reference robot is represented as a solid black dot, and the other robots are represented as black circles.
3.2 Q-Learning
Now, our aim is to find a solution allowing us to move the three robots from point A to point B while conserving a virtual rigid formation (a line in this case). As noted in section 3.1, the environment may be described by a matrix E(10 × 10). This matrix contains information about the position of the obstacles, but it is not sufficient to describe the full state S. Indeed, one state is composed of two pieces of information: the position of the reference robot and the kind of formation. Consequently, the size of the set S is equal to 10 × 10 × 8 = 800 states.
Based on this description of the state S, and given that we consider the formation as a single robot, it is possible to look for the succession of actions that moves the formation from point A to another point B. To solve this problem, one solution consists in using reinforcement learning. The goal of a reinforcement learning algorithm is to find the action which maximizes a reinforcement signal. The reinforcement signal provides an indication of the interest of the last chosen actions. Q-Learning, proposed by Watkins (Sutton and Barto, 1998), is a convenient way to apply such a reinforcement learning strategy.
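For illustration, the sketch below shows a minimal tabular Q-Learning update over the state set S = (row, col, formation) with the ten actions of section 2. The reward values, learning rate, discount factor, exploration rate and the env_step simulator interface are all illustrative assumptions, not the authors' actual settings.

```python
import random
import numpy as np

N, FORMATIONS, ACTIONS = 10, 8, 10                 # 10x10 grid, 8 formations, 10 actions
Q = np.zeros((N, N, FORMATIONS, ACTIONS))
alpha, gamma, epsilon = 0.1, 0.9, 0.1              # illustrative hyper-parameters

def learn_episode(env_step, start, goal, max_steps=200):
    """Run one Q-Learning episode. env_step(state, action) -> (next_state,
    collision) is assumed to be provided by the simulator; states are
    (row, col, formation) tuples and actions 0..9 stand for actions 1..10."""
    state = start
    for _ in range(max_steps):
        r, c, f = state
        if random.random() < epsilon:              # epsilon-greedy exploration
            action = random.randrange(ACTIONS)
        else:
            action = int(np.argmax(Q[r, c, f - 1]))
        next_state, collision = env_step(state, action)
        # Illustrative reinforcement signal: reach the goal, hit an obstacle,
        # or pay a small cost for each step.
        if next_state[:2] == goal:
            reward, done = 1.0, True
        elif collision:
            reward, done = -1.0, True
        else:
            reward, done = -0.01, False
        nr, nc, nf = next_state
        target = reward if done else reward + gamma * Q[nr, nc, nf - 1].max()
        Q[r, c, f - 1, action] += alpha * (target - Q[r, c, f - 1, action])
        if done:
            break
        state = next_state
```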
4 SIMULATION
In this section, we present simulation results for the example described in section 3. The simulated environment is composed of three KheperaIII robots (www.k-team.com/) and a top camera. Simulations have been performed using the Webots software (www.cyberbotics.com/) and the controllers have been designed with Matlab (www.mathworks.com/).
As described in the previous section, the world is a square with 3 m sides, divided into 100 small squares, where each square (0.3 x 0.3 m) represents a possible position of one robot. Concerning the KheperaIII robots, we suppose that they are always located at the center of a square, and that the reference robot 1 always moves from the center of one square to the center of another. The other two robots rotate around the reference robot so as to change the formation, with radii of 0.3 m and 0.6 m respectively, taking the reference robot as the reference frame. Throughout the whole process, the robots run the same operation synchronously. Once the path planning is computed, it is possible to compute the reference trajectories for the robots of the formation.
Figure 6: Snapshots of the simulation results. The virtual environment is composed of three KheperaIII robots which have to move while avoiding obstacles and maintaining a rigid formation.
It should be noticed that, as we use nonholonomic robots, the trajectories are decomposed into rotations and linear motions.
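To give an idea of how such a reference trajectory could be generated, the sketch below decomposes one move of a nonholonomic robot into a pure rotation followed by a straight-line motion; the interface and the in-place rotation assumption are illustrative, not the authors' controller.

```python
import math

def turn_then_drive(pose, target):
    """Decompose a move from pose = (x, y, heading) to target = (x, y) into
    an in-place rotation followed by a forward motion, returned as a list of
    (rotation_angle, forward_distance) segments."""
    x, y, heading = pose
    tx, ty = target
    desired = math.atan2(ty - y, tx - x)
    # Rotation needed to face the target, wrapped to [-pi, pi)
    rotation = (desired - heading + math.pi) % (2 * math.pi) - math.pi
    distance = math.hypot(tx - x, ty - y)
    return [(rotation, 0.0), (0.0, distance)]

# Each of the three robots executes its own segments synchronously, so that
# the rigid formation is preserved during the whole motion.
```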
Fig.6 shows snapshots of the simulation results. As depicted in the pictures, the three robots move from the initial point to the final point while keeping the rigid formation. The possible actions (from 1 to 10) are chosen according to the environment constraints.
Fig.6(a) shows the initial position of the robots and their formation (formation 7). After a series of translation actions (action 6), the robots arrive at the position shown in Fig.6(b). Then, the robots must change their formation in order to avoid obstacles; Fig.6(c) and Fig.6(d) show the rotation actions. Next, the robots in formation 5 (vertical) move towards the upper left by three translation actions; the beginning and final positions are shown in Fig.6(e) and Fig.6(f). After that, both translation and rotation actions are performed, and two snapshots of the motion at this stage are given in Fig.6(g) and Fig.6(h). At the end of the path, the robots rotate back to a horizontal orientation (formation 3), which is opposite to the initial formation 7 in that robot 1 is now on the right of the formation. The robots then translate again towards the upper left until they reach the top left of the environment. The final position, with formation 3, is shown in Fig.6(i).
5 CONCLUSIONS
In this paper, we have described a new approach for the control of a rigid formation of robots in the frame of logistic applications. In the proposed solution, we have considered the formation as a single robot, and our work has focused on how to control the formation. To use our approach in real time, we have proposed a solution based on image processing and machine learning. As a result, we have shown that it is possible to move a rigid formation of robots between two points in a constrained environment, and the simulation results demonstrate the efficiency of the proposed method.
Future work will focus on the improvement of the proposed method, namely on the collaboration between all agents, and we will investigate realistic problems in the domain of industrial applications.
ACKNOWLEDGEMENTS
This work was supported by a doctoral fellowship from the China Scholarship Council (CSC). The authors wish to express their gratitude to the CSC.
REFERENCES
Barfoot, T. D. and Clark, C. M. (2004). Motion planning for formations of mobile robots. Robotics and Autonomous Systems, 46(2):65-78.
Cao, Y. U., Fukunaga, A. S., and Kahng, A. (1997). Cooperative mobile robotics: Antecedents and directions. Autonomous Robots, 4:7-27.
Mastellone, S., Stipanovic, D. M., Graunke, C. R., Intlekofer, K. A., and Spong, M. W. (2008). Formation control and collision avoidance for multi-agent non-holonomic systems: Theory and experiments. I. J. Robotic Res., 27(1):107-126.
Parker, L. E. (2008). Multiple mobile robot systems. In Siciliano, B. and Khatib, O., editors, Springer Handbook of Robotics, pages 921-941. Springer Berlin Heidelberg.
Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. The MIT Press.
Wurman, P. R., D'Andrea, R., and Mountz, M. (2008). Coordinating hundreds of cooperative, autonomous vehicles in warehouses. AI Magazine, 29(1):9-20.