significant difference between these values is clear,
even though the system considered here is a simple one.
Taking both the task completion ratios and the
computational loads into account, the proposed
approach, strategy-planned distributed Q-learning,
therefore yields appropriate and useful results.
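To make the computational-load argument concrete, the following sketch illustrates why a distributed scheme keeps the learning space small: a centralized learner over n robots must cover the joint state-action space, which grows exponentially in n, whereas each distributed learner covers only its own state-action space. The state, action, and robot counts below are illustrative assumptions, not values from the experiments.

```python
def centralized_table_size(n_states, n_actions, n_robots):
    # Joint Q-table: one entry per joint state and joint action,
    # so the table grows exponentially in the number of robots.
    return (n_states ** n_robots) * (n_actions ** n_robots)

def distributed_table_size(n_states, n_actions, n_robots):
    # One independent Q-table per robot: growth is linear in robots.
    return n_robots * n_states * n_actions

if __name__ == "__main__":
    S, A, N = 10, 4, 3  # illustrative sizes, not taken from the paper
    print(centralized_table_size(S, A, N))  # 64000
    print(distributed_table_size(S, A, N))  # 120
```

Even for these small illustrative numbers, the centralized table is several hundred times larger, which is the dimensionality gap the text refers to.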
8 CONCLUSIONS
In this paper, a new learning-based task allocation
approach, Strategy-Planned Distributed Q-Learning,
is proposed. The traditional Q-learning algorithm is
defined for MDP environments, but MRS
environments are not Markovian because of the
unpredictable behaviour of other robots and the
presence of uncertainties. There are two major
Q-learning approaches for multi-agent systems:
distributed and centralized. The proposed algorithm
combines the advantages of both: it is a distributed
learning approach in nature, but it assigns different
learning strategies to the robots in a centralized
manner. Experimental results show that the task
completion ratio of high-priority tasks increases
for all three learning approaches, because the robots
exploit their past task allocation experience in
future task execution through their learning ability.
The results also show that the centralized learning
approach yields the best task completion ratios for
both high-priority and low-priority tasks, while the
proposed approach achieves slightly lower ratios.
However, the proposed algorithm provides reasonable
solutions with a much smaller learning space
dimension and a lower computational load.
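The hybrid scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: each robot keeps its own Q-table (the distributed part), while a central planner assigns each robot a learning strategy, which is modelled here, purely as an assumption, by a per-robot exploration rate epsilon. The class and parameter names are hypothetical.

```python
import random

class RobotLearner:
    """One robot's independent Q-learner; its strategy is set centrally."""

    def __init__(self, n_states, n_actions, epsilon, alpha=0.1, gamma=0.9):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.epsilon = epsilon            # strategy assigned by the central planner
        self.alpha, self.gamma = alpha, gamma
        self.n_actions = n_actions

    def choose_action(self, state):
        # Epsilon-greedy: the centrally assigned strategy biases a robot
        # toward exploration (high epsilon) or exploitation (low epsilon).
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def update(self, s, a, reward, s_next):
        # Standard Q-learning update (Watkins, 1989), applied per robot.
        best_next = max(self.q[s_next])
        self.q[s][a] += self.alpha * (reward + self.gamma * best_next - self.q[s][a])

# Centralized strategy assignment: e.g., one explorer, one mixed,
# one exploiter robot, each otherwise learning independently.
team = [RobotLearner(n_states=5, n_actions=3, epsilon=eps)
        for eps in (0.9, 0.5, 0.1)]
```

The design choice this highlights is that only the strategy assignment is centralized; the Q-tables and updates remain per-robot, so the learning space stays linear in the number of robots.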
ICINCO 2014 - 11th International Conference on Informatics in Control, Automation and Robotics