Strategy-planned Q-learning Approach for Multi-robot Task Allocation
H. Hilal Ezercan Kayir, Osman Parlaktuna
2014
Abstract
In a market-based task allocation mechanism, a robot bids for an announced task if it has the ability to perform the task and is not busy with another task. Sometimes a high-priority task may not be performed because all the robots are occupied with low-priority tasks. If the robots have an expectation about the future task sequence based on their past experiences, they may choose not to bid for low-priority tasks and instead wait for high-priority tasks. In this study, a Q-learning-based approach is proposed to estimate the time interval between high-priority tasks in a multi-robot, multi-type task allocation problem. Depending on this estimate, robots decide whether to bid for a low-priority task or wait for a high-priority task. Applying traditional Q-learning to multi-robot systems is problematic due to the non-stationary nature of the working environment. In this paper, a new approach, the Strategy-Planned Distributed Q-Learning algorithm, which combines the advantages of the centralized and distributed Q-learning approaches in the literature, is proposed. The effectiveness of the proposed algorithm is demonstrated by simulations of a task allocation problem in a heterogeneous multi-robot system.
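To make the bid-or-wait decision concrete, below is a minimal, illustrative sketch in the spirit of the abstract: a single-robot tabular Q-learner whose state is a coarse bin of the time elapsed since the last high-priority task and whose actions are to bid on an announced low-priority task or to wait. The class name, state encoding, and reward values are assumptions introduced for illustration; this is plain single-agent Q-learning, not the proposed Strategy-Planned Distributed Q-Learning algorithm.

```python
import random
from collections import defaultdict

# Illustrative tabular Q-learner (hypothetical names): the state is a coarse
# bin of the time elapsed since the last high-priority task; the actions are
# to "bid" on the announced low-priority task or to "wait" for an expected
# high-priority one.
class BidOrWaitQLearner:
    def __init__(self, actions=("bid", "wait"), alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        # epsilon-greedy action selection
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # standard one-step Q-learning update
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

# Example step: on a low-priority announcement the robot bids or waits, and
# later receives a reward that is higher when waiting freed it to serve a
# subsequently arriving high-priority task (placeholder reward signal).
learner = BidOrWaitQLearner()
state = "elapsed_bin_2"
action = learner.choose(state)
reward = 1.0 if action == "wait" else 0.3
learner.update(state, action, reward, "elapsed_bin_0")
```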
Paper Citation
in Harvard Style
Hilal Ezercan Kayir H. and Parlaktuna O. (2014). Strategy-planned Q-learning Approach for Multi-robot Task Allocation. In Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-989-758-040-6, pages 410-416. DOI: 10.5220/0005052504100416
in Bibtex Style
@conference{icinco14,
author={H. Hilal Ezercan Kayir and Osman Parlaktuna},
title={Strategy-planned Q-learning Approach for Multi-robot Task Allocation},
booktitle={Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2014},
pages={410-416},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005052504100416},
isbn={978-989-758-040-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - Strategy-planned Q-learning Approach for Multi-robot Task Allocation
SN - 978-989-758-040-6
AU - Hilal Ezercan Kayir H.
AU - Parlaktuna O.
PY - 2014
SP - 410
EP - 416
DO - 10.5220/0005052504100416