Strategy-planned Q-learning Approach for Multi-robot Task Allocation

H. Hilal Ezercan Kayir, Osman Parlaktuna

2014

Abstract

In a market-based task allocation mechanism, a robot bids for an announced task if it is able to perform the task and is not busy with another one. Sometimes a high-priority task cannot be performed because all the robots are occupied with low-priority tasks. If the robots form an expectation about the future task sequence based on their past experiences, they can decline to bid for low-priority tasks and wait for high-priority ones instead. In this study, a Q-learning-based approach is proposed to estimate the time interval between high-priority tasks in a multi-robot, multi-type task allocation problem. Based on this estimate, each robot decides whether to bid for a low-priority task or to wait for a high-priority task. Applying traditional Q-learning to multi-robot systems is problematic because of the non-stationary nature of the working environment. In this paper, a new approach, the Strategy-Planned Distributed Q-Learning algorithm, is proposed; it combines the advantages of the centralized and distributed Q-learning approaches found in the literature. The effectiveness of the proposed algorithm is demonstrated by simulations of a task allocation problem in a heterogeneous multi-robot system.
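Since the bid-or-wait decision described above is algorithmic, a minimal illustrative sketch follows. It assumes a tabular Q-learner whose state is the discretized time elapsed since the last high-priority task; the class name, reward scheme, bucket size, and parameters are hypothetical assumptions for illustration, not the paper's actual Strategy-Planned Distributed Q-Learning algorithm.

```python
import random
from collections import defaultdict

ACTIONS = ("bid_low", "wait_for_high")

class BidOrWaitLearner:
    """Tabular Q-learner for the bid-or-wait decision (illustrative sketch,
    not the paper's SPDQL algorithm). State = discretized time since the
    last high-priority task arrived."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, bucket=5.0):
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.epsilon = epsilon        # exploration rate (assumed value)
        self.bucket = bucket          # seconds per state bucket (assumed)
        self.q = defaultdict(float)   # maps (state, action) -> Q-value

    def state(self, elapsed_since_high):
        # Discretize elapsed time so the Q-table stays finite.
        return int(elapsed_since_high // self.bucket)

    def choose(self, s):
        # Epsilon-greedy selection between bidding low and waiting.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(s, a)])

    def update(self, s, a, reward, s_next):
        # One-step Q-learning update (Watkins, 1989).
        best_next = max(self.q[(s_next, a2)] for a2 in ACTIONS)
        self.q[(s, a)] += self.alpha * (reward + self.gamma * best_next
                                        - self.q[(s, a)])

# Illustrative training loop: high-priority tasks arrive roughly every 20 s,
# so waiting near an expected arrival should earn more than bidding low.
learner = BidOrWaitLearner()
for episode in range(10_000):
    elapsed = random.uniform(0.0, 40.0)
    s = learner.state(elapsed)
    a = learner.choose(s)
    near_arrival = elapsed >= 20.0
    # Assumed rewards: +2 for catching a high-priority task by waiting,
    # +1 for completing a low-priority task, -1 for waiting in vain.
    if a == "wait_for_high":
        reward = 2.0 if near_arrival else -1.0
    else:
        reward = 1.0
    learner.update(s, a, reward, learner.state(0.0))
```

The sketch only shows the single-agent update; the abstract's point is that naively running such learners in parallel is problematic in a non-stationary multi-robot setting, which is what the proposed strategy-planned distributed scheme addresses.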



Paper Citation


in Harvard Style

Hilal Ezercan Kayir H. and Parlaktuna O. (2014). Strategy-planned Q-learning Approach for Multi-robot Task Allocation. In Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-989-758-040-6, pages 410-416. DOI: 10.5220/0005052504100416


in Bibtex Style

@conference{icinco14,
author={H. Hilal Ezercan Kayir and Osman Parlaktuna},
title={Strategy-planned Q-learning Approach for Multi-robot Task Allocation},
booktitle={Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2014},
pages={410-416},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005052504100416},
isbn={978-989-758-040-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - Strategy-planned Q-learning Approach for Multi-robot Task Allocation
SN - 978-989-758-040-6
AU - Hilal Ezercan Kayir H.
AU - Parlaktuna O.
PY - 2014
SP - 410
EP - 416
DO - 10.5220/0005052504100416