A Hybrid Interaction Model for Multi-Agent Reinforcement Learning

Douglas M. Guisi, Richardson Ribeiro, Marcelo Teixeira, André P. Borges, Eden R. Dosciatti, Fabrício Enembreck

2016

Abstract

The main contribution of this paper is the implementation of a hybrid coordination method built from the combination of previously developed interaction models. These interaction models are based on reward sharing among multiple learning agents, so that good-quality policies are discovered interactively. When the exchange of rewards among agents does not occur properly, it can delay learning or even trigger unexpected behavior, making cooperation inefficient and causing convergence to an unsatisfactory policy. Building on these concepts, the hybrid method exploits the characteristics of each model, reducing possible conflicts between actions from different reward-based policies and improving the coordination of agents in reinforcement learning problems. Experimental results show that the hybrid method can accelerate convergence, rapidly reaching optimal policies even in large state spaces and exceeding the results of classical reinforcement learning approaches.
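
The paper's algorithmic details are not reproduced on this page, but the core idea, Q-learning agents that periodically exchange value estimates, can be sketched. The following is a minimal, illustrative Python sketch under stated assumptions: tabular Q-learning agents and a hypothetical share_estimates merge rule (adopt a peer's higher estimate for a state-action pair). It is not the authors' exact hybrid method; their interaction models and the way they are combined are described in the paper itself.

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration
ACTIONS = ["up", "down", "left", "right"]

class Agent:
    """Tabular Q-learning agent; q maps (state, action) pairs to value estimates."""
    def __init__(self):
        self.q = defaultdict(float)

    def act(self, state):
        # Epsilon-greedy action selection.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def learn(self, s, a, r, s_next):
        # Standard tabular Q-learning update.
        best_next = max(self.q[(s_next, b)] for b in ACTIONS)
        self.q[(s, a)] += ALPHA * (r + GAMMA * best_next - self.q[(s, a)])

def share_estimates(agents):
    # Hypothetical exchange rule: an agent adopts a peer's value estimate
    # whenever the peer's estimate is higher. The paper's interaction models
    # define richer exchange criteria; this only illustrates the general
    # mechanism of reward sharing among agents.
    for receiver in agents:
        for donor in agents:
            if donor is receiver:
                continue
            for key, value in list(donor.q.items()):
                if value > receiver.q[key]:
                    receiver.q[key] = value

In a training loop, each agent would act and learn independently, with share_estimates(agents) called every few episodes; poorly timed or indiscriminate sharing is exactly the failure mode the abstract warns about, which motivates a hybrid of exchange models.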



Paper Citation


in Harvard Style

Guisi D., Ribeiro R., Teixeira M., Borges A., Dosciatti E. and Enembreck F. (2016). A Hybrid Interaction Model for Multi-Agent Reinforcement Learning. In Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-187-8, pages 54-61. DOI: 10.5220/0005832300540061


in Bibtex Style

@conference{iceis16,
author={Douglas M. Guisi and Richardson Ribeiro and Marcelo Teixeira and André P. Borges and Eden R. Dosciatti and Fabrício Enembreck},
title={A Hybrid Interaction Model for Multi-Agent Reinforcement Learning},
booktitle={Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2016},
pages={54-61},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005832300540061},
isbn={978-989-758-187-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A Hybrid Interaction Model for Multi-Agent Reinforcement Learning
SN - 978-989-758-187-8
AU - Guisi D.
AU - Ribeiro R.
AU - Teixeira M.
AU - Borges A.
AU - Dosciatti E.
AU - Enembreck F.
PY - 2016
SP - 54
EP - 61
DO - 10.5220/0005832300540061