A REINFORCEMENT LEARNING APPROACH FOR MULTIAGENT NAVIGATION
Francisco Martinez-Gil, Fernando Barber, Miguel Lozano, Francisco Grimaldo, Fernando Fernandez
2010
Abstract
This paper presents a Q-Learning-based multiagent system designed to provide navigation skills to simulation agents in virtual environments. We focus on learning local navigation behaviours from interactions with other agents and the environment. We adopt an environment-independent state space representation to provide the scalability that this kind of system requires. This allows us to evaluate whether the learned action-value functions can be transferred to other agents to increase the size of the group without losing behavioural quality. We explain the learning process and the results of the collective behaviours obtained in a well-known multiagent navigation experiment: leaving a place through a door.
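As an illustration only, the following Python sketch shows a standard tabular Q-learning update (Watkins and Dayan, 1992) and one simple way a learned action-value table could be copied to additional agents. It is not the authors' implementation: the action set, the string state labels, the learning rate `alpha`, the discount `gamma`, and the helper names `make_q_table`, `choose_action`, `q_update` and `transfer` are all assumptions made for this example, and the abstract does not specify the actual state representation or reward signal.

```python
import random
from collections import defaultdict

# Hypothetical discrete action set; the paper's real actions are not given here.
ACTIONS = ["left", "straight", "right"]


def make_q_table():
    """Create an action-value table that defaults every unseen state to zeros."""
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = q[state]
    return max(ACTIONS, key=lambda a: values[a])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One-step Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])


def transfer(q, n_agents):
    """Give each of n_agents its own independent copy of a learned table.

    Copying is only meaningful if the state labels do not depend on a
    particular environment, which is the role the abstract assigns to the
    environment-independent state representation.
    """
    clones = []
    for _ in range(n_agents):
        clone = make_q_table()
        for state, values in q.items():
            clone[state] = dict(values)
        clones.append(clone)
    return clones


if __name__ == "__main__":
    rng = random.Random(0)
    q = make_q_table()
    # "s0" and "s1" are placeholder state labels for this sketch.
    q_update(q, "s0", "straight", reward=1.0, next_state="s1", alpha=0.5)
    new_agents = transfer(q, n_agents=3)
    print(choose_action(new_agents[0], "s0", epsilon=0.0, rng=rng))
```

In this sketch each new agent receives a private copy, so it could keep learning after the transfer without altering the table it was copied from. Sharing a single table among all agents would be an alternative design choice.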
Paper Citation
in Harvard Style
Martinez-Gil F., Barber F., Lozano M., Grimaldo F. and Fernandez F. (2010). A REINFORCEMENT LEARNING APPROACH FOR MULTIAGENT NAVIGATION. In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-674-021-4, pages 607-610. DOI: 10.5220/0002727906070610
in Bibtex Style
@conference{icaart10,
author={Francisco Martinez-Gil and Fernando Barber and Miguel Lozano and Francisco Grimaldo and Fernando Fernandez},
title={A REINFORCEMENT LEARNING APPROACH FOR MULTIAGENT NAVIGATION},
booktitle={Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2010},
pages={607-610},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002727906070610},
isbn={978-989-674-021-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - A REINFORCEMENT LEARNING APPROACH FOR MULTIAGENT NAVIGATION
SN - 978-989-674-021-4
AU - Martinez-Gil F.
AU - Barber F.
AU - Lozano M.
AU - Grimaldo F.
AU - Fernandez F.
PY - 2010
SP - 607
EP - 610
DO - 10.5220/0002727906070610