A DISTRIBUTED REINFORCEMENT LEARNING CONTROL ARCHITECTURE FOR MULTI-LINK ROBOTS - Experimental Validation

Jose Antonio Martin H., Javier de Lope

Abstract

A distributed approach to Reinforcement Learning (RL) in multi-link robot control tasks is presented. One of the main drawbacks of classical RL is the combinatorial explosion that arises when multiple state variables and multiple actuators are needed to optimally control a complex agent in a dynamical environment. In this paper we present an approach that avoids this drawback, based on a distributed RL architecture. The experimental results in learning a control policy for diverse kinds of multi-link robotic models clearly show that each individual RL agent does not need to perceive the complete state space in order to learn a good global policy; a reduced state space directly related to its own environmental experience suffices. The proposed architecture, combined with the use of continuous reward functions, results in an impressive improvement in learning speed, making tractable some learning problems in which classical RL with discrete rewards (-1, 0, 1) does not work.
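The scaling argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: tabular Q-learning, one agent per joint, a single discretized state variable per agent, and a reward equal to the negative distance to the goal are all assumptions made here for illustration, and every name in the code (`JointAgent`, `centralized_table_size`, `distributed_table_size`, `continuous_reward`) is hypothetical.

```python
import random


def centralized_table_size(n_joints, bins_per_var, actions_per_joint):
    """Q-table entries for one agent that sees every joint and picks a joint action."""
    return (bins_per_var ** n_joints) * (actions_per_joint ** n_joints)


def distributed_table_size(n_joints, bins_per_var, actions_per_joint):
    """Total entries when each joint has its own agent over one local state variable."""
    return n_joints * bins_per_var * actions_per_joint


def continuous_reward(distance_to_goal):
    """Assumed continuous reward: the closer to the goal, the higher the reward."""
    return -distance_to_goal


class JointAgent:
    """Tabular Q-learning agent for a single joint (assumed learner, illustrative only)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy choice over this agent's own (reduced) state only.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


if __name__ == "__main__":
    # Three joints, 10 bins per state variable, 3 actions per joint.
    print(centralized_table_size(3, 10, 3))  # 27000
    print(distributed_table_size(3, 10, 3))  # 90
```

Under these assumptions the centralized table grows exponentially with the number of joints while the distributed tables grow linearly, which is the combinatorial explosion the abstract refers to. In a control loop each `JointAgent` would call `act` on its own local state and `update` with the shared continuous reward.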



Paper Citation


in Harvard Style

Martin H., J. A. and de Lope, J. (2007). A DISTRIBUTED REINFORCEMENT LEARNING CONTROL ARCHITECTURE FOR MULTI-LINK ROBOTS - Experimental Validation. In Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-972-8865-82-5, pages 192-197. DOI: 10.5220/0001621201920197


in Bibtex Style

@conference{icinco07,
author={Jose Antonio Martin H. and Javier de Lope},
title={A DISTRIBUTED REINFORCEMENT LEARNING CONTROL ARCHITECTURE FOR MULTI-LINK ROBOTS - Experimental Validation},
booktitle={Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2007},
pages={192-197},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001621201920197},
isbn={978-972-8865-82-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - A DISTRIBUTED REINFORCEMENT LEARNING CONTROL ARCHITECTURE FOR MULTI-LINK ROBOTS - Experimental Validation
SN - 978-972-8865-82-5
AU - Martin H., Jose Antonio
AU - de Lope, Javier
PY - 2007
SP - 192
EP - 197
DO - 10.5220/0001621201920197