PIECEWISE CONSTANT REINFORCEMENT LEARNING FOR ROBOTIC APPLICATIONS
Andrea Bonarini, Alessandro Lazaric, Marcello Restelli
2007
Abstract
Writing good behaviors for mobile robots is a hard task that requires extensive hand tuning and often fails to account for all the configurations a robot may face. Through reinforcement learning, a robot can improve its performance by interacting directly with the surrounding environment and can adapt its behavior in response to non-stationary events, thus achieving a higher degree of autonomy than pre-programmed robots. In this paper, we propose a novel reinforcement learning approach that addresses the main issues of learning in real-world robotic applications: experience is expensive, explorative actions are risky, the control policy must be robust, and the state space is continuous. Preliminary experiments performed on a real robot suggest that on-line reinforcement learning, combined with solutions tailored to these issues, can be effective in real-world physical environments as well.
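The abstract does not spell out the algorithm, so the following is only a hedged sketch of the general idea behind piecewise constant value functions: a standard tabular Q-learning update (Watkins and Dayan, 1992) applied over a coarse, fixed discretization of a continuous state variable, so that the learned value function is constant within each interval. The toy task, bin layout, and all parameters are illustrative assumptions, not the paper's actual method.

```python
import random

def make_bins(low, high, n):
    """Split [low, high] into n equal intervals (the constant 'pieces')."""
    width = (high - low) / n
    return [low + i * width for i in range(1, n)]

def discretize(x, edges):
    """Map a continuous state x to the index of the interval containing it."""
    for i, e in enumerate(edges):
        if x < e:
            return i
    return len(edges)

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning step on the piecewise constant table Q[s][a]."""
    best_next = max(Q[s_next])
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

# Toy task (illustrative): move a point on [0, 1] toward the goal region x > 0.9.
edges = make_bins(0.0, 1.0, 10)
n_states = len(edges) + 1
actions = [-0.05, 0.05]  # step left / step right
Q = [[0.0, 0.0] for _ in range(n_states)]

random.seed(0)
for _ in range(2000):
    x = random.random()               # random initial continuous state
    for _ in range(50):
        s = discretize(x, edges)
        # epsilon-greedy action selection over the two discrete actions
        if random.random() < 0.2:
            a = random.randrange(2)
        else:
            a = max(range(2), key=lambda i: Q[s][i])
        x = min(max(x + actions[a], 0.0), 1.0)
        r = 1.0 if x > 0.9 else 0.0   # sparse reward in the goal region
        q_update(Q, s, a, r, discretize(x, edges))
        if r > 0:
            break                     # end the episode once the goal is reached
```

After training, the table assigns one value per (interval, action) pair; states in the rightmost intervals should come to prefer the rightward action, since it leads toward the reward region.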
Paper Citation
in Harvard Style
Bonarini A., Lazaric A. and Restelli M. (2007). PIECEWISE CONSTANT REINFORCEMENT LEARNING FOR ROBOTIC APPLICATIONS. In Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-972-8865-82-5, pages 214-221. DOI: 10.5220/0001649102140221
in Bibtex Style
@conference{icinco07,
author={Andrea Bonarini and Alessandro Lazaric and Marcello Restelli},
title={PIECEWISE CONSTANT REINFORCEMENT LEARNING FOR ROBOTIC APPLICATIONS},
booktitle={Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2007},
pages={214-221},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001649102140221},
isbn={978-972-8865-82-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - PIECEWISE CONSTANT REINFORCEMENT LEARNING FOR ROBOTIC APPLICATIONS
SN - 978-972-8865-82-5
AU - Bonarini A.
AU - Lazaric A.
AU - Restelli M.
PY - 2007
SP - 214
EP - 221
DO - 10.5220/0001649102140221