PIECEWISE CONSTANT REINFORCEMENT LEARNING FOR ROBOTIC APPLICATIONS
Andrea Bonarini, Alessandro Lazaric, Marcello Restelli
2007
Abstract
Writing good behaviors for mobile robots is a hard task that requires extensive hand tuning and often fails to account for all the configurations a robot may face. Through reinforcement learning, a robot can improve its performance by interacting directly with the surrounding environment and can adapt its behavior in response to non-stationary events, thus achieving a higher degree of autonomy than pre-programmed robots. In this paper, we propose a novel reinforcement learning approach that addresses the main issues of learning in real-world robotic applications: experience is expensive, explorative actions are risky, the control policy must be robust, and the state space is continuous. Preliminary experiments performed on a real robot suggest that on-line reinforcement learning, combined with solutions tailored to these issues, can be effective in real-world physical environments as well.
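The abstract does not spell out the algorithm, so the following is only a hedged sketch of the general idea behind piecewise constant value functions: a standard tabular Q-learning update (Watkins and Dayan, 1992) applied over a coarse, fixed discretization of a continuous state variable, so that the learned value function is constant within each interval. The toy task, bin layout, and all parameters are illustrative assumptions, not the paper's actual method.

```python
import random

def make_bins(low, high, n):
    """Split [low, high] into n equal intervals (the constant 'pieces')."""
    width = (high - low) / n
    return [low + i * width for i in range(1, n)]

def discretize(x, edges):
    """Map a continuous state x to the index of the interval containing it."""
    for i, e in enumerate(edges):
        if x < e:
            return i
    return len(edges)

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning step on the piecewise constant table Q[s][a]."""
    best_next = max(Q[s_next])
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

# Toy task (illustrative): move a point on [0, 1] toward the goal region x > 0.9.
edges = make_bins(0.0, 1.0, 10)
n_states = len(edges) + 1
actions = [-0.05, 0.05]  # step left / step right
Q = [[0.0, 0.0] for _ in range(n_states)]

random.seed(0)
for _ in range(2000):
    x = random.random()               # random initial continuous state
    for _ in range(50):
        s = discretize(x, edges)
        # epsilon-greedy action selection over the two discrete actions
        if random.random() < 0.2:
            a = random.randrange(2)
        else:
            a = max(range(2), key=lambda i: Q[s][i])
        x = min(max(x + actions[a], 0.0), 1.0)
        r = 1.0 if x > 0.9 else 0.0   # sparse reward in the goal region
        q_update(Q, s, a, r, discretize(x, edges))
        if r > 0:
            break                     # end the episode once the goal is reached
```

After training, the table assigns one value per (interval, action) pair; states in the rightmost intervals should come to prefer the rightward action, since it leads toward the reward region.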
Paper Citation
in Harvard Style
Bonarini A., Lazaric A. and Restelli M. (2007). PIECEWISE CONSTANT REINFORCEMENT LEARNING FOR ROBOTIC APPLICATIONS. In Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-972-8865-82-5, pages 214-221. DOI: 10.5220/0001649102140221
in Bibtex Style
@conference{icinco07,
author={Andrea Bonarini and Alessandro Lazaric and Marcello Restelli},
title={PIECEWISE CONSTANT REINFORCEMENT LEARNING FOR ROBOTIC APPLICATIONS},
booktitle={Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2007},
pages={214-221},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001649102140221},
isbn={978-972-8865-82-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Fourth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - PIECEWISE CONSTANT REINFORCEMENT LEARNING FOR ROBOTIC APPLICATIONS
SN - 978-972-8865-82-5
AU - Bonarini A.
AU - Lazaric A.
AU - Restelli M.
PY - 2007
SP - 214
EP - 221
DO - 10.5220/0001649102140221