SCALED GRADIENT DESCENT LEARNING RATE - Reinforcement learning with light-seeking robot

Kary Främling

Abstract

Adaptive behaviour through machine learning is challenging in many real-world applications such as robotics. This is because learning has to be rapid enough to be performed in real time and to avoid damage to the robot. Models using linear function approximation are interesting in such tasks because they offer rapid learning and have small memory and processing requirements. Adalines are a simple model for gradient descent learning with linear function approximation. However, the performance of gradient descent learning, even with a linear model, depends greatly on choosing a good value for the learning rate. In this paper it is shown that the learning rate should be scaled as a function of the current input values. A scaled learning rate makes it possible to avoid weight oscillations without slowing down learning. The advantages of using the scaled learning rate are illustrated using a robot that learns to navigate towards a light source. This light-seeking robot performs a Reinforcement Learning task, where the robot collects training samples by exploring the environment, i.e. by taking actions and learning from their results through trial and error.
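
The abstract does not state the exact scaling rule, so the following is only a minimal sketch of the general idea, assuming the scaling divides the learning rate by the squared norm of the current input (the well-known normalized LMS rule). It is not presented as the paper's own formula. The names `predict`, `adaline_update`, `rate` and `scaled` are hypothetical and chosen for illustration.

```python
def predict(weights, inputs):
    """Output of a linear unit (Adaline): weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def adaline_update(weights, inputs, target, rate, scaled=True):
    """One Widrow-Hoff (LMS) gradient descent step for a linear unit.

    Assumption: when `scaled` is True, the step size is divided by the
    squared norm of the current input (normalized LMS). This stands in
    for the input-dependent scaling mentioned in the abstract.
    """
    error = target - predict(weights, inputs)
    if scaled:
        norm_sq = sum(x * x for x in inputs)
        # Skip the update for an all-zero input, which carries no gradient.
        step = rate / norm_sq if norm_sq > 0 else 0.0
    else:
        step = rate
    return [w + step * error * x for w, x in zip(weights, inputs)]


if __name__ == "__main__":
    inputs, target = [3.0, 4.0], 1.0

    # Fixed rate: with large inputs the error grows and flips sign each step.
    w = [0.0, 0.0]
    for _ in range(3):
        w = adaline_update(w, inputs, target, rate=0.1, scaled=False)
        print("fixed rate error: ", target - predict(w, inputs))

    # Scaled rate: the same sample is fitted without overshoot.
    w = adaline_update([0.0, 0.0], inputs, target, rate=1.0, scaled=True)
    print("scaled rate error:", target - predict(w, inputs))
```

With a fixed rate, the error on a single sample is multiplied by (1 - rate * ||x||^2) at every step, so large input values cause oscillation or divergence. With the assumed scaling, that factor becomes (1 - rate), which no longer depends on the input magnitude.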



Paper Citation


in Harvard Style

Främling K. (2004). SCALED GRADIENT DESCENT LEARNING RATE - Reinforcement learning with light-seeking robot. In Proceedings of the First International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 972-8865-12-0, pages 3-11. DOI: 10.5220/0001138600030011


in Bibtex Style

@conference{icinco04,
author={Kary Främling},
title={SCALED GRADIENT DESCENT LEARNING RATE - Reinforcement learning with light-seeking robot},
booktitle={Proceedings of the First International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2004},
pages={3-11},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001138600030011},
isbn={972-8865-12-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - SCALED GRADIENT DESCENT LEARNING RATE - Reinforcement learning with light-seeking robot
SN - 972-8865-12-0
AU - Främling K.
PY - 2004
SP - 3
EP - 11
DO - 10.5220/0001138600030011