INTERNALLY DRIVEN Q-LEARNING - Convergence and Generalization Results
Eduardo Alonso, Esther Mondragón, Niclas Kjäll-Ohlsson
We present an approach to the reinforcement learning problem in which agents are provided with internal drives against which they evaluate the value of states according to a similarity function. We extend Q-learning by substituting internally driven values for ad hoc rewards. The resulting algorithm, Internally Driven Q-learning (IDQ-learning), is shown experimentally to converge to optimality and to generalize well. These results are preliminary yet encouraging: IDQ-learning is more psychologically plausible than Q-learning, and it devolves control, and thus autonomy, to agents that are otherwise at the mercy of the environment (i.e., of the designer).
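The abstract does not specify the similarity function, the form of the drives, or the test environments, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It runs standard tabular Q-learning on a hypothetical 1-D chain of states, and the value fed into the update comes from an assumed similarity between the reached state and an internal drive target rather than from an environment-supplied reward. The names `similarity`, `idq_sketch`, `greedy_policy`, the chain world, and all parameter values are assumptions made for illustration.

```python
import random


def similarity(state, drive_target, n_states):
    """Assumed similarity: 1.0 at the drive target, falling linearly with distance."""
    return 1.0 - abs(state - drive_target) / (n_states - 1)


def idq_sketch(n_states=6, drive_target=5, episodes=2000, steps=50,
               alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D chain, driven by an internal value signal."""
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 moves left, action 1 moves right
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)  # random start state
        for _ in range(steps):  # continuing task: no terminal state
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore
            else:
                a = max(range(2), key=lambda i: q[s][i])  # exploit
            # Moves past either end of the chain leave the agent in place.
            s_next = min(max(s + moves[a], 0), n_states - 1)
            # Internally generated value stands in for an external reward.
            v = similarity(s_next, drive_target, n_states)
            q[s][a] += alpha * (v + gamma * max(q[s_next]) - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Index of the highest-valued action in each state."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in q]


if __name__ == "__main__":
    print(greedy_policy(idq_sketch()))
```

In this toy setting the environment never emits a reward; the agent scores each reached state against its own drive. The task is kept continuing (no terminal state) because, with a similarity value that is positive everywhere, an episodic version could make loitering near the target more valuable than reaching it.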
Paper Citation
in Harvard Style
Alonso E., Mondragón E. and Kjäll-Ohlsson N. (2012). INTERNALLY DRIVEN Q-LEARNING - Convergence and Generalization Results. In Proceedings of the 4th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-95-9, pages 491-494. DOI: 10.5220/0003736404910494