Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning
Yongshuai Liu, Xin Liu
2023
Abstract
Uncertainty-based exploration in deep reinforcement learning (RL) and deep multi-agent reinforcement learning (MARL) plays a key role in improving sample efficiency and boosting total reward. Uncertainty-based exploration methods often measure the uncertainty (variance) of the value function. However, existing exploration strategies either underestimate the uncertainty by considering only the local uncertainty of the next immediate reward, or estimate it by propagating the uncertainty over all remaining steps in an episode. Neither approach can explicitly control the bias-variance trade-off of the value function. In this paper, we propose Farsighter, an explicit multi-step uncertainty exploration framework. Specifically, Farsighter considers the uncertainty of exactly k future steps and can adaptively adjust k. In practice, we learn a Bayesian posterior over the Q-function in discrete cases and over the action in continuous cases to approximate the uncertainty at each step, and recursively deploy Thompson sampling on the learned posterior distribution with a TD(k) update. Our method works on general tasks with high- or low-dimensional states, discrete or continuous actions, and sparse or dense rewards. Empirical evaluations show that Farsighter outperforms state-of-the-art exploration methods on a wide range of Atari games, robotic manipulation tasks, and general RL tasks.
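The two ingredients named in the abstract, Thompson sampling from a posterior over Q-values and a k-step TD target, can be illustrated with a minimal sketch. This is not the authors' implementation: the independent Gaussian posterior per action and the function names `thompson_action` and `k_step_target` are assumptions made here for illustration only.

```python
import numpy as np


def thompson_action(q_mean, q_std, rng):
    """Draw one Q-value per action from N(mean, std^2), then act greedily on the draw.

    Assumption: an independent Gaussian posterior per action (illustrative only).
    """
    sampled_q = rng.normal(q_mean, q_std)
    return int(np.argmax(sampled_q)), sampled_q


def k_step_target(rewards, bootstrap_q, gamma):
    """k-step return: discounted sum of k rewards plus the discounted bootstrap value.

    k is taken to be len(rewards); bootstrap_q is the (sampled) Q-value at step k.
    """
    rewards = np.asarray(rewards, dtype=float)
    k = len(rewards)
    discounts = gamma ** np.arange(k)
    return float(np.dot(discounts, rewards) + gamma ** k * bootstrap_q)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    action, sampled_q = thompson_action(
        q_mean=np.array([0.0, 1.0, 0.5]),
        q_std=np.array([1.0, 0.1, 2.0]),
        rng=rng,
    )
    target = k_step_target(rewards=[1.0, 0.0, 0.5], bootstrap_q=sampled_q[action], gamma=0.99)
    print(action, target)
```

In this sketch, a larger k puts more weight on observed rewards and less on the bootstrapped estimate, which is the bias-variance knob the abstract refers to; how Farsighter adapts k is not shown here.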
Paper Citation
in Harvard Style
Liu Y. and Liu X. (2023). Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning. In Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-623-1, pages 380-391. DOI: 10.5220/0011800600003393
in Bibtex Style
@conference{icaart23,
author={Yongshuai Liu and Xin Liu},
title={Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning},
booktitle={Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2023},
pages={380-391},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011800600003393},
isbn={978-989-758-623-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning
SN - 978-989-758-623-1
AU - Liu Y.
AU - Liu X.
PY - 2023
SP - 380
EP - 391
DO - 10.5220/0011800600003393