Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning

Yongshuai Liu, Xin Liu

2023

Abstract

Uncertainty-based exploration in deep reinforcement learning (RL) and deep multi-agent reinforcement learning (MARL) plays a key role in improving sample efficiency and boosting total reward. Uncertainty-based exploration methods often measure the uncertainty (variance) of the value function. However, existing exploration strategies either underestimate the uncertainty by considering only the local uncertainty of the next immediate reward, or estimate it by propagating the uncertainty over all the remaining steps in an episode. Neither approach can explicitly control the bias-variance trade-off of the value function. In this paper, we propose Farsighter, an explicit multi-step uncertainty exploration framework. Specifically, Farsighter considers the uncertainty of exactly k future steps, and it can adaptively adjust k. In practice, we learn a Bayesian posterior over the Q-function in discrete cases and over the action in continuous cases to approximate the uncertainty at each step, and we recursively deploy Thompson sampling on the learned posterior distribution with a TD(k) update. Our method works on general tasks with high- or low-dimensional states, discrete or continuous actions, and sparse or dense rewards. Empirical evaluations show that Farsighter outperforms state-of-the-art exploration methods on a wide range of Atari games, robotic manipulation tasks, and general RL tasks.
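
As a rough illustration of the two ingredients named in the abstract, the sketch below combines Thompson sampling from a per-action posterior with a k-step TD target. It is not the authors' implementation: the independent Gaussian posterior, the function names `thompson_action` and `k_step_target`, and all numeric values are assumptions made only for this example.

```python
import random


def thompson_action(q_mean, q_std, rng):
    """Draw one Q-value per action from an assumed independent Gaussian
    posterior, then act greedily with respect to the sampled values."""
    sampled_q = [rng.gauss(m, s) for m, s in zip(q_mean, q_std)]
    best = max(range(len(sampled_q)), key=sampled_q.__getitem__)
    return best, sampled_q


def k_step_target(rewards, bootstrap_value, gamma, k):
    """k-step TD target: the discounted sum of the first k rewards plus a
    discounted bootstrap value for the state reached after those steps."""
    steps = rewards[:k]
    ret = sum(gamma ** i * r for i, r in enumerate(steps))
    return ret + gamma ** len(steps) * bootstrap_value


# Hypothetical posterior over three actions in the state reached after k steps.
rng = random.Random(0)
q_mean, q_std = [1.0, 1.2, 0.8], [0.5, 0.1, 0.9]
action, sampled_q = thompson_action(q_mean, q_std, rng)

# The bootstrap value comes from the posterior sample, so uncertainty in the
# sampled Q-value enters the target after exactly k = 3 observed rewards.
target = k_step_target([1.0, 0.0, 2.0, 1.0], sampled_q[action], gamma=0.9, k=3)
```

In this sketch, a small k leans on the sampled bootstrap value sooner, while a large k leans more on observed rewards, which is the bias-variance knob the abstract refers to.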


Paper Citation


in Harvard Style

Liu Y. and Liu X. (2023). Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning. In Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-623-1, pages 380-391. DOI: 10.5220/0011800600003393


in Bibtex Style

@conference{icaart23,
author={Yongshuai Liu and Xin Liu},
title={Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning},
booktitle={Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2023},
pages={380-391},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011800600003393},
isbn={978-989-758-623-1},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Farsighter: Efficient Multi-Step Exploration for Deep Reinforcement Learning
SN - 978-989-758-623-1
AU - Liu Y.
AU - Liu X.
PY - 2023
SP - 380
EP - 391
DO - 10.5220/0011800600003393