Using the Ornstein-Uhlenbeck Process for Random Exploration

Johannes Nauta, Yara Khaluf, Pieter Simoens

Abstract

In model-based Reinforcement Learning, an agent aims to learn a transition model between attainable states. Since the agent initially has zero knowledge of the transition model, it needs to resort to random exploration in order to learn the model. In this work, we demonstrate how the Ornstein-Uhlenbeck process can be used as a sampling scheme to generate exploratory Brownian motion in the absence of a transition model. Whereas current approaches rely on knowledge of the transition model to generate the steps of Brownian motion, the Ornstein-Uhlenbeck process does not. Additionally, the Ornstein-Uhlenbeck process naturally includes a drift term originating from a potential function. We show that this potential can be controlled by the agent itself, and allows executing non-equilibrium behavior such as ballistic motion or local trapping.
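As a rough illustration of the sampling idea in the abstract, the sketch below simulates a one-dimensional Ornstein-Uhlenbeck process with an Euler-Maruyama step. It is a textbook discretisation, not the authors' formulation: the function name `simulate_ou`, the parameters `theta`, `mu`, `sigma`, and `dt`, and the choice of a harmonic potential centred at `mu` are all assumptions made for this example.

```python
import math
import random


def simulate_ou(x0, mu, theta, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama sketch of dx = theta * (mu - x) dt + sigma dW.

    All names here are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Drift: pull toward mu, i.e. the negative gradient of an assumed
        # harmonic potential U(x) = 0.5 * theta * (x - mu)**2.
        drift = theta * (mu - x) * dt
        # Noise: Gaussian increment with variance sigma**2 * dt.
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = x + drift + noise
        path.append(x)
    return path


# theta = 0 switches the potential off, leaving plain Brownian motion.
free = simulate_ou(x0=0.0, mu=0.0, theta=0.0, sigma=1.0, dt=0.01, n_steps=1000)

# A larger theta keeps the walker confined near mu (local trapping).
trapped = simulate_ou(x0=0.0, mu=0.0, theta=5.0, sigma=1.0, dt=0.01, n_steps=1000)
```

In this sketch the strength of the potential is the single knob `theta`; an agent that adjusts it (or moves `mu`) over time would change how far the walk spreads, which is the kind of control the abstract describes.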

Paper Citation


in Harvard Style

Nauta J., Khaluf Y. and Simoens P. (2019). Using the Ornstein-Uhlenbeck Process for Random Exploration. In Proceedings of the 4th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS, ISBN 978-989-758-366-7, pages 59-66. DOI: 10.5220/0007724500590066


in Bibtex Style

@conference{complexis19,
author={Johannes Nauta and Yara Khaluf and Pieter Simoens},
title={Using the Ornstein-Uhlenbeck Process for Random Exploration},
booktitle={Proceedings of the 4th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS},
year={2019},
pages={59-66},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007724500590066},
isbn={978-989-758-366-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS
TI - Using the Ornstein-Uhlenbeck Process for Random Exploration
SN - 978-989-758-366-7
AU - Nauta J.
AU - Khaluf Y.
AU - Simoens P.
PY - 2019
SP - 59
EP - 66
DO - 10.5220/0007724500590066