Hierarchical Reinforcement Learning Introducing Genetic Algorithm for POMDPs Environments

Kohei Suzuki, Shohei Kato

2019

Abstract

Perceptual aliasing is one of the major problems in applying reinforcement learning to the real world. It occurs in POMDP environments, where agents cannot observe states correctly, and it causes reinforcement learning to fail. HQ-learning has been proposed as a solution: it addresses perceptual aliasing by using subgoals and subagents. However, the subagents learn independently and must relearn whenever the subgoals change. In addition, the number of subgoals is fixed, and the number of episodes required for learning increases unless that number is appropriate. In this paper, we propose a reinforcement learning method that generates subgoals using a genetic algorithm. We also report the effectiveness of our method through experiments with partially observable mazes.
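
To make the notion of perceptual aliasing concrete, the following is a minimal Python sketch, not taken from the paper. It assumes a hypothetical grid maze (`MAZE`) and an agent whose observation is only the presence of walls in the four neighbouring cells; the helper names `observe` and `aliased_states` are illustrative. Cells that share an observation are indistinguishable to such an agent, even though they may require different actions.

```python
from collections import defaultdict

# Hypothetical maze for illustration: '#' is a wall, '.' is a free cell.
MAZE = [
    "#######",
    "#.....#",
    "#.###.#",
    "#.....#",
    "#######",
]


def observe(maze, row, col):
    """Return wall flags for the (north, east, south, west) neighbours."""
    neighbours = [(-1, 0), (0, 1), (1, 0), (0, -1)]
    return tuple(maze[row + dr][col + dc] == "#" for dr, dc in neighbours)


def aliased_states(maze):
    """Group free cells by observation; keep groups with more than one cell."""
    groups = defaultdict(list)
    for r, line in enumerate(maze):
        for c, ch in enumerate(line):
            if ch == ".":
                groups[observe(maze, r, c)].append((r, c))
    return {obs: cells for obs, cells in groups.items() if len(cells) > 1}


if __name__ == "__main__":
    for obs, cells in aliased_states(MAZE).items():
        print(obs, cells)
```

In this assumed maze, all cells along the two horizontal corridors produce the same observation, as do the two cells in the vertical passages, so a memoryless policy over observations cannot treat them differently. Subgoal-based methods such as HQ-learning aim to resolve this kind of ambiguity by splitting the task into segments.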

Paper Citation


in Harvard Style

Suzuki K. and Kato S. (2019). Hierarchical Reinforcement Learning Introducing Genetic Algorithm for POMDPs Environments. In Proceedings of the 11th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-350-6, pages 318-327. DOI: 10.5220/0007405403180327


in Bibtex Style

@conference{icaart19,
author={Kohei Suzuki and Shohei Kato},
title={Hierarchical Reinforcement Learning Introducing Genetic Algorithm for POMDPs Environments},
booktitle={Proceedings of the 11th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2019},
pages={318-327},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007405403180327},
isbn={978-989-758-350-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Hierarchical Reinforcement Learning Introducing Genetic Algorithm for POMDPs Environments
SN - 978-989-758-350-6
AU - Suzuki K.
AU - Kato S.
PY - 2019
SP - 318
EP - 327
DO - 10.5220/0007405403180327