Estimation of Reward Function Maximizing Learning Efficiency in Inverse Reinforcement Learning

Yuki Kitazato, Sachiyo Arai

2018

Abstract

Inverse Reinforcement Learning (IRL) is a promising framework for estimating a reward function given the behavior of an expert. However, the IRL problem is ill-posed because infinitely many reward functions can be consistent with the expert’s observed behavior. To resolve this issue, IRL algorithms have been proposed to determine alternative choices of the reward function that reproduce the behavior of the expert, but these algorithms do not consider the learning efficiency. In this paper, we propose a new formulation and algorithm for IRL to estimate the reward function that maximizes the learning efficiency. This new formulation is an extension of an existing IRL algorithm, and we introduce a genetic algorithm approach to solve for the new reward function. We show the effectiveness of our approach by comparing the performance of our proposed method against existing algorithms.



Paper Citation


in Harvard Style

Kitazato Y. and Arai S. (2018). Estimation of Reward Function Maximizing Learning Efficiency in Inverse Reinforcement Learning. In Proceedings of the 10th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-275-2, pages 276-283. DOI: 10.5220/0006729502760283


in Bibtex Style

@conference{icaart18,
author={Yuki Kitazato and Sachiyo Arai},
title={Estimation of Reward Function Maximizing Learning Efficiency in Inverse Reinforcement Learning},
booktitle={Proceedings of the 10th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2018},
pages={276-283},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006729502760283},
isbn={978-989-758-275-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 10th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Estimation of Reward Function Maximizing Learning Efficiency in Inverse Reinforcement Learning
SN - 978-989-758-275-2
AU - Kitazato Y.
AU - Arai S.
PY - 2018
SP - 276
EP - 283
DO - 10.5220/0006729502760283