Reinforcement Learning Considering Worst Case and Equality within Episodes
Toshihiro Matsui
2020
Abstract
Reinforcement learning has been studied as an unsupervised learning framework. The goal of standard reinforcement learning methods is to minimize the total cost, or maximize the total reward, under the optimal policy. In several practical situations, however, the cost or reward values within an episode may need to be equalized. This class of problems can be considered multi-objective, where each part of an episode has its own cost or reward that should be considered separately. In a previous study, this concept was applied to search algorithms for shortest path problems. We investigate how a similar criterion that considers the worst case and the equality of the objectives can be applied to the Q-learning method. Our experimental results demonstrate the effect and influence of optimization with this criterion.
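To make the contrast between a total-cost objective and a worst-case objective concrete, the following is a minimal sketch and not the method of the paper. It assumes a hypothetical deterministic toy graph (`GRAPH`) and swaps the usual additive Q-learning target, cost + future, for a bottleneck target, max(cost, future). This is one simple way to express a worst-case criterion within an episode. The equality aspect mentioned in the abstract is not modeled here, and all names and parameter values are illustrative assumptions.

```python
import random

# Hypothetical toy graph (an assumption for illustration only):
# state -> {action: (next_state, step_cost)}.
# Path "a" has costs 1, 1, 7 (total 9, worst step 7).
# Path "b" has costs 4, 4, 4 (total 12, worst step 4).
GRAPH = {
    "s": {"a": ("a1", 1.0), "b": ("b1", 4.0)},
    "a1": {"go": ("a2", 1.0)},
    "a2": {"go": ("goal", 7.0)},
    "b1": {"go": ("b2", 4.0)},
    "b2": {"go": ("goal", 4.0)},
}


def q_learning(combine, episodes=500, alpha=0.5, epsilon=0.3, seed=0):
    """Tabular Q-learning over costs; `combine` merges step cost and future value."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in GRAPH.items()}
    for _ in range(episodes):
        state = "s"
        while state != "goal":
            actions = list(GRAPH[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = min(actions, key=lambda a: q[state][a])  # lowest cost
            nxt, cost = GRAPH[state][action]
            # Value of the best (lowest-cost) continuation; 0 at the goal.
            future = 0.0 if nxt == "goal" else min(q[nxt].values())
            target = combine(cost, future)
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_first_action(q):
    """Action with the lowest learned cost estimate at the start state."""
    return min(q["s"], key=lambda a: q["s"][a])


# Standard criterion: minimize the total cost of the episode.
total_q = q_learning(lambda cost, future: cost + future)
# Worst-case criterion (illustrative): minimize the largest single step cost.
worst_q = q_learning(lambda cost, future: max(cost, future))

print("total-cost choice:", greedy_first_action(total_q))
print("worst-case choice:", greedy_first_action(worst_q))
```

In this toy setting the two criteria prefer different paths: the additive target favors the path with the smaller sum despite its one expensive step, while the bottleneck target favors the path whose most expensive step is smallest.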
Paper Citation
in Harvard Style
Matsui T. (2020). Reinforcement Learning Considering Worst Case and Equality within Episodes. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-395-7, pages 335-342. DOI: 10.5220/0009178603350342
in Bibtex Style
@conference{icaart20,
author={Toshihiro Matsui},
title={Reinforcement Learning Considering Worst Case and Equality within Episodes},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2020},
pages={335-342},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009178603350342},
isbn={978-989-758-395-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - Reinforcement Learning Considering Worst Case and Equality within Episodes
SN - 978-989-758-395-7
AU - Matsui T.
PY - 2020
SP - 335
EP - 342
DO - 10.5220/0009178603350342