Upper Confident Bound Fuzzy Q-learning and Its Application to a Video Game
Takahiro Morita, Hiroshi Hosobe
2022
Abstract
This paper proposes upper confident bound (UCB) fuzzy Q-learning, which combines fuzzy Q-learning with the UCBQ algorithm, and applies it to a video game. The UCBQ algorithm adapts the UCB action selection method to Q-learning. The UCB algorithm selects the action with the highest UCB value rather than the highest value estimate. Because it presumes that every unselected action is eventually selected and given a value estimate, few actions remain unselected, which helps keep the learner from settling on a locally optimal solution. The proposed method aims to make fuzzy Q-learning more efficient by reducing the number of unselected actions and preventing the Q values from converging to a local optimum. This paper applies the proposed method to the video game Ms. PacMan and presents the results of an experiment to find optimum values for the method. The method is evaluated by comparing its game scores with those obtained by a previous fuzzy Q-learning method. The results show that the proposed method significantly reduced the number of unselected actions.
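The UCB selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact UCB formula or constants used in the paper, so the sketch assumes the standard UCB1 bonus, `c * sqrt(ln t / n)`; the function name `ucb_select`, its parameters, and the default exploration weight `c` are hypothetical and are not taken from the paper.

```python
import math


def ucb_select(q_values, counts, total_steps, c=2.0):
    """Return the index of the action with the highest UCB score.

    q_values:    current value estimate for each action
    counts:      number of times each action has been selected so far
    total_steps: total number of selections made so far
    c:           exploration weight (assumed value, not from the paper)
    """
    # An action that has never been selected is tried first,
    # so that every action obtains a value estimate.
    for action, n in enumerate(counts):
        if n == 0:
            return action

    # Assumed UCB1-style score: value estimate plus an exploration bonus
    # that shrinks as the action is selected more often.
    scores = [
        q + c * math.sqrt(math.log(total_steps) / n)
        for q, n in zip(q_values, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

Under these assumptions, a rarely selected action can win over one with a higher value estimate, which is how this kind of rule reduces the number of unselected actions. Setting `c=0.0` falls back to purely greedy selection on the value estimates.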
Paper Citation
in Harvard Style
Morita T. and Hosobe H. (2022). Upper Confident Bound Fuzzy Q-learning and Its Application to a Video Game. In Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART, ISBN 978-989-758-547-0, pages 454-461. DOI: 10.5220/0010835700003116
in Bibtex Style
@conference{icaart22,
author={Takahiro Morita and Hiroshi Hosobe},
title={Upper Confident Bound Fuzzy Q-learning and Its Application to a Video Game},
booktitle={Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2022},
pages={454-461},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010835700003116},
isbn={978-989-758-547-0},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Upper Confident Bound Fuzzy Q-learning and Its Application to a Video Game
SN - 978-989-758-547-0
AU - Morita T.
AU - Hosobe H.
PY - 2022
SP - 454
EP - 461
DO - 10.5220/0010835700003116