Authors:
Tomihiro Kimura and Ikeda Kokolo
Affiliation:
Japan Advanced Institute of Science and Technology (JAIST), Ishikawa, Japan
Keyword(s):
Turn-based Strategy Games, Deep Neural Network, Deep Reinforcement Learning, Policy Network, Value Network, AlphaZero, Residual Network.
Abstract:
The development of AlphaGo has increased researchers' interest in applying deep learning and reinforcement learning to games. However, applying the AlphaZero algorithm to games with complex data structures and vast search spaces, such as turn-based strategy games, poses technical challenges. In particular, representing complex game data with neural networks results in very long learning times. This study discusses methods that accelerate the learning of neural networks by solving the neural network data representation problem with a search tree. The proposed algorithm performs better than existing methods such as Monte Carlo Tree Search (MCTS). Because learning data are generated automatically through self-play, no large learning database is required beforehand. Moreover, the algorithm achieves a win rate of more than 85% against conventional algorithms on a new map that was not used for learning.