An ML Agent using the Policy Gradient Method to win a SoccerTwos Game
Victor Pugliese
2022
Abstract
We conducted an investigative study of Policy Gradient methods combined with Curriculum Learning in video games, motivated by a customized SoccerTwos environment that professors at the Federal University of Goiás created to evaluate the Machine Learning agents of students in a Reinforcement Learning course. We employed PPO and SAC as the state-of-the-art methods in the on-policy and off-policy settings, respectively. Curriculum Learning could further improve performance, on the premise that it is easier to teach people tasks in order of gradually increasing complexity than in random order. By combining these approaches, we propose agents intended to win more matches than their adversaries. We measured the results by the minimum, maximum, and mean rewards and by the mean episode length at each checkpoint. PPO achieved the best result with Curriculum Learning, which modified the players’ (position and rotation) and the ball’s (speed and position) settings at time intervals. It also used fewer training hours than the other experiments.
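As a rough illustration of the evaluation described in the abstract, the per-checkpoint statistics (minimum, maximum, and mean reward, plus mean episode length) could be computed as in the sketch below. The function name `summarize_checkpoint`, the dictionary keys, and the sample numbers are assumptions made for illustration; they are not taken from the paper or from any particular training toolkit.

```python
from statistics import mean


def summarize_checkpoint(episode_rewards, episode_lengths):
    """Summarize one checkpoint from per-episode rewards and lengths.

    Hypothetical helper: returns the four statistics named in the abstract.
    """
    if not episode_rewards or len(episode_rewards) != len(episode_lengths):
        raise ValueError("expected two non-empty lists of equal length")
    return {
        "reward_min": min(episode_rewards),
        "reward_max": max(episode_rewards),
        "reward_mean": mean(episode_rewards),
        "episode_length_mean": mean(episode_lengths),
    }


# Illustrative numbers only; these are not results from the paper.
summary = summarize_checkpoint([-1.0, 0.5, 1.0, 1.5], [400, 250, 200, 150])
print(summary)
```

Comparing such summaries across checkpoints is one simple way to see whether rewards rise and episodes shorten as training progresses.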
Paper Citation
in Harvard Style
Pugliese V. (2022). An ML Agent using the Policy Gradient Method to win a SoccerTwos Game. In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-758-569-2, pages 628-633. DOI: 10.5220/0011108400003179
in Bibtex Style
@conference{iceis22,
author={Victor Pugliese},
title={An ML Agent using the Policy Gradient Method to win a SoccerTwos Game},
booktitle={Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2022},
pages={628-633},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011108400003179},
isbn={978-989-758-569-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - An ML Agent using the Policy Gradient Method to win a SoccerTwos Game
SN - 978-989-758-569-2
AU - Pugliese V.
PY - 2022
SP - 628
EP - 633
DO - 10.5220/0011108400003179