ACKNOWLEDGEMENTS
We thank the CEIA/UFG professors for providing the game
environment and for their support during the
Reinforcement Learning course.