CAV (1), volume 13371 of Lecture Notes in Computer
Science, pages 193–218. Springer.
Jothimurugan, K., Bansal, S., Bastani, O., and Alur, R.
(2022). Specification-guided learning of Nash equi-
libria with high social welfare. In CAV (2), volume
13372 of Lecture Notes in Computer Science, pages
343–363. Springer.
Kaelbling, L. P., Littman, M. L., and Moore, A. W. (1996).
Reinforcement learning: A survey. Journal of Artifi-
cial Intelligence Research, 4:237–285.
Khan, A., Zhang, C., Li, S., Wu, J., Schlotfeldt, B., Tang,
S. Y., Ribeiro, A., Bastani, O., and Kumar, V. (2019).
Learning safe unlabeled multi-robot planning with
motion constraints. In IROS, pages 7558–7565. IEEE.
Kučera, A. (2011). Turn-based stochastic games. In Lec-
tures in Game Theory for Computer Scientists, pages
146–184. Cambridge University Press.
Kwiatkowska, M., Norman, G., and Parker, D. (2019). Ver-
ification and control of turn-based probabilistic real-
time games. In The Art of Modelling Computational
Systems, volume 11760 of Lecture Notes in Computer
Science, pages 379–396. Springer.
Kwiatkowska, M., Norman, G., Parker, D., and San-
tos, G. (2022). Symbolic verification and strategy
synthesis for turn-based stochastic games. CoRR,
abs/2211.06141.
Kwiatkowska, M., Parker, D., and Wiltsche, C. (2018).
PRISM-games: verification and strategy synthesis for
stochastic multi-player games with multiple objec-
tives. Int. J. Softw. Tools Technol. Transf., 20(2):195–
210.
Kwiatkowska, M. Z., Norman, G., and Parker, D. (2011).
PRISM 4.0: Verification of probabilistic real-time sys-
tems. In CAV, volume 6806 of Lecture Notes in Com-
puter Science, pages 585–591. Springer.
Lee, S. and Togelius, J. (2017). Showdown AI competition.
In CIG, pages 191–198. IEEE.
Li, J., Zhou, Y., Ren, T., and Zhu, J. (2020). Exploration
analysis in finite-horizon turn-based stochastic games.
In UAI, volume 124 of Proceedings of Machine Learn-
ing Research, pages 201–210. AUAI Press.
Littman, M. L., Topcu, U., Fu, J., Isbell, C., Wen, M., and
MacGlashan, J. (2017). Environment-independent
task specifications via GLTL. CoRR, abs/1704.04341.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A.,
Antonoglou, I., Wierstra, D., and Riedmiller, M. A.
(2013). Playing Atari with deep reinforcement learn-
ing. CoRR, abs/1312.5602.
Nam, S., Hsueh, C., and Ikeda, K. (2022). Generation
of game stages with quality and diversity by rein-
forcement learning in turn-based RPG. IEEE Trans.
Games, 14(3):488–501.
Pagalyte, E., Mancini, M., and Climent, L. (2020). Go with
the flow: Reinforcement learning in turn-based battle
video games. In IVA, pages 44:1–44:8. ACM.
Riley, J., Calinescu, R., Paterson, C., Kudenko, D., and
Banks, A. (2021a). Reinforcement learning with
quantitative verification for assured multi-agent poli-
cies. In 13th International Conference on Agents and
Artificial Intelligence. York.
Riley, J., Calinescu, R., Paterson, C., Kudenko, D., and
Banks, A. (2021b). Utilising assured multi-agent re-
inforcement learning within safety-critical scenarios.
In KES, volume 192 of Procedia Computer Science,
pages 1061–1070. Elsevier.
Sadigh, D., Kim, E. S., Coogan, S., Sastry, S. S., and Se-
shia, S. A. (2014). A learning based approach to con-
trol synthesis of Markov decision processes for linear
temporal logic specifications. In CDC, pages 1091–
1096. IEEE.
Shahrampour, S., Rakhlin, A., and Jadbabaie, A. (2017).
Multi-armed bandits in multi-agent networks. In
ICASSP, pages 2786–2790. IEEE.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L.,
van den Driessche, G., Schrittwieser, J., Antonoglou,
I., Panneershelvam, V., Lanctot, M., Dieleman, S.,
Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I.,
Lillicrap, T. P., Leach, M., Kavukcuoglu, K., Graepel,
T., and Hassabis, D. (2016). Mastering the game of
Go with deep neural networks and tree search. Nat.,
529(7587):484–489.
Švelch, J. (2020). Should the monster play fair?: Recep-
tion of artificial intelligence in Alien: Isolation. Game
Stud., 20(2).
Vamplew, P., Smith, B. J., Källström, J., de Oliveira Ramos,
G., Rădulescu, R., Roijers, D. M., Hayes, C. F.,
Heintz, F., Mannion, P., Libin, P. J. K., Dazeley, R.,
and Foale, C. (2022). Scalar reward is not enough:
a response to Silver, Singh, Precup and Sutton (2021).
Auton. Agents Multi Agent Syst., 36(2):41.
Videgaín, S. and García-Sánchez, P. (2021). Performance
study of minimax and reinforcement learning agents
playing the turn-based game Iwoki. Appl. Artif. Intell.,
35(10):717–744.
Wang, Y., Roohi, N., West, M., Viswanathan, M., and
Dullerud, G. E. (2020). Statistically model checking
PCTL specifications on Markov decision processes
via reinforcement learning. In CDC, pages 1392–
1397. IEEE.
Wender, S. and Watson, I. D. (2008). Using reinforcement
learning for city site selection in the turn-based strat-
egy game Civilization IV. In CIG, pages 372–377.
IEEE.
Wong, A., Bäck, T., Kononova, A. V., and Plaat, A. (2022).
Deep multiagent reinforcement learning: Challenges
and directions. Artificial Intelligence Review, pages
1–34.
Turn-Based Multi-Agent Reinforcement Learning Model Checking