
Justesen, N., Torrado, R. R., Bontrager, P., Khalifa, A., Togelius, J., and Risi, S. (2018). Illuminating generalization in deep reinforcement learning through procedural level generation. In NeurIPS 2018 Workshop on Deep RL.
Kharyal, C., Krishna Gottipati, S., Kumar Sinha, T., Das,
S., and Taylor, M. E. (2024). GLIDE-RL: Grounded
language instruction through DEmonstration in RL.
https://arxiv.org/abs/2401.02991.
Kirk, R., Zhang, A., Grefenstette, E., and Rocktäschel, T. (2023). A survey of zero-shot generalisation in deep reinforcement learning. Journal Artif. Intell. Research, 76:201–264.
Koyamada, S., Okano, S., Nishimori, S., Murata, Y., Habara, K., Kita, H., and Ishii, S. (2023). Pgx: Hardware-accelerated parallel game simulators for reinforcement learning. In Adv. Neural Inf. Proc. Syst.
Kuhlmann, G. and Stone, P. (2007). Graph-based domain mapping for transfer learning in general games. In Kok, J., Koronacki, J., Mantaras, R., Matwin, S., Mladenič, D., and Skowron, A., editors, Mach. Learn.: ECML 2007, volume 4071 of LNCS, pages 188–200. Springer, Berlin, Heidelberg.
Lange, R. T. (2022). gymnax: A JAX-based reinforcement
learning environment library.
Lange, S. and Riedmiller, M. (2010). Deep auto-encoder neural networks in reinforcement learning. In Neural Networks, Int. Joint Conf. 2010, pages 1623–1630. IEEE.
Lee, J. N., Xie, A., Pacchiano, A., Chandak, Y., Finn, C., Nachum, O., and Brunskill, E. (2023). Supervised pretraining can learn in-context reinforcement learning. https://arxiv.org/abs/2306.14892.
Lifschitz, S., Paster, K., Chan, H., Ba, J., and McIlraith, S. (2023). Steve-1: A generative model for text-to-behavior in Minecraft. https://arxiv.org/abs/2306.00937.
Love, N., Hinrichs, T., Haley, D., Schkufza, E., and Genesereth, M. (2008). General game playing: Game description language specification. Technical Report LG-2006-01, Stanford Logic Group.
Luketina, J., Nardelli, N., Farquhar, G., Foerster, J., Andreas, J., Grefenstette, E., Whiteson, S., and Rocktäschel, T. (2019). A survey of reinforcement learning informed by natural language. In Proc. 28th Int. Joint Conf. Artif. Intell., IJCAI-19, pages 6309–6317.
Machado, M. C., Bellemare, M. G., Talvitie, E., Veness, J.,
Hausknecht, M., and Bowling, M. (2018). Revisiting
the arcade learning environment: Evaluation protocols
and open problems for general agents. Journal Artif.
Intell. Research, 61:523–562.
Malik, D., Li, Y., and Ravikumar, P. (2021). When is generalizable reinforcement learning tractable? In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Adv. Neural Inf. Proc. Syst., volume 34, pages 8032–8045. Curran Associates, Inc.
Mannor, S. and Tamar, A. (2023). Towards deployable RL
– what’s broken with RL research and a potential fix.
https://arxiv.org/abs/2301.01320.
Maras, M., Kępa, M., Kowalski, J., and Szykuła, M. (2024). Fast and knowledge-free deep learning for general game playing (student abstract). In Proc. AAAI Conf. Artif. Intell., volume 38, pages 23576–23578.
Marcus, G., Leivada, E., and Murphy, E. (2023). A sentence is worth a thousand pictures: Can large language models understand human language? https://arxiv.org/abs/2308.00109.
McDermott, D., Ghallab, M., Howe, A., Knoblock, C., Ram, A., Veloso, M., Weld, D., and Wilkins, D. (1998). PDDL—the planning domain definition language. Technical Report CVC TR98003/DCS TR1165, New Haven, CT: Yale Center for Computational Vision and Control.
Mernik, M., Heering, J., and Sloane, A. M. (2005). When
and how to develop domain-specific languages. ACM
Computing Surveys, 37(4):316–344.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. https://arxiv.org/abs/1312.5602.
Nichol, A., Pfau, V., Hesse, C., Klimov, O., and Schulman, J. (2018). Gotta learn fast: A new benchmark for generalization in RL. https://arxiv.org/abs/1804.03720.
OpenAI (2022). Introducing ChatGPT. https://openai.com/blog/chatgpt. Accessed: 2024-01-02.
Oswald, J., Srinivas, K., Kokel, H., Lee, J., Katz, M., and Sohrabi, S. (2024). Large language models as planning domain generators. In Proc. Int. Conf. Automated Planning and Sched., volume 34, pages 423–431.
Parker-Holder, J., Rajan, R., Song, X., Biedenkapp, A., Miao, Y., Eimer, T., Zhang, B., Nguyen, V., Calandra, R., Faust, A., Hutter, F., and Lindauer, M. (2022). Automated reinforcement learning (AutoRL): A survey and open problems. Journal Artif. Intell. Research, 74:517–568.
Patterson, A., Neumann, S., White, M., and White, A.
(2023). Empirical design in reinforcement learning.
https://arxiv.org/abs/2304.01315.
Piette, É., Soemers, D. J. N. J., Stephenson, M., Sironi, C. F., Winands, M. H. M., and Browne, C. (2020). Ludii – the ludemic general game system. In Giacomo, G. D., Catala, A., Dilkina, B., Milano, M., Barro, S., Bugarín, A., and Lang, J., editors, Proc. 24th Eur. Conf. Artif. Intell., volume 325 of Frontiers in Artificial Intell. and Appl., pages 411–418. IOS Press.
Raparthy, S. C., Hambro, E., Kirk, R., Henaff, M., and Raileanu, R. (2023). Generalization to new sequential decision making tasks with in-context learning. https://arxiv.org/abs/2312.03801.
Reed, S., Żołna, K., Parisotto, E., Colmenarejo, S. G., Novikov, A., Barth-Maron, G., Giménez, M., Sulsky, Y., Kay, J., Springenberg, J. T., Eccles, T., Bruce, J., Razavi, A., Edwards, A., Heess, N., Chen, Y., Hadsell, R., Vinyals, O., Bordbar, M., and de Freitas, N. (2023). A generalist agent. Trans. Mach. Learn. Research.
Riedmiller, M. (2005). Neural fitted Q iteration – first experiences with a data efficient neural reinforcement learning method.