Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness,
J., Bellemare, M. G., Graves, A., Riedmiller,
M., Fidjeland, A. K., Ostrovski, G., Petersen, S.,
Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran,
D., Wierstra, D., Legg, S., and Hassabis,
D. (2015). Human-level control through deep
reinforcement learning. Nature, 518(7540):529–533.
OpenAI, Berner, C., Brockman, G., Chan, B., Cheung,
V., Debiak, P., Dennison, C., Farhi, D., Fischer, Q.,
Hashme, S., Hesse, C., Józefowicz, R., Gray, S., Olsson,
C., Pachocki, J., Petrov, M., Pinto, H. P. d. O.,
Raiman, J., Salimans, T., Schlatter, J., Schneider, J.,
Sidor, S., Sutskever, I., Tang, J., Wolski, F., and
Zhang, S. (2019). Dota 2 with Large Scale Deep
Reinforcement Learning. arXiv:1912.06680 [cs, stat].
Perolat, J., Leibo, J. Z., Zambaldi, V., Beattie, C., Tuyls,
K., and Graepel, T. (2017). A multi-agent reinforcement
learning model of common-pool resource
appropriation. arXiv:1707.06600 [cs, q-bio].
Phan, T., Belzner, L., Gabor, T., and Schmid, K.
(2018). Leveraging Statistical Multi-Agent Online
Planning with Emergent Value Function
Approximation. arXiv:1804.06311 [cs].
Schmid, K., Belzner, L., Gabor, T., and Phan, T. (2018).
Action Markets in Deep Multi-Agent Reinforcement
Learning. In Kůrková, V., Manolopoulos, Y., Hammer,
B., Iliadis, L., and Maglogiannis, I., editors,
Artificial Neural Networks and Machine Learning –
ICANN 2018, Lecture Notes in Computer Science,
pages 240–249, Cham. Springer International Publishing.
Schmid, K., Belzner, L., and Linnhoff-Popien, C. (2021a).
Learning to penalize other learning agents. MIT Press.
Schmid, K., Belzner, L., Müller, R., Tochtermann, J.,
and Linnhoff-Popien, C. (2021b). Stochastic Market
Games. In Zhou, Z.-H., editor, Proceedings of the
Thirtieth International Joint Conference on Artificial
Intelligence, IJCAI-21, pages 384–390. International
Joint Conferences on Artificial Intelligence Organization.
Schmid, K., Belzner, L., Phan, T., Gabor, T., and Linnhoff-
Popien, C. (2020). Multi-agent Reinforcement Learning
for Bargaining under Risk and Asymmetric
Information. Pages: 151.
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I.,
Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M.,
Bolton, A., Chen, Y., Lillicrap, T., Hui, F., Sifre, L.,
van den Driessche, G., Graepel, T., and Hassabis, D.
(2017). Mastering the game of Go without human
knowledge. Nature, 550(7676):354–359.
Statman, M. (2004). The Diversification Puzzle. Financial
Analysts Journal, 60(4):44–53.
https://doi.org/10.2469/faj.v60.n4.2636.
Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu,
M., Dudzik, A., Chung, J., Choi, D. H., Powell, R.,
Ewalds, T., Georgiev, P., Oh, J., Horgan, D., Kroiss,
M., Danihelka, I., Huang, A., Sifre, L., Cai, T.,
Agapiou, J. P., Jaderberg, M., Vezhnevets, A. S., Leblond,
R., Pohlen, T., Dalibard, V., Budden, D., Sulsky, Y.,
Molloy, J., Paine, T. L., Gulcehre, C., Wang, Z.,
Pfaff, T., Wu, Y., Ring, R., Yogatama, D., Wünsch,
D., McKinney, K., Smith, O., Schaul, T., Lillicrap,
T., Kavukcuoglu, K., Hassabis, D., Apps, C., and
Silver, D. (2019). Grandmaster level in StarCraft
II using multi-agent reinforcement learning. Nature,
575(7782):350–354.
Wang, J. X., Hughes, E., Fernando, C., Czarnecki, W. M.,
Duenez-Guzman, E. A., and Leibo, J. Z. (2019).
Evolving intrinsic motivations for altruistic behavior.
arXiv:1811.05931 [cs].
Yang, J., Li, A., Farajtabar, M., Sunehag, P., Hughes, E.,
and Zha, H. (2020). Learning to Incentivize Other
Learning Agents. arXiv:2006.06051 [cs, stat].
ICAART 2023 - 15th International Conference on Agents and Artificial Intelligence