
REFERENCES
Choi, E., Lazaridou, A., and de Freitas, N. (2018). Compositional obverter communication learning from raw visual input. In International Conference on Learning Representations.

Christianos, F., Schäfer, L., and Albrecht, S. V. (2020). Shared experience actor-critic for multi-agent reinforcement learning. In Advances in Neural Information Processing Systems (NeurIPS).

Das, A., Gervet, T., Romoff, J., Batra, D., Parikh, D., Rabbat, M., and Pineau, J. (2019). TarMAC: Targeted multi-agent communication. In International Conference on Machine Learning, pages 1538–1546. PMLR.

Eccles, T., Bachrach, Y., Lever, G., Lazaridou, A., and Graepel, T. (2019). Biases for emergent communication in multi-agent reinforcement learning. Advances in Neural Information Processing Systems, 32.

Fitch, W. T. (2010). The Evolution of Language. Cambridge University Press.

Foerster, J., Assael, I. A., De Freitas, N., and Whiteson, S. (2016). Learning to communicate with deep multi-agent reinforcement learning. Advances in Neural Information Processing Systems, 29.

Hansen, E. A., Bernstein, D. S., and Zilberstein, S. (2004). Dynamic programming for partially observable stochastic games. In AAAI, volume 4, pages 709–715.

Hornik, K., Feinerer, I., Kober, M., and Buchta, C. (2012). Spherical k-means clustering. Journal of Statistical Software, 50:1–22.

Lin, T., Huh, J., Stauffer, C., Lim, S. N., and Isola, P. (2021). Learning to ground multi-agent communication with autoencoders. Advances in Neural Information Processing Systems, 34:15230–15242.

Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In Machine Learning Proceedings 1994, pages 157–163. Elsevier.

Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137.

Lowe, R., Wu, Y. I., Tamar, A., Harb, J., Pieter Abbeel, O., and Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. Advances in Neural Information Processing Systems, 30.

Mordatch, I. and Abbeel, P. (2018). Emergence of grounded compositional language in multi-agent populations. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32.

Nguyen, T. Q. and Salazar, J. (2019). Transformers without tears: Improving the normalization of self-attention. In Proceedings of the 16th International Conference on Spoken Language Translation.

Papoudakis, G., Christianos, F., Schäfer, L., and Albrecht, S. V. (2021). Benchmarking multi-agent deep reinforcement learning algorithms in cooperative tasks. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1).

Salzman, O. and Stern, R. (2020). Research challenges and opportunities in multi-agent path finding and multi-agent pickup and delivery problems. In Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, pages 1711–1715.

Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.

Sculley, D. (2010). Web-scale k-means clustering. In Proceedings of the 19th International Conference on World Wide Web, pages 1177–1178.

Shalev-Shwartz, S., Shammah, S., and Shashua, A. (2016). Safe, multi-agent, reinforcement learning for autonomous driving. CoRR, abs/1610.03295.

Shapley, L. S. (1953). Stochastic games. Proceedings of the National Academy of Sciences, 39(10):1095–1100.

Sukhbaatar, S., Fergus, R., et al. (2016). Learning multiagent communication with backpropagation. Advances in Neural Information Processing Systems, 29.

Tallerman, M. E. (2005). Language Origins: Perspectives on Evolution. Oxford University Press.

Tan, M. (1993). Multi-agent reinforcement learning: Independent versus cooperative agents. In ICML.

Vanneste, A., Vanneste, S., Mets, K., De Schepper, T., Mercelis, S., Latré, S., and Hellinckx, P. (2022). An analysis of discretization methods for communication learning with multi-agent reinforcement learning. arXiv preprint arXiv:2204.05669.

Von Frisch, K. (1992). Decoding the language of the bee. Nobel Lectures, page 76.

Wang, F., Xiang, X., Cheng, J., and Yuille, A. L. (2017). NormFace: L2 hypersphere embedding for face verification. In Proceedings of the 25th ACM International Conference on Multimedia, pages 1041–1049.

Wang, Y. and Sartoretti, G. (2022). FCMNet: Full communication memory net for team-level cooperation in multi-agent systems. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, pages 1355–1363.

Yu, C., Velu, A., Vinitsky, E., Wang, Y., Bayen, A., and Wu, Y. (2021). The surprising effectiveness of PPO in cooperative, multi-agent games. arXiv preprint arXiv:2103.01955.

Zhai, A. and Wu, H. (2019). Classification is a strong baseline for deep metric learning. In 30th British Machine Vision Conference 2019, BMVC 2019, Cardiff, UK, September 9–12, 2019, page 91. BMVA Press.

Zhang, K., Yang, Z., Liu, H., Zhang, T., and Basar, T. (2018). Fully decentralized multi-agent reinforcement learning with networked agents. In International Conference on Machine Learning, pages 5872–5881. PMLR.
ICAART 2024 - 16th International Conference on Agents and Artificial Intelligence