
the transfer learning approach, where agents will be initialized with prior knowledge of optimal bidding strategies learned in auctions with fewer players. Because the agents inherit the weights from that earlier training, this initialization should provide a better starting point than random initialization; a minimal sketch of such a warm start follows.
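To make the warm start concrete, the sketch below shows one plausible realization in Python, assuming PyTorch policy networks; the BidderPolicy architecture, its layer sizes, and the checkpoint name small_auction_policy.pt are illustrative assumptions, not details of our implementation.

import torch
import torch.nn as nn

class BidderPolicy(nn.Module):
    """Illustrative bidding policy network (layer sizes are assumptions)."""

    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # outputs a scalar bid
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

# Hypothetical checkpoint: a state dict trained in an auction with fewer players.
pretrained_state = torch.load("small_auction_policy.pt")

# Warm start: copy the pretrained weights into a fresh agent instead of
# relying on random initialization. strict=False tolerates layers that
# differ between the small- and large-player settings.
agent = BidderPolicy(obs_dim=3)
agent.load_state_dict(pretrained_state, strict=False)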
An important direction for future research is the application of this methodology to scoring auctions, which have significant practical implications, particularly in the Brazilian context. For example, scoring auctions have been used in Brazil to allocate oil exploration rights, as detailed by Sant'Anna (2017). In these auctions, bidders submit multidimensional bids, consisting of a monetary bonus and an exploratory program, and a nonlinear scoring rule determines the winner; a sketch of such a rule appears below.
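For illustration, the following is a minimal Python sketch of a nonlinear scoring rule over such two-dimensional bids; the weights and the concave (logarithmic) transform of the work program are assumptions chosen for exposition, not the rule actually used in the Brazilian auctions.

import math
from dataclasses import dataclass

@dataclass
class ScoringBid:
    """Two-dimensional bid: a monetary bonus plus an exploratory program."""
    bonus: float       # signing bonus, in monetary units
    work_units: float  # committed exploratory effort, in work units

def score(bid: ScoringBid, w_bonus: float = 0.4, w_work: float = 0.6) -> float:
    """Hypothetical nonlinear scoring rule: linear in the bonus, concave in
    the work program. Weights and the log transform are illustrative only."""
    return w_bonus * bid.bonus + w_work * math.log1p(bid.work_units)

# The winner is the bidder whose multidimensional bid maximizes the score.
bids = [ScoringBid(bonus=10.0, work_units=5.0),
        ScoringBid(bonus=8.0, work_units=12.0)]
winner = max(range(len(bids)), key=lambda i: score(bids[i]))

Under a concave transform like this, additional work commitments raise the score at a diminishing rate, which is one way the nonlinearity of the rule can shape bidding incentives.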
This format introduces unique challenges and opportunities for modeling and evaluation: estimating the distribution of primitive variables, such as tract values and exploration commitment costs, enables counterfactual analysis of revenue under alternative bidding schemes. By adapting our tool to this context, we aim to explore its ability to handle the complexities of multidimensional scoring rules and to assess its utility in evaluating and optimizing such auction mechanisms. Addressing these complexities will be crucial to advancing our understanding of multi-agent dynamics and improving auction design.
REFERENCES
Bichler, M., Fichtl, M., Heidekrüger, S., Kohring, N., and Sutterer, P. (2021). Learning equilibria in symmetric auction games using artificial neural networks. Nature Machine Intelligence, 3(8):687–695.
Dechenaux, E., Kovenock, D., and Sheremeta, R. M. (2015). A survey of experimental research on contests, all-pay auctions and tournaments. Experimental Economics, 18:609–669.
Dragoni, N. and Gaspari, M. (2012). Declarative specification of fault tolerant auction protocols: The English auction case study. Computational Intelligence, 28(4):617–641.
Dütting, P., Feng, Z., Narasimhan, H., Parkes, D. C., and Ravindranath, S. S. (2021). Optimal auctions through deep learning. Communications of the ACM, 64(8):109–116.
Ewert, M., Heidekrüger, S., and Bichler, M. (2022). Approaching the overbidding puzzle in all-pay auctions: Explaining human behavior through Bayesian optimization and equilibrium learning. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, pages 1586–1588.
Frahm, D. G. and Schrader, L. F. (1970). An experimental comparison of pricing in two auction systems. American Journal of Agricultural Economics, 52(4):528–534.
Gemp, I., Anthony, T., Kramar, J., Eccles, T., Tacchetti, A., and Bachrach, Y. (2022). Designing all-pay auctions using deep learning and multi-agent simulation. Scientific Reports, 12(1):16937.
Kannan, K. N., Pamuru, V., and Rosokha, Y. (2019). Using machine learning for modeling human behavior and analyzing friction in generalized second price auctions. Available at SSRN 3315772.
Klemperer, P. (1999). Auction theory: A guide to the literature. Journal of Economic Surveys, 13(3):227–286.
Krishna, V. (2009). Auction theory. Academic Press.
Luong, N. C., Xiong, Z., Wang, P., and Niyato, D. (2018). Optimal auction for edge computing resource management in mobile blockchain networks: A deep learning approach. In 2018 IEEE International Conference on Communications (ICC), pages 1–6. IEEE.
Menezes, F. and Monteiro, P. (2008). An introduction to auction theory. Oxford University Press.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., and Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. arXiv preprint arXiv:1602.01783.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540):529–533.
Noussair, C. and Silver, J. (2006). Behavior in all-pay auctions with incomplete information. Games and Economic Behavior, 55(1):189–206.
Riley, J. G. and Samuelson, W. F. (1981). Optimal auctions.
The American Economic Review, 71(3):381–392.
Sant’Anna, M. C. B. (2017). Empirical analysis of scoring
auctions for oil and gas leases.
Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., et al. (2020). Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839):604–609.
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
Shoham, Y. and Leyton-Brown, K. (2008). Multiagent systems: Algorithmic, game-theoretic, and logical foundations. Cambridge University Press.
Sutton, R. S. and Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
Zheng, S. and Liu, H. (2019). Improved multi-agent deep deterministic policy gradient for path planning-based crowd simulation. IEEE Access, 7:147755–147770.