
(2022). Quantum deep reinforcement learning for
robot navigation tasks.
Helle, P., Feo-Arenis, S., Shortt, K., and Strobel, C.
(2022a). Decentralized collaborative decision-making
for topology building in mobile ad-hoc networks. In
2022 Thirteenth International Conference on Ubiqui-
tous and Future Networks (ICUFN), pages 233–238.
Helle, P., Feo-Arenis, S., Strobel, C., and Shortt, K.
(2022b). Agent-based modelling and simulation of
decision-making in flying ad-hoc networks. In Ad-
vances in Practical Applications of Agents, Multi-
Agent Systems, and Complex Systems Simulation. The
PAAMS Collection, pages 242–253. Springer Interna-
tional Publishing.
Hu, S., Zhong, Y., Gao, M., Wang, W., Dong, H., Liang,
X., Li, Z., Chang, X., and Yang, Y. (2023). MARLlib: A
scalable and efficient multi-agent reinforcement learn-
ing library.
Khan, M. F., Yau, K.-L. A., Noor, R. M., and Imran, M. A.
(2020). Routing schemes in FANETs: A survey. Sensors,
20(1).
Kim, T., Lee, S., Kim, K. H., and Jo, Y.-I. (2023). FANET
routing protocol analysis for multi-UAV-based recon-
naissance mobility models. Drones, 7(3).
Kingma, D. P. and Ba, J. (2017). Adam: A method for
stochastic optimization.
Kölle, M., Topp, F., Phan, T., Altmann, P., Nüßlein, J., and
Linnhoff-Popien, C. (2024). Multi-agent quantum re-
inforcement learning using evolutionary optimization.
Li, J., Lin, S., Yu, K., and Guo, G. (2022). Quantum
k-nearest neighbor classification algorithm based on
Hamming distance. Quantum Information Processing,
21(1):18.
Lloyd, S., Mohseni, M., and Rebentrost, P. (2013). Quan-
tum algorithms for supervised and unsupervised ma-
chine learning.
McClean, J. R., Boixo, S., Smelyanskiy, V. N., Babbush,
R., and Neven, H. (2018). Barren plateaus in quantum
neural network training landscapes. Nature Communi-
cations, 9(1):4812.
Meyer, N., Ufrecht, C., Periyasamy, M., Scherer, D. D.,
Plinge, A., and Mutschler, C. (2024). A survey on
quantum reinforcement learning.
Moon, W., Park, B., Nengroo, S. H., Kim, T., and Har,
D. (2022). Path planning of cleaning robot with re-
inforcement learning.
Oliehoek, F. A., Amato, C., et al. (2016). A concise
introduction to decentralized POMDPs, volume 1.
Springer.
OpenAI, Berner, C., Brockman, G., Chan, B., Cheung,
V., Debiak, P., Dennison, C., Farhi, D., Fischer, Q.,
Hashme, S., Hesse, C., Jozefowicz, R., Gray, S., Ols-
son, C., Pachocki, J., Petrov, M., d. O. Pinto, H. P.,
Raiman, J., Salimans, T., Schlatter, J., Schneider, J.,
Sidor, S., Sutskever, I., Tang, J., Wolski, F., and
Zhang, S. (2019). Dota 2 with large scale deep re-
inforcement learning.
Park, C., Yun, W. J., Kim, J. P., Rodrigues, T. K., Park, S.,
Jung, S., and Kim, J. (2023a). Quantum multi-agent
actor-critic networks for cooperative mobile access in
multi-UAV systems.
Park, S., Kim, J. P., Park, C., Jung, S., and Kim, J. (2023b).
Quantum multi-agent reinforcement learning for au-
tonomous mobility cooperation.
Preskill, J. (2018). Quantum computing in the NISQ era and
beyond. Quantum, 2:79.
Qie, H., Shi, D., Shen, T., Xu, X., Li, Y., and Wang, L.
(2019). Joint optimization of multi-UAV target assign-
ment and path planning based on multi-agent rein-
forcement learning. IEEE Access, 7:146264–146272.
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and
Klimov, O. (2017). Proximal policy optimization al-
gorithms.
Sim, S., Johnson, P. D., and Aspuru-Guzik, A. (2019). Ex-
pressibility and entangling capability of parameter-
ized quantum circuits for hybrid quantum-classical al-
gorithms. Advanced Quantum Technologies, 2(12).
Skolik, A., Mangini, S., Bäck, T., Macchiavello, C., and
Dunjko, V. (2023). Robustness of quantum reinforce-
ment learning under hardware errors. EPJ Quantum
Technology, 10(1):1–43.
Spall, J. C. (1998). An overview of the simultaneous pertur-
bation method for efficient optimization. Johns Hop-
kins APL Technical Digest, 19(4):482–492.
Ullah, U., Jurado, A. G. O., Gonzalez, I. D., and Garcia-
Zapirain, B. (2022). A fully connected quantum
convolutional neural network for classifying ischemic
cardiopathy. IEEE Access, 10:134592–134605.
Wiebe, N., Kapoor, A., and Svore, K. (2014). Quantum al-
gorithms for nearest-neighbor methods for supervised
and unsupervised learning.
Yu, C., Velu, A., Vinitsky, E., Gao, J., Wang, Y., Bayen,
A., and Wu, Y. (2022). The surprising effectiveness of
PPO in cooperative, multi-agent games.
Yun, W. J., Kim, J. P., Jung, S., Kim, J.-H., and Kim, J.
(2023). Quantum multi-agent actor-critic neural net-
works for internet-connected multi-robot coordination
in smart factory management.
Quantum Multi-Agent Reinforcement Learning for Aerial Ad-Hoc Networks