
Dral, P. O. (2020). Quantum Chemistry in the Age of
Machine Learning. The Journal of Physical Chem-
istry Letters, 11(6):2336–2347. Publisher: American
Chemical Society.
Farhi, E., Goldstone, J., and Gutmann, S. (2014).
A Quantum Approximate Optimization Algorithm.
arXiv:1411.4028 [quant-ph].
Farhi, E. and Harrow, A. W. (2016). Quantum supremacy
through the quantum approximate optimization algo-
rithm.
Gymlibrary, F. F. (2022). Cart pole - gym documentation.
Heimann, D., Hohenfeld, H., Wiebe, F., and Kirchner, F.
(2022). Quantum deep reinforcement learning for
robot navigation tasks.
Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup,
D., and Meger, D. (2017). Deep reinforcement learn-
ing that matters.
Hsiao, J.-Y., Du, Y., Chiang, W.-Y., Hsieh, M.-H., and
Goan, H.-S. (2022). Unentangled quantum reinforce-
ment learning agents in the openai gym.
Jerbi, S., Cornelissen, A., Ozols, M., and Dunjko, V. (2022).
Quantum policy gradient algorithms.
Jerbi, S., Gyurik, C., Marshall, S. C., Briegel, H. J., and
Dunjko, V. (2021). Parametrized quantum policies for
reinforcement learning.
Kingma, D. P. and Ba, J. (2014). Adam: A method for
stochastic optimization.
Kober, J., Bagnell, J. A., and Peters, J. (2013). Reinforce-
ment learning in robotics: A survey. The International
Journal of Robotics Research, 32(11):1238–1274.
Konda, V. and Tsitsiklis, J. (1999). Actor-critic
algorithms. In Solla, S., Leen, T., and Müller, K., editors,
Advances in Neural Information Processing Systems,
volume 12. MIT Press.
Kwak, Y., Yun, W. J., Jung, S., Kim, J.-K., and Kim, J.
(2021). Introduction to quantum reinforcement learn-
ing: Theory and pennylane-based implementation.
Kölle, M., Giovagnoli, A., Stein, J., Mansky, M. B., Hager,
J., and Linnhoff-Popien, C. (2023). Improving conver-
gence for quantum variational classifiers using weight
re-mapping.
Lan, Q. (2021). Variational quantum soft actor-critic.
Lockwood, O. and Si, M. (2020). Reinforcement learning
with quantum variational circuits.
Mari, A., Bromley, T. R., Izaac, J., Schuld, M., and Killo-
ran, N. (2020). Transfer learning in hybrid classical-
quantum neural networks. Quantum, 4:340.
McClean, J. R., Boixo, S., Smelyanskiy, V. N., Babbush,
R., and Neven, H. (2018). Barren plateaus in quantum
neural network training landscapes. Nature Commu-
nications, 9(1).
Meyer, N., Scherer, D. D., Plinge, A., Mutschler, C., and
Hartmann, M. J. (2022a). Quantum policy gradient
algorithm with optimized action decoding.
Meyer, N., Ufrecht, C., Periyasamy, M., Scherer, D. D.,
Plinge, A., and Mutschler, C. (2022b). A survey on
quantum reinforcement learning.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Harley,
T., Lillicrap, T. P., Silver, D., and Kavukcuoglu,
K. (2016). Asynchronous methods for deep
reinforcement learning. In Proceedings of the 33rd
International Conference on Machine Learning -
Volume 48, ICML’16, pages 1928–1937. JMLR.org.
Nielsen, M. and Chuang, I. (2010). Quantum Computation
and Quantum Information: 10th Anniversary Edition.
Cambridge University Press.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J.,
Chanan, G., Killeen, T., Lin, Z., Gimelshein, N.,
Antiga, L., Desmaison, A., Kopf, A., Yang, E., De-
Vito, Z., Raison, M., Tejani, A., Chilamkurthy, S.,
Steiner, B., Fang, L., Bai, J., and Chintala, S. (2019).
Pytorch: An imperative style, high-performance deep
learning library. In Advances in Neural Information
Processing Systems 32, pages 8024–8035. Curran As-
sociates, Inc.
Pérez-Salinas, A., Cervera-Lierta, A., Gil-Fuster, E., and
Latorre, J. I. (2020). Data re-uploading for a universal
quantum classifier. Quantum, 4:226.
Pirandola, S., Andersen, U. L., Banchi, L., Berta, M.,
Bunandar, D., Colbeck, R., Englund, D., Gehring, T.,
Lupo, C., Ottaviani, C., Pereira, J. L., Razavi, M.,
Shaari, J. S., Tomamichel, M., Usenko, V. C., Vallone,
G., Villoresi, P., and Wallden, P. (2020). Advances in
quantum cryptography. Advances in Optics and Pho-
tonics, 12(4):1012.
Preskill, J. (2018). Quantum computing in the NISQ era
and beyond. Quantum, 2:79.
Schuld, M. and Petruccione, F. (2018). Supervised Learn-
ing with Quantum Computers. Springer Publishing
Company, Incorporated, 1st edition.
Sequeira, A., Santos, L. P., and Barbosa, L. S. (2022). Pol-
icy gradients using variational quantum circuits.
Shor, P. W. (1997). Polynomial-time algorithms for prime
factorization and discrete logarithms on a quantum
computer. SIAM Journal on Computing, 26(5):1484–
1509.
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou,
I., Huang, A., Guez, A., Hubert, T., Baker,
L., Lai, M., Bolton, A., Chen, Y., Lillicrap,
T., Hui, F., Sifre, L., van den Driessche, G.,
Graepel, T., and Hassabis, D. (2017). Mastering
the game of Go without human knowledge.
Nature, 550(7676):354–359.
Skolik, A., Jerbi, S., and Dunjko, V. (2022). Quantum
agents in the gym: a variational quantum algorithm
for deep q-learning. Quantum, 6:720.
You, Y., Pan, X., Wang, Z., and Lu, C. (2017). Virtual to
real reinforcement learning for autonomous driving.
CoRR, abs/1704.03952.
ICAART 2024 - 16th International Conference on Agents and Artificial Intelligence