David, A., Jensen, P. G., Larsen, K. G., Mikučionis, M., and Taankvist, J. H. (2015). Uppaal Stratego. In TACAS, pages 206–211. Springer.
Dräger, K., Forejt, V., Kwiatkowska, M. Z., Parker, D., and Ujma, M. (2015). Permissive controller synthesis for probabilistic systems. Log. Methods Comput. Sci., 11(2).
Farazi, N. P., Zou, B., Ahamed, T., and Barua, L. (2021).
Deep reinforcement learning in transportation re-
search: A review. Transportation Research Interdisci-
plinary Perspectives, 11:100425.
Gleave, A., Dennis, M., Wild, C., Kant, N., Levine, S.,
and Russell, S. (2020). Adversarial policies: Attack-
ing deep reinforcement learning. In ICLR. OpenRe-
view.net.
Goodfellow, I. J., Shlens, J., and Szegedy, C. (2015). Ex-
plaining and harnessing adversarial examples. In
ICLR.
Gross, D., Jansen, N., Junges, S., and Pérez, G. A. (2022a). COOL-MC: A comprehensive tool for reinforcement learning and model checking. In SETTA. Springer.
Gross, D., Simão, T. D., Jansen, N., and Pérez, G. A. (2022b). Targeted adversarial attacks on deep reinforcement learning policies via model checking. CoRR, abs/2212.05337.
Hahn, E. M., Perez, M., Schewe, S., Somenzi, F., Trivedi,
A., and Wojtczak, D. (2019). Omega-regular objec-
tives in model-free reinforcement learning. In TACAS
(1), pages 395–412. Springer.
Hansson, H. and Jonsson, B. (1994). A logic for reasoning
about time and reliability. Formal Aspects Comput.,
6(5):512–535.
Hasanbeig, M., Kroening, D., and Abate, A. (2020). Deep
reinforcement learning with temporal logics. In FOR-
MATS, pages 1–22. Springer.
Hensel, C., Junges, S., Katoen, J., Quatmann, T., and Volk,
M. (2022). The probabilistic model checker Storm.
Int. J. Softw. Tools Technol. Transf., 24(4):589–610.
Huang, S. H., Papernot, N., Goodfellow, I. J., Duan, Y.,
and Abbeel, P. (2017). Adversarial attacks on neural
network policies. In ICLR. OpenReview.net.
Ilahi, I., Usama, M., Qadir, J., Janjua, M. U., Al-Fuqaha,
A. I., Hoang, D. T., and Niyato, D. (2022). Chal-
lenges and countermeasures for adversarial attacks on
deep reinforcement learning. IEEE Trans. Artif. In-
tell., 3(2):90–109.
Korkmaz, E. (2021a). Adversarial training blocks general-
ization in neural policies. In NeurIPS 2021 Workshop
on Distribution Shifts: Connecting Methods and Ap-
plications.
Korkmaz, E. (2021b). Investigating vulnerabilities of deep
neural policies. In UAI, pages 1661–1670. AUAI
Press.
Korkmaz, E. (2022). Deep reinforcement learning policies learn shared adversarial features across MDPs. In AAAI, pages 7229–7238. AAAI Press.
Kwiatkowska, M. Z., Norman, G., and Parker, D. (2011).
PRISM 4.0: Verification of probabilistic real-time sys-
tems. In CAV, pages 585–591. Springer.
Lee, X. Y., Esfandiari, Y., Tan, K. L., and Sarkar, S. (2021).
Query-based targeted action-space adversarial poli-
cies on deep reinforcement learning agents. In ICCPS,
pages 87–97. ACM.
Lee, X. Y., Ghadai, S., Tan, K. L., Hegde, C., and Sarkar, S.
(2020). Spatiotemporally constrained action space at-
tacks on deep reinforcement learning agents. In AAAI,
pages 4577–4584. AAAI Press.
Lin, Y., Hong, Z., Liao, Y., Shih, M., Liu, M., and Sun, M.
(2017a). Tactics of adversarial attack on deep rein-
forcement learning agents. In ICLR. OpenReview.net.
Lin, Y., Liu, M., Sun, M., and Huang, J. (2017b). Detect-
ing adversarial attacks on neural network policies with
visual foresight. CoRR, abs/1710.00814.
Liu, Z., Guo, Z., Cen, Z., Zhang, H., Tan, J., Li, B., and
Zhao, D. (2022). On the robustness of safe rein-
forcement learning under observational perturbations.
CoRR, abs/2205.14691.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. A. (2013). Playing Atari with deep reinforcement learning. CoRR, abs/1312.5602.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Ve-
ness, J., Bellemare, M. G., Graves, A., Riedmiller,
M. A., Fidjeland, A., Ostrovski, G., Petersen, S.,
Beattie, C., Sadik, A., Antonoglou, I., King, H., Ku-
maran, D., Wierstra, D., Legg, S., and Hassabis, D.
(2015). Human-level control through deep reinforce-
ment learning. Nat., 518(7540):529–533.
Nakabi, T. A. and Toivanen, P. (2021). Deep reinforcement
learning for energy management in a microgrid with
flexible demand. Sustainable Energy, Grids and Net-
works, 25:100413.
Pinto, L., Davidson, J., Sukthankar, R., and Gupta, A.
(2017). Robust adversarial reinforcement learning. In
ICML, pages 2817–2826. PMLR.
Rakhsha, A., Radanovic, G., Devidze, R., Zhu, X., and
Singla, A. (2020). Policy teaching via environment
poisoning: Training-time adversarial attacks against
reinforcement learning. In ICML, pages 7974–7984.
PMLR.
Sutton, R. S. and Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
Vamplew, P., Smith, B. J., Källström, J., de Oliveira Ramos, G., Rădulescu, R., Roijers, D. M., Hayes, C. F., Heintz, F., Mannion, P., Libin, P. J. K., Dazeley, R., and Foale, C. (2022). Scalar reward is not enough: A response to Silver, Singh, Precup and Sutton (2021). Auton. Agents Multi Agent Syst., 36(2):41.
Yu, M. and Sun, S. (2022). Natural black-box adversar-
ial examples against deep reinforcement learning. In
AAAI, pages 8936–8944. AAAI Press.
Zhang, H., Chen, H., Xiao, C., Li, B., Liu, M., Bon-
ing, D. S., and Hsieh, C. (2020). Robust deep rein-
forcement learning against adversarial perturbations
on state observations. In NeurIPS, pages 21024–
21037.