Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv 1606.01540.
Buddareddygari, P., Zhang, T., Yang, Y., and Ren, Y. (2022). Targeted Attack on Deep RL-based Autonomous Driving with Learned Visual Patterns. In 2022 International Conference on Robotics and Automation (ICRA), pages 10571–10577.
Cao, Y., Xiao, C., Cyr, B., Zhou, Y., Park, W., Rampazzi, S., Chen, Q. A., Fu, K., and Mao, Z. M. (2019). Adversarial Sensor Attack on LiDAR-Based Perception in Autonomous Driving. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, CCS ’19, pages 2267–2281, New York, NY, USA. Association for Computing Machinery.
Carlini, N. and Wagner, D. (2017). Towards Evaluating the Robustness of Neural Networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57.
Chebotar, Y., Hausman, K., Lu, Y., Xiao, T., Kalashnikov, D., Varley, J., Irpan, A., Eysenbach, B., Julian, R., Finn, C., and Levine, S. (2021). Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills. arXiv 2104.07749.
Clarysse, J., Hörrmann, J., and Yang, F. (2022). Why adversarial training can hurt robust accuracy. arXiv 2203.02006.
Goodfellow, I. J., Shlens, J., and Szegedy, C. (2015). Explaining and Harnessing Adversarial Examples. arXiv 1412.6572.
He, X., Yang, H., Hu, Z., and Lv, C. (2023). Robust Lane Change Decision Making for Autonomous Vehicles: An Observation Adversarial Reinforcement Learning Approach. IEEE Transactions on Intelligent Vehicles, 8(1):184–193.
Huang, S., Papernot, N., Goodfellow, I., Duan, Y., and Abbeel, P. (2017). Adversarial Attacks on Neural Network Policies. arXiv 1702.02284.
Isele, D., Nakhaei, A., and Fujimura, K. (2018). Safe Reinforcement Learning on Autonomous Vehicles. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1–6.
Kalashnikov, D., Varley, J., Chebotar, Y., Swanson, B., Jonschkowski, R., Finn, C., Levine, S., and Hausman, K. (2021). MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale. arXiv 2104.08212.
Kingma, D. P. and Ba, J. (2017). Adam: A Method for Stochastic Optimization. arXiv 1412.6980.
Leurent, E. (2018). An Environment for Autonomous Driving Decision-Making. GitHub.
Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2019). Continuous control with deep reinforcement learning. arXiv 1509.02971.
Liu, S., Chen, P.-Y., Chen, X., and Hong, M. (2019). signSGD via Zeroth-Order Oracle. In International Conference on Learning Representations.
Liu, S., Chen, P.-Y., Kailkhura, B., Zhang, G., Hero III, A. O., and Varshney, P. K. (2020). A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning: Principals, Recent Advances, and Applications. IEEE Signal Processing Magazine, 37(5):43–54.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. arXiv 1312.5602.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., and Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540):529–533.
Moosavi-Dezfooli, S.-M., Fawzi, A., and Frossard, P. (2016). DeepFool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2574–2582.
Papernot, N., McDaniel, P., Wu, X., Jha, S., and Swami, A. (2016). Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks. arXiv 1511.04508.
Pattanaik, A., Tang, Z., Liu, S., Bommannan, G., and Chowdhary, G. (2017). Robust Deep Reinforcement Learning with Adversarial Attacks. arXiv 1712.03632.
Powell, M. J. D. (1994). A Direct Search Optimization Method That Models the Objective and Constraint Functions by Linear Interpolation, pages 51–67. Springer Netherlands, Dordrecht.
Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., and Dormann, N. (2021). Stable-Baselines3: Reliable Reinforcement Learning Implementations. Journal of Machine Learning Research, 22(268):1–8.
Raghunathan, A., Xie, S. M., Yang, F., Duchi, J. C., and Liang, P. (2019). Adversarial Training Can Hurt Generalization. arXiv 1906.06032.
Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., and de Freitas, N. (2016). Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proceedings of the IEEE, 104(1):148–175.
Sinha, A., Namkoong, H., Volpi, R., and Duchi, J. (2020). Certifying Some Distributional Robustness with Principled Adversarial Training. arXiv 1710.10571.
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2014). Intriguing properties of neural networks. arXiv 1312.6199.