multi-environment setting, randomly selecting environments for agent interaction. By averaging rewards across multiple environments, our approach mitigates the impact of poisoning and strengthens agent robustness. The method preserves true reward performance while providing provable guarantees on the effectiveness of the defense policy, supporting safety and reliability in critical applications. This contribution advances the development of robust and secure deep reinforcement learning systems for real-world scenarios. Future work includes comparing our approach experimentally with existing defenses to validate its effectiveness and practicality, and using the resulting insights to further strengthen the defense mechanism.
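As a concrete illustration (not the implementation evaluated in this paper), the following Python sketch applies the idea to tabular Q-learning over discrete Gymnasium environments. The function name multi_env_q_learning, the hyperparameters, and the assumption that envs is a list of copies of the same task differing only in their (possibly poisoned) reward channels are all ours, introduced for exposition.

import random
import numpy as np

def multi_env_q_learning(envs, episodes=2000, alpha=0.1, gamma=0.99, eps=0.1):
    """Hypothetical sketch: tabular Q-learning where each episode is
    driven by a uniformly sampled environment, so poisoned reward
    channels are diluted in expectation."""
    n_states = envs[0].observation_space.n   # assumes discrete spaces
    n_actions = envs[0].action_space.n
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        env = random.choice(envs)            # random environment selection
        state, _ = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection.
            if random.random() < eps:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            # Because the driving environment is sampled uniformly, this
            # update sees the average reward across environments in
            # expectation, shrinking the influence of any poisoned copy.
            target = reward + gamma * (0.0 if done else np.max(Q[next_state]))
            Q[state, action] += alpha * (target - Q[state, action])
            state = next_state
    return Q

For example, envs could hold several gymnasium.make("FrozenLake-v1") copies, a minority of which wrap step to perturb rewards; with k copies and a bounded poisoned fraction, the learned values track the mean reward over copies rather than any single poisoned signal.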