Authors:
Myria Bouhaddi
and
Kamel Adi
Affiliation:
Computer Security Research Laboratory, University of Quebec in Outaouais, Gatineau, Quebec, Canada
Keyword(s):
Deep Reinforcement Learning, Adversarial Attacks, Reward Poisoning Attacks, Optimal Defense Policy, Multi-Environment Training.
Abstract:
Our research tackles the critical challenge of defending against reward poisoning attacks in deep reinforcement learning, which have significant cybersecurity implications. These attacks subtly manipulate rewards so that the attacker's target policy appears optimal under the poisoned rewards, compromising the integrity and reliability of such systems. Our goal is to develop robust agents that resist such manipulations. We propose an optimization framework with a multi-environment training setting, which enhances resilience and generalization: by exposing agents to diverse environments, we mitigate the impact of poisoning attacks. Additionally, we employ a variance-based method to detect reward manipulation effectively. Leveraging this information, our optimization framework derives a defense policy that fortifies agents against reward manipulation.
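As a minimal illustration of the variance-based detection idea mentioned in the abstract — flagging rewards that deviate sharply from recent reward statistics — one might sketch the following. This is a hedged, hypothetical example, not the paper's actual detector: the function name, window size, and threshold `k` are all assumptions made for illustration.

```python
import statistics


def detect_suspicious_rewards(rewards, window=20, k=3.0):
    """Flag reward indices that deviate from the trailing window's mean
    by more than k standard deviations.

    Hypothetical sketch of a variance-based detector; the paper's
    method may use different statistics and thresholds.
    """
    flagged = []
    for i in range(window, len(rewards)):
        history = rewards[i - window:i]
        mu = statistics.mean(history)
        sigma = statistics.pstdev(history)
        # Only flag when the window has spread, to avoid division-like
        # degenerate cases on constant reward streams.
        if sigma > 0 and abs(rewards[i] - mu) > k * sigma:
            flagged.append(i)
    return flagged


# Usage: a clean alternating reward stream with one poisoned spike.
clean = [1.0, 1.1] * 15            # indices 0..29
poisoned = clean + [5.0] + [1.0, 1.1] * 5  # spike at index 30
print(detect_suspicious_rewards(poisoned))  # the spike index is flagged
```

A detector like this only supplies a signal; in the paper's framework that signal feeds into the optimization that derives the defense policy.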