Tactics of Adversarial Attack on Deep Reinforcement Learning Agents


arXiv

AI Security Portal bot

The information in the literature database is collected automatically.

Source

https://arxiv.org/abs/1703.06748

PDF

https://arxiv.org/pdf/1703.06748

Bibliographic information

Authors: Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun
Published: 2017-03-08
Updated: 2019-11-13
Affiliation: National Tsing Hua University
Country of affiliation: Taiwan
Conference: International Conference on Learning Representations (ICLR)

Labels estimated by AI

Adversarial examples, Attack pattern extraction, Defense mechanisms

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".

Abstract

We introduce two tactics to attack agents trained by deep reinforcement learning algorithms using adversarial examples, namely the strategically-timed attack and the enchanting attack. In the strategically-timed attack, the adversary aims at minimizing the agent's reward by only attacking the agent at a small subset of time steps in an episode. Limiting the attack activity to this subset helps prevent detection of the attack by the agent. We propose a novel method to determine when an adversarial example should be crafted and applied. In the enchanting attack, the adversary aims at luring the agent to a designated target state. This is achieved by combining a generative model and a planning algorithm: while the generative model predicts the future states, the planning algorithm generates a preferred sequence of actions for luring the agent. A sequence of adversarial examples is then crafted to lure the agent to take the preferred sequence of actions. We apply the two tactics to agents trained by state-of-the-art deep reinforcement learning algorithms, including DQN and A3C. In 5 Atari games, our strategically-timed attack reduces the agent's reward as much as the uniform attack (i.e., attacking at every time step) does, while attacking the agent 4 times less often. Our enchanting attack lures the agent toward designated target states with a more than 70% success rate. Videos are available at http://yenchenlin.me/adversarial_attack_RL/
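The abstract does not say how the strategically-timed attack chooses its time steps. The sketch below illustrates the general idea under an assumed criterion: attack only when the agent strongly prefers one action over another, measured as the gap between its highest and lowest action probabilities. The function names, the threshold `beta`, and the example distributions are hypothetical and are not taken from this page.

```python
from typing import List, Sequence


def preference_gap(action_probs: Sequence[float]) -> float:
    """Gap between the most and least preferred action probabilities."""
    return max(action_probs) - min(action_probs)


def select_attack_steps(
    policy_trace: Sequence[Sequence[float]], beta: float
) -> List[int]:
    """Return the time steps whose preference gap exceeds the threshold beta.

    Assumption: a large gap marks a step where changing the chosen action
    is likely to matter, so the adversary spends its budget there.
    """
    return [
        t for t, probs in enumerate(policy_trace)
        if preference_gap(probs) > beta
    ]


# Hypothetical per-step action distributions over 3 actions.
trace = [
    [0.34, 0.33, 0.33],  # near-indifferent: perturbing here changes little
    [0.90, 0.05, 0.05],  # strong preference: a candidate attack step
    [0.50, 0.30, 0.20],  # moderate preference
    [0.05, 0.85, 0.10],  # strong preference: a candidate attack step
]

steps = select_attack_steps(trace, beta=0.7)
attack_rate = len(steps) / len(trace)
print(steps, attack_rate)
```

Raising `beta` lowers the fraction of attacked steps, which is the knob that trades attack strength against detectability in this sketch. Crafting the adversarial perturbation itself at the selected steps is outside its scope.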