Labels Predicted by AI
Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.
Abstract
We propose a new perspective on adversarial attacks against deep reinforcement learning agents. Our main contribution is CopyCAT, a targeted attack able to consistently lure an agent into following an outsider’s policy. It is pre-computed, therefore fast inferred, and could thus be usable in a real-time scenario. We show its effectiveness on Atari 2600 games in the novel read-only setting. In this setting, the adversary cannot directly modify the agent’s state – its representation of the environment – but can only attack the agent’s observation – its perception of the environment. Directly modifying the agent’s state would require a write-access to the agent’s inner workings and we argue that this assumption is too strong in realistic settings.