Labels inferred by AI
Model robustness guarantees / Adversarial example detection / Robustness enhancement methods
Note: these labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".
Abstract
The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by a scaled maximal L∞ distortion ϵ = 0.3. This discourages the use of attacks which are not optimized on the L∞ distortion metric. Our experimental results demonstrate that by relaxing the L∞ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average L∞ distortion, have minimal visual distortion. These results call into question the use of L∞ as a sole measure for visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.
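For orientation, a minimal sketch of the two objectives involved, assuming the standard formulations (the abstract itself does not spell them out): the competition's per-pixel budget corresponds to an L∞ ball of radius ϵ = 0.3, while EAD is commonly formulated as minimizing an elastic-net-regularized attack loss rather than optimizing against the L∞ metric. Here x₀ is the original image, x the candidate adversarial example, t the target label, f a targeted classification loss, and c, β are trade-off parameters; these symbols are illustrative, and the concrete loss and parameter values used in this work are not stated in the abstract.

% Competition constraint: each pixel may be perturbed by at most eps = 0.3
\[
  \lVert x - x_0 \rVert_\infty \le \epsilon = 0.3
\]

% EAD objective in its commonly used elastic-net form:
% c weights the attack loss f(x, t); the beta-weighted L1 term promotes sparse perturbations;
% the squared L2 term keeps the overall distortion small; pixels stay in the valid range [0, 1].
\[
  \min_{x \in [0,1]^p} \; c \cdot f(x, t) \;+\; \beta \, \lVert x - x_0 \rVert_1 \;+\; \lVert x - x_0 \rVert_2^2
\]

Because this objective penalizes L1 and L2 distortion but places no cap on per-pixel change, EAD is free to concentrate large changes in a few pixels, which is consistent with the high average L∞ distortion yet minimal visual distortion reported above.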