Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

International Conference on Machine Learning (ICML)

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/1907.02044

PDF

https://arxiv.org/pdf/1907.02044

Bibliographic Information

Authors: Francesco Croce, Matthias Hein
Published: 2019-7-4
Updated: 2020-7-21
Affiliation: University of Tübingen
Country: Germany
Conference: International Conference on Machine Learning (ICML)

Labels Estimated by AI

Adversarial attack, Poisoning, Vulnerability to adversarial examples

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".

Abstract

The robustness of neural network-based classifiers against adversarial manipulation is mainly evaluated with empirical attacks, since methods for exact computation, even when available, do not scale to large networks. In this paper we propose a new white-box adversarial attack with respect to the $l_p$-norms for $p \in \{1,2,\infty\}$ that aims to find the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, quickly yields high-quality results, and minimizes the size of the perturbation, so that a single run returns the robust accuracy at every threshold. It performs better than or similar to state-of-the-art attacks, which are partially specialized to one $l_p$-norm, and is robust to the phenomenon of gradient masking.
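
The abstract's point about robust accuracy at every threshold can be illustrated with a small sketch. It is not the authors' code: the function name `robust_accuracy_curve`, the input format, and the sample values are assumptions made for illustration. The sketch assumes a minimum-norm attack has already returned, for each test point, the norm of the smallest adversarial perturbation it found. A point is then counted as robust at threshold eps if that norm is larger than eps. Because the attack is empirical, the resulting numbers are upper bounds on the true robust accuracy.

```python
import math


def robust_accuracy_curve(min_norms, thresholds):
    """Fraction of test points that stay robust at each threshold.

    min_norms: one value per test point, the norm of the smallest
    adversarial perturbation found by the attack. By assumption here,
    0.0 marks a point that is already misclassified and math.inf marks
    a point for which no adversarial example was found.
    """
    n = len(min_norms)
    # A point is robust at eps if every perturbation found is larger than eps.
    return [sum(1 for d in min_norms if d > eps) / n for eps in thresholds]


# Hypothetical l_inf norms for five test points (illustrative values only).
min_norms = [0.0, 0.02, 0.05, 0.11, math.inf]
thresholds = [0.0, 0.03, 0.1, 0.2]

curve = robust_accuracy_curve(min_norms, thresholds)
print(curve)  # [0.8, 0.6, 0.4, 0.2]
```

A fixed-threshold attack would need to be rerun for each value in `thresholds`, whereas here one list of minimal norms is reused for all of them.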