Since the discovery of adversarial examples (the ability to fool modern CNN
classifiers with tiny perturbations of the input), there has been much
discussion of whether they are a "bug" specific to current neural
architectures and training methods or an inevitable "feature" of
high-dimensional geometry. In this paper, we argue for examining adversarial
examples from the perspective of Bayes-Optimal classification. We construct
realistic image datasets for which the Bayes-Optimal classifier can be
efficiently computed and derive analytic conditions on the distributions under
which these classifiers are provably robust against any adversarial attack, even
in high dimensions. Our results show that even when these "gold standard"
optimal classifiers are robust, CNNs trained on the same datasets consistently
learn a vulnerable classifier, indicating that adversarial examples are often
an avoidable "bug". We further show that RBF SVMs trained on the same data
consistently learn a robust classifier. The same trend is observed in
experiments with real images from several datasets.
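
To make the comparison concrete, the following is a minimal sketch, not the
paper's dataset construction or attack: it assumes a synthetic two-Gaussian
distribution for which the Bayes-Optimal classifier has a closed form, fits
an RBF SVM with scikit-learn's SVC, and estimates each classifier's distance
to its decision boundary, using a crude line search in place of a proper
adversarial attack such as PGD.

    # Minimal sketch (assumption: synthetic two-Gaussian data, NOT the
    # paper's realistic image datasets). Compares the distance to the
    # decision boundary for the Bayes-Optimal rule vs. a fitted RBF SVM.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    d = 100                          # input dimension
    mu = np.zeros(d); mu[0] = 2.0    # class means at +mu / -mu, identity covariance

    # Balanced training sample from the two Gaussians.
    n = 500
    X = np.vstack([rng.normal(mu, 1.0, (n, d)), rng.normal(-mu, 1.0, (n, d))])
    y = np.hstack([np.ones(n), -np.ones(n)])

    # For equal-covariance Gaussians the Bayes-Optimal classifier is
    # sign(w . x) with w proportional to mu; |w . x| is then the exact
    # L2 distance of x to the optimal decision boundary.
    w = mu / np.linalg.norm(mu)
    bayes_margin = np.abs(X @ w)

    # RBF SVM trained on the same samples.
    svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

    # Crude line search along w (a stand-in for a real adversarial attack):
    # walk against the predicted label until the SVM's prediction flips.
    def svm_margin(x, direction, step=0.05, max_steps=200):
        label = svm.predict(x[None])[0]
        for k in range(1, max_steps + 1):
            if svm.predict((x - label * k * step * direction)[None])[0] != label:
                return k * step
        return max_steps * step

    svm_margins = np.array([svm_margin(x, w) for x in X[:100]])
    print(f"median boundary distance  Bayes-Optimal: {np.median(bayes_margin):.2f}"
          f"  RBF SVM: {np.median(svm_margins):.2f}")

The line search along the optimal direction w is only a rough probe; a
gradient-based attack on the SVM decision function would give tighter
estimates of the true adversarial robustness.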