Adversarial examples are malicious inputs crafted to cause a model to
misclassify them. In their most common instantiation, "perturbation-based"
adversarial examples introduce changes to the input that leave its true label
unchanged, yet result in a different model prediction. Conversely,
"invariance-based" adversarial examples introduce changes to the input that
leave the model's prediction unchanged even though the underlying input's true
label has changed.
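Concretely, introducing notation not fixed in the text above ($f$ for the
classifier, $\mathcal{O}$ for the oracle that assigns true labels, $x$ for the
original input, and $x'$ for its modified version), the two notions can be
contrasted as
\begin{align*}
\text{perturbation-based:}\quad & \mathcal{O}(x') = \mathcal{O}(x)
  \;\text{ but }\; f(x') \neq f(x),\\
\text{invariance-based:}\quad & \mathcal{O}(x') \neq \mathcal{O}(x)
  \;\text{ but }\; f(x') = f(x),
\end{align*}
with the modification typically constrained in norm, e.g.\
$\|x' - x\|_p \le \epsilon$, in the perturbation-based case.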
In this paper, we demonstrate that robustness to perturbation-based
adversarial examples is not only insufficient for general robustness but,
worse, can also increase the model's vulnerability to invariance-based
adversarial examples. In addition to analytical constructions, we empirically
study vision classifiers with state-of-the-art robustness to perturbation-based
adversaries constrained by an $\ell_p$ norm. We mount attacks that exploit
excessive model invariance in directions relevant to the task and thereby find
invariance-based adversarial examples within the $\ell_p$ ball. In fact, we find that
classifiers trained to be $\ell_p$-norm robust are more vulnerable to
invariance-based adversarial examples than their undefended counterparts.
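Reusing the illustrative notation above, one way to see the tension (a sketch
of the intuition, not the paper's formal statement) is the following: if a
model is invariant, i.e.\ $f$ is constant, on the ball
$B_p(x, \epsilon) = \{x' : \|x' - x\|_p \le \epsilon\}$, while the oracle's
label changes somewhere inside that ball, then
\[
\exists\, x' \in B_p(x, \epsilon):\
\mathcal{O}(x') \neq \mathcal{O}(x) \;\text{ and }\; f(x') = f(x),
\]
so every such $x'$ is an invariance-based adversarial example; enforcing
invariance over a ball larger than the oracle's own margin thus guarantees
their existence.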
Excessive invariance is not limited to models trained to be robust to
perturbation-based $\ell_p$-norm adversaries. Indeed, we argue that the term
"adversarial example" is used to capture a series of model limitations, some of
which may not have been discovered yet. Accordingly, we call for a set of
precise definitions that taxonomize and address each of these shortcomings in
learning.