Recent studies on the adversarial vulnerability of neural networks have shown
that models trained to be more robust to adversarial attacks exhibit more
interpretable saliency maps than their non-robust counterparts. We aim to
quantify this behavior by considering the alignment between input image and
saliency map. We hypothesize that as the distance to the decision boundary
grows, so does the alignment. This connection is strictly true in the case of
linear models. We confirm these theoretical findings with experiments based on
models trained with a local Lipschitz regularization and identify where the
non-linear nature of neural networks weakens the relation.
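As a concrete illustration of the linear case (a minimal sketch; the notation
$\Psi(x) = \langle w, x\rangle + b$ and the alignment measure below are
assumptions for exposition, not quoted from the abstract): for a binary linear
classifier $\Psi(x) = \langle w, x\rangle + b$, the saliency map is the
constant gradient $\nabla\Psi(x) = w$, and the distance of $x$ to the decision
boundary $\{\Psi = 0\}$ is $|\Psi(x)|/\|w\|$. Measuring alignment as the length
of the projection of $x$ onto the saliency direction gives
\[
  \frac{\lvert\langle x, \nabla\Psi(x)\rangle\rvert}{\lVert\nabla\Psi(x)\rVert}
  \;=\;
  \frac{\lvert\langle w, x\rangle\rvert}{\lVert w \rVert}
  \;=\;
  \frac{\lvert\Psi(x) - b\rvert}{\lVert w \rVert},
\]
which for $b = 0$ coincides exactly with the distance to the decision
boundary; in this sense, alignment and distance grow together strictly for
linear models.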