The vast number of adversarial attacks and defences for machine
learning algorithms of various types makes assessing the robustness of
algorithms a daunting task. To make matters worse, there is an intrinsic bias
in these adversarial algorithms. Here, we organise the problems faced into four
categories: a) Model Dependence, b) Insufficient Evaluation, c) False Adversarial
Samples, and d) Perturbation Dependent Results. Based on this, we propose a
model-agnostic dual quality assessment method, together with the concept of
robustness levels, to tackle them. We validate the dual quality assessment on
state-of-the-art neural networks (WideResNet, ResNet, AllConv, DenseNet, NIN,
LeNet and CapsNet) as well as adversarial defences for the image classification
problem.
show that current networks and defences are vulnerable at all levels of
robustness. The proposed robustness assessment reveals that depending on the
metric used (i.e., $L_0$ or $L_\infty$), the robustness may vary significantly.
Hence, the duality should be taken into account for a correct evaluation.
Moreover, a mathematical derivation, as well as a counter-example, suggests that
the $L_1$ and $L_2$ metrics alone are not sufficient to avoid spurious adversarial
samples. Interestingly, the threshold attack of the proposed assessment is a
novel $L_\infty$ black-box adversarial method that requires even less
perturbation than the One-Pixel Attack (only $12\%$ of the One-Pixel Attack's
amount of perturbation) to achieve similar results.
Code is available at http://bit.ly/DualQualityAssessment.
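For readers unfamiliar with the two metrics underlying the duality, the sketch below is a minimal illustration (our own, not the released code; the function names and example values are hypothetical) of how the $L_0$ and $L_\infty$ norms of an image perturbation, the quantities bounded by pixel-style and threshold-style attacks respectively, can be measured.

```python
# Minimal sketch (illustration only, not the paper's implementation):
# measuring the two perturbation metrics of the dual assessment.
import numpy as np

def l0_norm(perturbation):
    """Number of modified pixels (the quantity bounded by pixel attacks)."""
    # Collapse the channel axis so a pixel counts once even if several
    # of its colour channels change.
    changed = np.any(perturbation != 0, axis=-1)
    return int(np.count_nonzero(changed))

def l_inf_norm(perturbation):
    """Largest absolute change to any single value (bounded by threshold attacks)."""
    return float(np.max(np.abs(perturbation)))

# Hypothetical example: a 32x32 RGB image perturbed at three pixels.
rng = np.random.default_rng(0)
delta = np.zeros((32, 32, 3))
for y, x in [(4, 7), (10, 10), (20, 3)]:
    delta[y, x] = rng.uniform(-0.1, 0.1, size=3)

print(l0_norm(delta))     # 3: three pixels were changed
print(l_inf_norm(delta))  # maximum per-value change, < 0.1
```

As the example shows, the same perturbation can score very differently under the two norms (few pixels changed versus a small maximum change), which is why a single metric can paint a misleading picture of robustness.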