Adversarial robustness of machine learning models has attracted considerable
attention in recent years. Adversarial attacks undermine the reliability of
machine learning models and the trust placed in them, but building more robust
models hinges on a rigorous understanding of adversarial robustness as a
property of a given model.
property of a given model. Point-wise measures for specific threat models are
currently the most popular tool for comparing the robustness of classifiers and
are used in most recent publications on adversarial robustness. In this work,
we use recently proposed robustness curves to show that point-wise measures
fail to capture important global properties that are essential to reliably
compare the robustness of different classifiers. We introduce new ways in which
robustness curves can be used to systematically uncover these properties and
provide concrete recommendations for researchers and practitioners when
assessing and comparing the robustness of trained models. Furthermore, we
characterize scale as a way to distinguish small and large perturbations, and
relate it to inherent properties of data sets, demonstrating that robustness
thresholds must be chosen accordingly. We release code to reproduce all
experiments presented in this paper, including a Python module to compute
robustness curves for arbitrary data sets and classifiers, with support for
several frameworks such as TensorFlow, PyTorch and JAX.
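To make the central object concrete: a robustness curve reports, for each perturbation budget, the fraction of test points for which an adversarial example exists within that budget. The sketch below is a minimal illustration of this idea, not the API of the released module; the function and variable names are hypothetical, and it assumes the minimal adversarial perturbation distance per test point has already been estimated by some attack (0 for points misclassified without perturbation, infinity where no adversarial example was found).

```python
import numpy as np

def robustness_curve(min_distances, epsilons):
    """Empirical robustness curve: for each threshold eps, the fraction of
    points that admit an adversarial perturbation of size at most eps.
    Hypothetical helper for illustration only."""
    d = np.asarray(min_distances, dtype=float)
    return np.array([np.mean(d <= eps) for eps in epsilons])

# Illustrative per-point distances, e.g. from a minimal-perturbation attack:
# 0.0 marks a point misclassified without any perturbation,
# np.inf a point for which the attack found no adversarial example.
dists = np.array([0.0, 0.12, 0.35, 0.35, 0.80, np.inf])
eps_grid = np.linspace(0.0, 1.0, 101)
curve = robustness_curve(dists, eps_grid)  # monotone; curve[0] is the clean error rate
```

Evaluating such a curve over a whole grid of thresholds, rather than at a single fixed budget, is what enables the global comparisons between classifiers discussed above.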