Despite significant advances, deep networks remain highly susceptible to adversarial attacks. One fundamental challenge is that small input perturbations
can often produce large movements in the network's final-layer feature space.
In this paper, we define an attack model that abstracts this challenge in order to better understand its intrinsic properties. In our model, the adversary may move data
an arbitrary distance in feature space but only in random low-dimensional
subspaces. We prove that such adversaries can be quite powerful: they can defeat any algorithm that must classify every input it is given. However, we show that such adversaries can be overcome when the algorithm is allowed to abstain on unusual inputs, provided the classes are reasonably well-separated in feature space. We further provide strong theoretical guarantees for using data-driven methods to set algorithm parameters that optimize the accuracy-abstention trade-off. Our
results provide new robustness guarantees for nearest-neighbor-style algorithms and also apply to contrastive learning, where we empirically demonstrate that such algorithms can achieve high robust accuracy with low abstention rates. Our model is further motivated by strategic classification, where entities being classified aim to manipulate their observable features to obtain a preferred classification, and we provide new insights into that area as well.
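
To make the setting concrete, here is a minimal sketch, assuming a Euclidean feature space: the adversary may move a point an arbitrary distance, but only within a random k-dimensional subspace, and the defense is a nearest-neighbor classifier that abstains whenever a query lies farther than a threshold tau from all training data. All names and parameters (k, tau, scale) are illustrative assumptions rather than the paper's actual algorithm or guarantees, and for simplicity the displacement within the random subspace is itself sampled randomly, whereas in the model the adversary chooses it.

```python
# Illustrative sketch only: a random-subspace attack in feature space and an
# abstaining 1-nearest-neighbor defense. All names and parameters are assumed.
import numpy as np

rng = np.random.default_rng(0)

def random_subspace_attack(x, k, scale=100.0):
    """Move x an effectively arbitrary distance, but only inside a random
    k-dimensional subspace of the d-dimensional feature space."""
    d = x.shape[0]
    basis, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal d x k basis
    return x + basis @ (scale * rng.standard_normal(k))   # large shift in the subspace

def abstaining_1nn(x, train_X, train_y, tau):
    """Return the nearest neighbor's label, or None (abstain) when x is
    'unusual', i.e., farther than tau from every training point."""
    dists = np.linalg.norm(train_X - x, axis=1)
    i = int(np.argmin(dists))
    return None if dists[i] > tau else train_y[i]

# Toy data: two well-separated classes in a d-dimensional feature space.
d, k, tau = 64, 2, 8.0
train_X = np.vstack([rng.normal(0.0, 0.5, (50, d)),
                     rng.normal(8.0, 0.5, (50, d))])
train_y = np.array([0] * 50 + [1] * 50)

x = rng.normal(0.0, 0.5, d)            # a clean class-0 query
x_adv = random_subspace_attack(x, k)   # pushed far along a random subspace

print(abstaining_1nn(x, train_X, train_y, tau))      # 0: close to class-0 data
print(abstaining_1nn(x_adv, train_X, train_y, tau))  # None w.h.p.: far from all data
```

The clean query is classified normally, while the attacked query typically lands far from every training point and triggers abstention, mirroring the separation condition under which the abstract's guarantees apply.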