We present a method for provably defending any pretrained image classifier
against $\ell_p$ adversarial attacks. This method, for instance, allows public
vision API providers and users to seamlessly convert pretrained non-robust
classification services into provably robust ones. By prepending a
custom-trained denoiser to any off-the-shelf image classifier and using
randomized smoothing, we effectively create a new classifier that is guaranteed
to be $\ell_p$-robust to adversarial examples, without modifying the pretrained
classifier. Our approach applies to both the white-box and the black-box
settings of the pretrained classifier. We refer to this defense as denoised
smoothing, and we demonstrate its effectiveness through extensive
experimentation on ImageNet and CIFAR-10. Finally, we use our approach to
provably defend the Azure, Google, AWS, and Clarifai image classification APIs.
Code replicating all the experiments in the paper is available at:
https://github.com/microsoft/denoised-smoothing.
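To make the construction concrete, here is a minimal sketch of the prediction step in PyTorch. It assumes two hypothetical `torch.nn.Module` objects, `classifier` (the frozen pretrained classifier $f$) and `denoiser` (the custom-trained denoiser $D$); the function name and parameters are illustrative and not the repository's actual API. The smoothed classifier predicts by majority vote of $f(D(x + \varepsilon))$ over Gaussian noise draws $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$:

```python
import torch

@torch.no_grad()
def denoised_smoothing_predict(classifier, denoiser, x, sigma=0.25,
                               num_samples=100, num_classes=10):
    """Monte Carlo estimate of the smoothed classifier's prediction
    g(x) = argmax_c  P[ f(D(x + eps)) = c ],  eps ~ N(0, sigma^2 I).

    classifier, denoiser: hypothetical nn.Module instances (f and D).
    x: a single image tensor of shape [C, H, W].
    """
    # Replicate the input and corrupt each copy with Gaussian noise.
    batch = x.unsqueeze(0).repeat(num_samples, 1, 1, 1)
    noisy = batch + sigma * torch.randn_like(batch)

    # Denoise, then classify with the unmodified pretrained classifier.
    preds = classifier(denoiser(noisy)).argmax(dim=1)

    # Majority vote over the noise samples yields the smoothed prediction.
    counts = torch.bincount(preds, minlength=num_classes)
    return counts.argmax().item()
```

Note that this sketch covers only the plug-in prediction; obtaining a certified $\ell_2$ radius additionally requires the statistical certification procedure of randomized smoothing (confidence bounds on the top-class probability), for which the linked repository should be consulted.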