Convolutional Neural Networks (CNNs) are being deployed in an increasing number of
classification systems, but adversarial samples, maliciously crafted to
fool them, pose a growing threat. Various defenses have been proposed
to improve the adversarial robustness of CNNs, but they all suffer performance
penalties or other limitations. In this paper, we present a new approach in the
form of a certifiable adversarial detection scheme, the Certifiable Taboo Trap
(CTT). The system can provide certifiable guarantees of detection of
adversarial inputs for certain $l_{\infty}$ perturbation sizes, under a reasonable
assumption, namely that the training data have the same distribution as the test data. We
develop and evaluate several versions of CTT that offer different trade-offs between
defense capability, training overhead, and certifiability against adversarial samples.
Against adversaries bounded under various $l_p$ norms, CTT outperforms existing defense
methods that focus purely on improving network robustness. We show that CTT
achieves small false-positive rates on clean test data, incurs minimal compute overhead
when deployed, and can support complex security policies.