Many machine learning classifiers are vulnerable to adversarial
perturbations. An adversarial perturbation modifies an input to change a
classifier's prediction without causing the input to seem substantially
different to human perception. We deploy three methods to detect adversarial
images. Adversaries trying to bypass our detectors must make the adversarial
image less pathological or they will fail trying. Our best detection method
reveals that adversarial images place abnormal emphasis on the lower-ranked
principal components from PCA. Other detectors and a colorful saliency map are
in an appendix.
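
As a rough illustration of the PCA-based detection idea described above, the sketch below fits PCA on clean images and scores a test input by the magnitude of its coefficients on the lowest-ranked principal components. This is not the paper's code; the data, tail size, and threshold are illustrative assumptions.

```python
import numpy as np

def fit_pca(clean_images):
    """clean_images: array of shape (n_samples, n_features), e.g. flattened images."""
    mean = clean_images.mean(axis=0)
    centered = clean_images - mean
    # Principal directions are the right singular vectors of the centered data.
    _, _, components = np.linalg.svd(centered, full_matrices=False)
    return mean, components  # components[i] is the i-th principal direction

def low_rank_score(image, mean, components, tail=50):
    """Mean absolute coefficient on the `tail` lowest-ranked components."""
    coeffs = components @ (image - mean)
    return np.abs(coeffs[-tail:]).mean()

# Illustrative usage (names and cutoff are hypothetical):
# clean = np.stack([x.ravel() for x in clean_dataset])
# mean, comps = fit_pca(clean)
# clean_scores = [low_rank_score(x.ravel(), mean, comps) for x in clean_dataset]
# threshold = np.percentile(clean_scores, 99)   # assumed detection cutoff
# flag = low_rank_score(test_image.ravel(), mean, comps) > threshold
```

Inputs whose scores exceed the clean-data threshold would be flagged as placing abnormal emphasis on the low-variance components, in the spirit of the detector summarized above.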