Adversarial examples raise questions about whether neural network models are
sensitive to the same visual features as humans. In this paper, we first detect
adversarial examples or otherwise corrupted images based on a class-conditional
reconstruction of the input. To specifically attack our detection mechanism, we
propose the Reconstructive Attack, which seeks both to cause a misclassification
and to keep the reconstruction error low. This reconstructive attack produces
undetected adversarial examples, but with a much lower success rate. Among all
these attacks, we find that CapsNets always perform better than convolutional
networks. We then diagnose the adversarial examples for CapsNets and find that
the success of the reconstructive attack is closely related to the visual
similarity between the source and target classes. Additionally, the resulting
perturbations can cause the input image to appear visually more like the target
class and hence become non-adversarial. This suggests that CapsNets use
features that are more aligned with human perception and have the potential to
address the central issue raised by adversarial examples.
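The sketch below illustrates the two mechanisms summarized above: flagging inputs whose class-conditional reconstruction error exceeds a threshold, and a reconstructive-attack step whose loss combines a targeted misclassification term with a reconstruction penalty. This is a minimal PyTorch illustration, not the paper's implementation; the `ClassConditionalReconNet` module and the values of `alpha`, `threshold`, and `step_size` are placeholders introduced here for exposition.

```python
# Minimal sketch (PyTorch), assuming a classifier with a class-conditional
# reconstruction head; the module and hyperparameters are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClassConditionalReconNet(nn.Module):
    """Toy classifier with a class-conditional reconstruction head (hypothetical stand-in)."""

    def __init__(self, n_classes=10, img_dim=28 * 28, hidden=128):
        super().__init__()
        self.n_classes = n_classes
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(img_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(hidden, n_classes)
        # Decoder conditioned on a one-hot class code, mimicking class-conditional reconstruction.
        self.decoder = nn.Sequential(
            nn.Linear(hidden + n_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, img_dim), nn.Sigmoid(),
        )

    def classify(self, x):
        return self.classifier(self.encoder(x))

    def reconstruct(self, x, class_idx):
        h = self.encoder(x)
        onehot = F.one_hot(class_idx, self.n_classes).float()
        return self.decoder(torch.cat([h, onehot], dim=-1)).view_as(x)


def flag_by_reconstruction_error(model, x, threshold=0.05):
    """Detection: flag inputs whose reconstruction, conditioned on the
    predicted class, differs from the input by more than a threshold."""
    pred = model.classify(x).argmax(dim=-1)
    recon = model.reconstruct(x, pred)
    err = F.mse_loss(recon, x, reduction="none").flatten(1).mean(dim=1)
    return err > threshold


def reconstructive_attack_step(model, x, target, alpha=1.0, step_size=0.01):
    """Reconstructive attack: one signed-gradient step on a loss that both
    pushes the prediction toward the target class and keeps the
    class-conditional reconstruction error low."""
    x_adv = x.clone().detach().requires_grad_(True)
    logits = model.classify(x_adv)
    recon = model.reconstruct(x_adv, target)
    loss = F.cross_entropy(logits, target) + alpha * F.mse_loss(recon, x_adv)
    loss.backward()
    with torch.no_grad():
        x_adv = (x_adv - step_size * x_adv.grad.sign()).clamp(0.0, 1.0)
    return x_adv.detach()
```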