Enhancing Adversarial Defense by k-Winners-Take-All


arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/1905.10510

PDF

https://arxiv.org/pdf/1905.10510

Bibliographic Information

Authors: Chang Xiao, Peilin Zhong, Changxi Zheng
Published: 2019-05-25
Updated: 2019-10-29
Affiliation: Columbia University
Country of affiliation: United States of America
Conference: International Conference on Learning Representations (ICLR)

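Labels Estimated by AI

Adversarial example vulnerability

Sparsity optimization

Machine learning methods

* These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".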

Abstract

We propose a simple change to existing neural network structures for better defending against gradient-based adversarial attacks. Instead of using popular activation functions (such as ReLU), we advocate the use of k-Winners-Take-All (k-WTA) activation, a C^0 discontinuous function that purposely invalidates the neural network model's gradient at densely distributed input data points. The proposed k-WTA activation can be readily used in nearly all existing networks and training methods with no significant overhead. Our proposal is theoretically rationalized. We analyze why the discontinuities in k-WTA networks can largely prevent gradient-based search of adversarial examples and why they at the same time remain innocuous to the network training. This understanding is also empirically backed. We test k-WTA activation on various network structures optimized by a training method, be it adversarial training or not. In all cases, the robustness of k-WTA networks outperforms that of traditional networks under white-box attacks.
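
The abstract does not spell out the activation itself, so the following is only a minimal sketch under a common reading of k-WTA: within a layer's activation vector, the k largest entries pass through unchanged and all others are set to zero. The function name `kwta`, the tie-breaking rule, and the sample inputs are illustrative assumptions, not taken from the paper's code.

```python
def kwta(values, k):
    """Keep the k largest entries of `values` and zero out the rest.

    Assumed definition for illustration; ties are broken in favor of
    the lower index, which is a choice made for this sketch only.
    """
    if not 0 <= k <= len(values):
        raise ValueError("k must be between 0 and len(values)")
    # Rank indices by value, largest first (sorted() is stable, so equal
    # values keep their original order), then keep the first k as winners.
    ranked = sorted(range(len(values)), key=lambda i: -values[i])
    winners = set(ranked[:k])
    return [v if i in winners else 0.0 for i, v in enumerate(values)]


# A tiny change in the input swaps the winner, so the output jumps:
print(kwta([0.50, 0.49, -1.0, 0.1], 1))  # [0.5, 0.0, 0.0, 0.0]
print(kwta([0.50, 0.51, -1.0, 0.1], 1))  # [0.0, 0.51, 0.0, 0.0]
```

The two printed lines illustrate the discontinuity the abstract refers to: moving the second entry from 0.49 to 0.51 changes which unit wins, and the output vector changes abruptly rather than smoothly. In a real network, k would typically be set as a fraction of the layer width and applied per sample.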

External Datasets

CIFAR-10

SVHN

MNIST