Defensive Dropout for Hardening Deep Neural Networks under Adversarial Attacks


arxiv

AI Security Portal bot

The information in this literature database is collected automatically.

Source

https://arxiv.org/abs/1809.05165

PDF

https://arxiv.org/pdf/1809.05165

Paper information

Authors: Siyue Wang, Xiao Wang, Pu Zhao, Wujie Wen, David Kaeli, Peter Chin, Xue Lin
Published: 2018-09-14
Affiliation: Northeastern University
Country of affiliation: United States of America
Conference: International Conference on Computer Aided Design (ICCAD)

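Labels estimated by AI

Robustness improvement

Adversarial examples

Model robustness guarantee

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".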

Abstract

Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. That is, adversarial examples, obtained by adding delicately crafted distortions onto original legal inputs, can mislead a DNN to classify them as any target labels. This work provides a solution to hardening DNNs under adversarial attacks through defensive dropout. Besides using dropout during training for the best test accuracy, we propose to use dropout also at test time to achieve strong defense effects. We consider the problem of building robust DNNs as an attacker-defender two-player game, where the attacker and the defender know each other's strategies and try to optimize their own strategies towards an equilibrium. Based on the observations of the effect of test dropout rate on test accuracy and attack success rate, we propose a defensive dropout algorithm to determine an optimal test dropout rate given the neural network model and the attacker's strategy for generating adversarial examples. We also investigate the mechanism behind the outstanding defense effects achieved by the proposed defensive dropout. Compared with stochastic activation pruning (SAP), another defense method that introduces randomness into the DNN model, we find that our defensive dropout achieves much larger variances of the gradients, which is the key to the improved defense effects (much lower attack success rate). For example, our defensive dropout can reduce the attack success rate from 100% to 13.89% under the currently strongest attack, i.e., the C&W attack, on the MNIST dataset.
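
The two ideas in the abstract, keeping dropout active at test time and choosing a test dropout rate by weighing test accuracy against attack success rate, can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. The function names, the use of inverted dropout, and the selection rule (lowest attack success rate among rates whose accuracy loss stays within a budget) are all assumptions made for illustration.

```python
import random


def dropout(activations, rate, rng):
    """Apply inverted dropout to a list of activations.

    Each unit is zeroed with probability `rate`; survivors are scaled by
    1 / (1 - rate) so the expected activation is unchanged. Calling this
    at inference time is what "test-time dropout" means in this sketch.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


def select_test_dropout_rate(candidates, accuracy_fn, attack_success_fn,
                             max_accuracy_drop):
    """Pick a test dropout rate from `candidates`.

    ASSUMED criterion (hypothetical, not taken from the paper): among rates
    whose accuracy loss relative to rate 0.0 is at most `max_accuracy_drop`,
    return the one with the lowest attack success rate. Falls back to 0.0
    (no test dropout) if no candidate satisfies the accuracy budget.

    `accuracy_fn(rate)` and `attack_success_fn(rate)` are caller-supplied
    callables that evaluate the model at the given test dropout rate.
    """
    baseline = accuracy_fn(0.0)
    feasible = [r for r in candidates
                if baseline - accuracy_fn(r) <= max_accuracy_drop]
    if not feasible:
        return 0.0
    return min(feasible, key=attack_success_fn)
```

In practice, `accuracy_fn` would measure accuracy on clean test inputs and `attack_success_fn` would measure how often adversarial examples from the assumed attacker succeed, both with dropout enabled at the given rate. Because the forward pass is random, each measurement would normally be averaged over several runs.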

External datasets

MNIST

CIFAR-10