Transferable Perturbations of Deep Feature Distributions


arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2004.12519

PDF

https://arxiv.org/pdf/2004.12519

Bibliographic information

Authors: Nathan Inkawhich, Kevin J Liang, Lawrence Carin, Yiran Chen
Published: 2020-04-27
Affiliation: Department of Electrical and Computer Engineering, Duke University
Country: United States of America
Conference: International Conference on Learning Representations (ICLR)

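AI-estimated labels

Adversarial attack methods, Deep learning techniques, Multi-class classification

Note: these labels were added automatically by AI, so they may be inaccurate.
For details, see "About the literature database".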

Abstract

Almost all current adversarial attacks of CNN classifiers rely on information derived from the output layer of the network. This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions. We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models. Further, we place a priority on explainability and interpretability of the attacking process. Our methodology affords an analysis of how adversarial attacks change the intermediate feature distributions of CNNs, as well as a measure of layer-wise and class-wise feature distributional separability/entanglement. We also conceptualize a transition from task/data-specific to model-specific features within a CNN architecture that directly impacts the transferability of adversarial examples.
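
The core idea in the abstract, perturbing an input so that its intermediate-layer features look like those of a target class, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical one-layer feature map f(x) = tanh(Wx) in place of a CNN layer, a hypothetical logistic model standing in for a learned class-wise feature distribution p(target | f(x)), random weights instead of trained ones, and an L-infinity budget `eps` chosen only for illustration.

```python
import math
import random


def features(W, x):
    """Toy stand-in for an intermediate CNN layer: f(x) = tanh(W x)."""
    return [math.tanh(sum(wij * xj for wij, xj in zip(row, x))) for row in W]


def target_prob(w, b, feats):
    """Hypothetical auxiliary model p(target | f(x)): logistic regression on features."""
    z = sum(wi * fi for wi, fi in zip(w, feats)) + b
    return 1.0 / (1.0 + math.exp(-z))


def grad_log_prob(W, w, b, x):
    """Gradient of log p(target | f(x)) with respect to the input x."""
    feats = features(W, x)
    p = target_prob(w, b, feats)
    # d log(sigmoid(z)) / dz = 1 - p, and d tanh(h) / dh = 1 - tanh(h)^2.
    coeff = [(1.0 - p) * wi * (1.0 - fi * fi) for wi, fi in zip(w, feats)]
    return [sum(c * row[j] for c, row in zip(coeff, W)) for j in range(len(x))]


def feature_space_attack(W, w, b, x, eps=0.1, step=0.02, iters=20):
    """Sign-gradient ascent on p(target | f(x)) inside an L-inf ball of radius eps.

    Returns the best iterate seen, so the result is never worse than the clean input.
    """
    best, best_p = list(x), target_prob(w, b, features(W, x))
    x_adv = list(x)
    for _ in range(iters):
        g = grad_log_prob(W, w, b, x_adv)
        x_adv = [xa + step * math.copysign(1.0, gi) for xa, gi in zip(x_adv, g)]
        # Project back so every coordinate stays within eps of the clean input.
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
        p = target_prob(w, b, features(W, x_adv))
        if p > best_p:
            best, best_p = list(x_adv), p
    return best


rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(4)]  # hypothetical layer weights
w = [rng.uniform(-1, 1) for _ in range(4)]                      # hypothetical auxiliary weights
b = 0.0
x = [rng.uniform(0, 1) for _ in range(8)]                       # stand-in for a clean input

x_adv = feature_space_attack(W, w, b, x)
p_clean = target_prob(w, b, features(W, x))
p_adv = target_prob(w, b, features(W, x_adv))
print(f"p(target | clean) = {p_clean:.3f}, p(target | adversarial) = {p_adv:.3f}")
```

In a realistic setting the feature map would be an intermediate layer of a whitebox CNN, the auxiliary model would be trained on that layer's features for the target class, and the resulting perturbed image would then be evaluated against a separate blackbox model to measure transfer.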