Defending Neural Backdoors via Generative Distribution Modeling

TOP 文献データベース Defending Neural Backdoors via Generative Distribution Modeling

Conference on Neural Information Processing Systems (NeurIPS)

AIセキュリティポータルbot

文献データベースの情報は、自動的に収集されています。

Source

https://arxiv.org/abs/1910.04749

PDF

https://arxiv.org/pdf/1910.04749

文献情報

作者: Ximing Qiao,Yukun Yang,Hai Li
公開日: 2019-10-11
更新日: 2019-11-6
所属機関: ECE Department, Duke University
所属の国: United States of America
会議名: Conference on Neural Information Processing Systems (NeurIPS)

AIにより推定されたラベル

生成的敵対ネットワークバックドア攻撃攻撃の評価

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

Neural backdoor attack is emerging as a severe security threat to deep learning, while the capability of existing defense methods is limited, especially for complex backdoor triggers. In the work, we explore the space formed by the pixel values of all possible backdoor triggers. An original trigger used by an attacker to build the backdoored model represents only a point in the space. It then will be generalized into a distribution of valid triggers, all of which can influence the backdoored model. Thus, previous methods that model only one point of the trigger distribution is not sufficient. Getting the entire trigger distribution, e.g., via generative modeling, is a key to effective defense. However, existing generative modeling techniques for image generation are not applicable to the backdoor scenario as the trigger distribution is completely unknown. In this work, we propose max-entropy staircase approximator (MESA), an algorithm for high-dimensional sampling-free generative modeling and use it to recover the trigger distribution. We also develop a defense technique to remove the triggers from the backdoored model. Our experiments on Cifar10/100 dataset demonstrate the effectiveness of MESA in modeling the trigger distribution and the robustness of the proposed defense method.