Outsourced deep neural networks have been demonstrated to suffer from
patch-based trojan attacks, in which an adversary poisons the training sets to
inject a backdoor into the obtained model so that regular inputs are still
labeled correctly while those carrying a specific trigger are falsely assigned a
target label. Due to the severity of such attacks, many backdoor detection and
containment systems have recently been proposed for deep neural networks. One
major category among them is model inspection schemes, which aim to detect
backdoors before deploying models obtained from untrusted third parties. In
this paper, we show that such state-of-the-art schemes can be defeated by a
so-called Scapegoat Backdoor Attack, which introduces a benign scapegoat
trigger during data poisoning to prevent the defender from reverse-engineering
the real malicious trigger. In addition, it confines the values of the network
parameters within the variance of those of a clean model during training, which
makes it significantly harder for the defender to learn the differences between
legitimate and backdoored models through machine-learning approaches. Our
experiments on three popular datasets show that the attack escapes detection by
all five state-of-the-art model inspection schemes. Moreover, it incurs almost
no loss of attack effectiveness and preserves the universality of the trigger
compared with the original patch-based trojan attacks.
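
To make the two mechanisms concrete, the following is a minimal PyTorch-style sketch based only on the description above. The poisoning rule (scapegoat-stamped samples keep their correct labels while real-trigger samples receive the target label) and the confinement rule (clamping weights to a clean model's per-parameter mean plus or minus a few standard deviations) are assumptions about one plausible implementation, not the authors' released code, and all function and variable names are hypothetical.

```python
# Illustrative sketch only: the poisoning and parameter-confinement rules below
# are assumptions inferred from the abstract, not the paper's actual method.
import torch
import torch.nn.functional as F


def stamp(x, trigger, mask):
    """Paste a small patch trigger onto a batch of images."""
    return x * (1 - mask) + trigger * mask


def poison_batch(x, y, real_trigger, real_mask, scape_trigger, scape_mask,
                 target_label, poison_frac=0.1):
    """Assumed poisoning rule: a fraction of samples carry the real trigger and
    the target label, while another fraction carry the benign scapegoat trigger
    but keep their original (correct) labels."""
    n = x.size(0)
    k = max(1, int(poison_frac * n))
    x, y = x.clone(), y.clone()
    x[:k] = stamp(x[:k], real_trigger, real_mask)               # real backdoor
    y[:k] = target_label
    x[k:2 * k] = stamp(x[k:2 * k], scape_trigger, scape_mask)   # decoy, label unchanged
    return x, y


def clean_param_bounds(clean_model, num_std=2.0):
    """Per-parameter bounds (mean +/- num_std * std) taken from a clean model."""
    bounds = {}
    for name, p in clean_model.named_parameters():
        mu, sigma = p.mean().item(), p.std().item()
        bounds[name] = (mu - num_std * sigma, mu + num_std * sigma)
    return bounds


def train_step(model, bounds, optimizer, x, y):
    """One training step on (possibly poisoned) data, followed by clamping the
    parameters into the value range observed in the clean reference model."""
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        for name, p in model.named_parameters():
            lo, hi = bounds[name]
            p.clamp_(lo, hi)   # keep trojaned weights statistically close to benign ones
    return loss.item()
```

An alternative reading of the variance constraint is a soft penalty that matches each layer's parameter variance to the clean model's instead of hard clamping; either variant aims to make the trojaned weights indistinguishable from benign ones to a learned meta-classifier.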