Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks

arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/1805.12185

PDF

https://arxiv.org/pdf/1805.12185

Publication information

Authors: Kang Liu, Brendan Dolan-Gavitt, Siddharth Garg
Published: 2018-05-31
Affiliation: New York University
Country of affiliation: United States of America
Conference:

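Labels estimated by AI

Deep learning, Backdoor model detection, Attack methods

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".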

Abstract

Deep neural networks (DNNs) provide excellent performance across a wide range of classification tasks, but their training requires high computational resources and is often outsourced to third parties. Recent work has shown that outsourced training introduces the risk that a malicious trainer will return a backdoored DNN that behaves normally on most inputs but causes targeted misclassifications or degrades the accuracy of the network when a trigger known only to the attacker is present. In this paper, we provide the first effective defenses against backdoor attacks on DNNs. We implement three backdoor attacks from prior work and use them to investigate two promising defenses, pruning and fine-tuning. We show that neither, by itself, is sufficient to defend against sophisticated attackers. We then evaluate fine-pruning, a combination of pruning and fine-tuning, and show that it successfully weakens or even eliminates the backdoors, i.e., in some cases reducing the attack success rate to 0% with only a 0.4% drop in accuracy for clean (non-triggering) inputs. Our work provides the first step toward defenses against backdoor attacks in deep neural networks.
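The abstract names pruning as one half of fine-pruning and reports results as an attack success rate, but it does not say how neurons are selected for pruning. The sketch below is only an illustration under stated assumptions: it assumes channels are ranked by their mean activation on clean validation inputs and that the least active ones are pruned first. The function names `prune_mask` and `attack_success_rate`, the `prune_fraction` parameter, and the sample numbers are hypothetical and do not come from the paper. The fine-tuning step that would follow pruning is not shown.

```python
from statistics import mean


def prune_mask(clean_activations, prune_fraction):
    """Return a per-channel keep mask (True = keep, False = prune).

    clean_activations: list of activation vectors, one per clean validation
    input, each holding one value per channel.
    Assumption (not stated in the abstract): channels with the lowest mean
    activation on clean inputs are pruned first.
    """
    num_channels = len(clean_activations[0])
    channel_means = [
        mean(row[c] for row in clean_activations) for c in range(num_channels)
    ]
    num_to_prune = int(prune_fraction * num_channels)
    # Channel indices ordered from least to most active on clean data.
    order = sorted(range(num_channels), key=lambda c: channel_means[c])
    pruned = set(order[:num_to_prune])
    return [c not in pruned for c in range(num_channels)]


def attack_success_rate(predictions, target_label):
    """Fraction of triggered inputs classified as the attacker's target label."""
    return sum(p == target_label for p in predictions) / len(predictions)


# Hypothetical data: 2 clean inputs, 4 channels.
activations = [[0.0, 2.0, 0.1, 1.0], [0.0, 3.0, 0.3, 1.0]]
mask = prune_mask(activations, prune_fraction=0.5)
rate = attack_success_rate([7, 7, 1, 7], target_label=7)
```

In a real network the mask would be applied to a layer's channels, and the pruned model would then be fine-tuned on clean data. The attack success rate would be measured on triggered inputs before and after the defense.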

External datasets

YouTube Aligned Face dataset

speech recognition dataset

U.S. traffic signs dataset