As the curation of data for machine learning becomes increasingly automated,
dataset tampering is a mounting threat. Backdoor attackers tamper with training
data to embed a vulnerability in models that are trained on that data. This
vulnerability is then activated at inference time by placing a "trigger" into
the model's input. Typical backdoor attacks insert the trigger directly into
the training data, but this can make the attack visible upon
inspection. In contrast, the Hidden Trigger Backdoor Attack achieves poisoning
without placing a trigger into the training data at all. However, this hidden
trigger attack is ineffective at poisoning neural networks trained from
scratch. We develop a new hidden trigger attack, Sleeper Agent, which employs
gradient matching, data selection, and target model re-training during the
crafting process. Sleeper Agent is the first hidden trigger backdoor attack to
be effective against neural networks trained from scratch. We demonstrate its
effectiveness on ImageNet and in black-box settings. Our implementation code
can be found at https://github.com/hsouri/Sleeper-Agent.
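As a rough illustration of the gradient-matching idea underlying such crafting procedures, the PyTorch sketch below optimizes a small, bounded perturbation on clean-labeled poison images so that the training gradient they induce on a surrogate model aligns with the gradient of the attacker's objective (triggered inputs classified as the target label). The toy model, patch trigger, perturbation budget, and all variable names are illustrative assumptions, not the paper's implementation; the full method additionally performs poison data selection and periodic re-training of the surrogate model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# --- Illustrative setup: stand-ins for a real surrogate model and data ---
torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
poison_images = torch.rand(4, 3, 32, 32)    # images the attacker may perturb
poison_labels = torch.tensor([1, 1, 1, 1])  # their original (clean) labels
source_images = torch.rand(4, 3, 32, 32)    # held-out source-class images
target_labels = torch.tensor([7, 7, 7, 7])  # label desired at inference time
trigger = torch.zeros(3, 32, 32)
trigger[:, :4, :4] = 1.0                    # a simple patch trigger
triggered = (source_images + trigger).clamp(0, 1)
eps = 16 / 255                              # perturbation budget

def grad_match_loss(poisoned):
    """1 - cosine similarity between the gradient induced by the poisoned
    training batch and the gradient of the adversarial loss that pushes
    triggered inputs toward the target label."""
    params = [p for p in model.parameters() if p.requires_grad]
    adv = torch.autograd.grad(
        F.cross_entropy(model(triggered), target_labels), params)
    poi = torch.autograd.grad(
        F.cross_entropy(model(poisoned), poison_labels), params,
        create_graph=True)
    num = sum((a * p).sum() for a, p in zip(adv, poi))
    den = (sum(a.pow(2).sum() for a in adv).sqrt() *
           sum(p.pow(2).sum() for p in poi).sqrt() + 1e-12)
    return 1 - num / den

# Optimize a bounded perturbation on the poison images by descending the
# gradient-matching loss; the labels of the poison images never change.
delta = torch.zeros_like(poison_images, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)
for _ in range(50):
    opt.zero_grad()
    loss = grad_match_loss((poison_images + delta).clamp(0, 1))
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)  # keep the perturbation imperceptible
```

After crafting, the perturbed poison images would be injected into the victim's training set; a model trained on them is then expected to misclassify triggered source-class inputs at test time while behaving normally on clean data.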