High-performance Deep Neural Networks (DNNs) are increasingly deployed in
many real-world applications, e.g., cloud prediction APIs. Recent advances in
model functionality stealing attacks via black-box access (i.e., inputs in,
predictions out) threaten the business model of such applications, which
require substantial time, money, and effort to develop. Existing defenses take a
passive role against stealing attacks, such as by truncating predicted
information. We find such passive defenses ineffective against DNN stealing
attacks. In this paper, we propose the first defense that actively perturbs
predictions, aiming to poison the training objective of the attacker. We
find our defense effective across a wide range of challenging datasets and DNN
model stealing attacks, and show that it additionally outperforms existing defenses. Our
defense is the first that can withstand highly accurate model stealing attacks
for tens of thousands of queries, amplifying the attacker's error rate by up to a
factor of 85$\times$ with minimal impact on the utility for benign users.
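
To make the mechanism concrete, below is a minimal illustrative sketch in Python of one simple instantiation of active prediction perturbation; it is an assumption on our part, not the perturbation objective actually optimized in the paper. The defender returns a perturbed posterior that keeps the top-1 label intact for benign users while moving the remaining probability mass, on which the attacker's training signal depends, toward an uninformative uniform distribution under an $L_1$ budget. The function name \texttt{perturb\_posterior} and the budget parameter \texttt{eps} are hypothetical.

\begin{verbatim}
# Illustrative sketch only: a simple active perturbation of the
# predicted posterior. This is a hypothetical simplification, not
# the perturbation objective optimized in the paper.
import numpy as np

def perturb_posterior(y: np.ndarray, eps: float = 0.5) -> np.ndarray:
    """Return y_tilde with argmax(y_tilde) == argmax(y) and
    ||y_tilde - y||_1 <= eps (utility constraint for benign users)."""
    k = int(y.argmax())
    # Misleading target: keep the winning class's probability, spread
    # the remaining mass uniformly so the attacker's training signal
    # carries less information about the victim model.
    target = np.full_like(y, (1.0 - y[k]) / (y.size - 1))
    target[k] = y[k]
    direction = target - y
    l1 = np.abs(direction).sum()
    step = min(1.0, eps / l1) if l1 > 0 else 0.0  # respect L1 budget
    y_tilde = y + step * direction
    return y_tilde / y_tilde.sum()  # stay on the probability simplex

if __name__ == "__main__":
    y = np.array([0.70, 0.20, 0.07, 0.03])
    print(perturb_posterior(y, eps=0.5))  # [0.7, 0.1, 0.1, 0.1]
\end{verbatim}

Because the top-1 label is preserved, a benign user who acts only on the predicted class is unaffected, while an attacker training a surrogate model on the full posteriors receives a degraded signal.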