Machine learning models are now widely deployed in real-world applications.
However, the existence of adversarial examples has long been considered a real
threat to such models. While numerous defenses aiming to improve robustness
have been proposed, many have been shown to be ineffective. As these vulnerabilities
are still nowhere near being eliminated, we propose an alternative
deployment-based defense paradigm that goes beyond the traditional white-box
and black-box threat models. Instead of training a single partially-robust
model, one could train a set of same-functionality, yet adversarially-disjoint,
models with minimal inter-model attack transferability. These models could then
be randomly and individually deployed, such that accessing one of them
minimally affects the others. Our experiments on CIFAR-10 and a wide range of
attacks show that we achieve a significantly lower attack transferability
across our disjoint models compared to an ensemble-diversity baseline. In
addition, compared to an adversarially trained set, we achieve a higher average
robust accuracy while maintaining the accuracy on clean examples.
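
As a rough illustration of the idea (the abstract does not spell out the training objective), the sketch below jointly trains two models on the same task while penalizing the cosine similarity of their input gradients, one generic way to discourage attacks crafted on one model from transferring to the other. The names (`disjointness_penalty`, `joint_step`, `lam`) and the specific penalty are hypothetical assumptions for illustration, not necessarily the paper's loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def disjointness_penalty(model_a, model_b, x, y):
    # Cosine similarity between the two models' loss gradients w.r.t. the input;
    # driving this down discourages shared adversarial directions (an assumed proxy
    # for low attack transferability).
    x = x.clone().requires_grad_(True)
    grad_a = torch.autograd.grad(F.cross_entropy(model_a(x), y), x, create_graph=True)[0]
    grad_b = torch.autograd.grad(F.cross_entropy(model_b(x), y), x, create_graph=True)[0]
    return F.cosine_similarity(grad_a.flatten(1), grad_b.flatten(1), dim=1).mean()

def joint_step(model_a, model_b, optimizer, x, y, lam=1.0):
    # One update: classification loss for both models plus the disjointness term.
    optimizer.zero_grad()
    task_loss = F.cross_entropy(model_a(x), y) + F.cross_entropy(model_b(x), y)
    loss = task_loss + lam * disjointness_penalty(model_a, model_b, x, y)
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Toy usage on random data; in the paper's setting these would be CIFAR-10 classifiers.
    model_a, model_b = nn.Linear(32, 10), nn.Linear(32, 10)
    opt = torch.optim.SGD(list(model_a.parameters()) + list(model_b.parameters()), lr=0.1)
    x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
    print(joint_step(model_a, model_b, opt, x, y))
```

At deployment, each user or endpoint would then be served one model from the trained set at random, so that access to (or compromise of) one model yields adversarial examples that transfer poorly to the others.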