Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

arxiv

AI Security Portal bot

The information in this literature database is collected automatically.

Source

https://arxiv.org/abs/2404.19640

PDF

https://arxiv.org/pdf/2404.19640

Bibliographic information

Authors: Yunzhen Feng; Tim G. J. Rudner; Nikolaos Tsilivis; Julia Kempe
Published: 2024-04-27
Affiliation: New York University
Country of affiliation: United States of America
Venue: Trans. Mach. Learn. Res.

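Labels estimated by AI

Adversarial examples

Uncertainty quantification

Watermark evaluation

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".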

Abstract

Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations. In this work, we examine this claim. To study the adversarial robustness of BNNs, we investigate whether it is possible to successfully break state-of-the-art BNN inference methods and prediction pipelines using even relatively unsophisticated attacks for three tasks: (1) label prediction under the posterior predictive mean, (2) adversarial example detection with Bayesian predictive uncertainty, and (3) semantic shift detection. We find that BNNs trained with state-of-the-art approximate inference methods, and even BNNs trained with Hamiltonian Monte Carlo, are highly susceptible to adversarial attacks. We also identify various conceptual and experimental errors in previous works that claimed inherent adversarial robustness of BNNs and conclusively demonstrate that BNNs and uncertainty-aware Bayesian prediction pipelines are not inherently robust against adversarial attacks.
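To make task (1) in the abstract concrete, the following is a minimal sketch of a single-step sign-gradient (FGSM-style) attack on a posterior predictive mean. It is not the authors' code or experimental setup. It assumes a toy Bayesian logistic regression whose "posterior" is a set of hypothetical weight samples drawn around a fixed mean; the names `predictive_mean`, `fgsm_on_predictive_mean`, `w_mean`, and the value of `eps` are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predictive_mean(x, weight_samples):
    """Posterior predictive mean p(y=1 | x), averaged over weight samples."""
    return sigmoid(weight_samples @ x).mean()

def fgsm_on_predictive_mean(x, y, weight_samples, eps):
    """One sign-gradient step that increases the negative log-likelihood
    of the true label y under the posterior predictive mean."""
    probs = sigmoid(weight_samples @ x)            # per-sample p(y=1 | x, w_s)
    p_bar = probs.mean()                           # predictive mean
    # Gradient of p_bar w.r.t. x: mean over samples of sigmoid'(w_s . x) * w_s
    dp_dx = ((probs * (1.0 - probs))[:, None] * weight_samples).mean(axis=0)
    # NLL is -log(p_bar) for y == 1 and -log(1 - p_bar) for y == 0
    grad = -dp_dx / p_bar if y == 1 else dp_dx / (1.0 - p_bar)
    return x + eps * np.sign(grad)

# Hypothetical stand-in for posterior samples (assumption, not real inference)
rng = np.random.default_rng(0)
w_mean = np.array([2.0, -1.5, 1.0])
weight_samples = w_mean + 0.1 * rng.standard_normal((50, 3))

x, y = np.array([0.5, -0.5, 0.5]), 1
x_adv = fgsm_on_predictive_mean(x, y, weight_samples, eps=0.6)

p_clean = predictive_mean(x, weight_samples)
p_adv = predictive_mean(x_adv, weight_samples)
print(f"p(y=1) clean: {p_clean:.3f}, adversarial: {p_adv:.3f}")
```

The gradient is taken through the averaged prediction rather than through any single weight sample, so the attack targets the same quantity the Bayesian pipeline uses for label prediction. The perturbation budget here is deliberately large for a three-dimensional toy input and says nothing about the budgets used in the paper.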

External datasets

MNIST

FashionMNIST

CIFAR-10

SVHN