The existence of adversarial examples underscores the importance of
understanding the robustness of machine learning models. Bayesian neural
networks (BNNs), due to their calibrated uncertainty, have been shown to possess
favorable adversarial robustness properties. However, when approximate Bayesian
inference methods are employed, the adversarial robustness of BNNs is still not
well understood. In this work, we employ gradient-free optimization methods to
find adversarial examples for BNNs. In particular, we consider genetic
algorithms, surrogate models, and zeroth-order optimization methods, adapting
each to the task of attacking BNNs. In an
empirical evaluation on the MNIST and Fashion MNIST datasets, we show that,
across several approximate Bayesian inference methods, gradient-free algorithms
can substantially increase the rate at which adversarial examples are found
compared to state-of-the-art gradient-based methods.
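To make the setting concrete, the following is a minimal sketch of one of the attack families the abstract names: a zeroth-order optimization attack, which estimates the loss gradient purely from black-box queries. The toy logistic model, the loss, and all hyperparameters (step size, smoothing parameter, query budget) are illustrative assumptions, not the paper's actual setup.

```python
# Sketch of a zeroth-order (gradient-free) adversarial attack: the attacker
# only queries the model's loss value, never its gradients. The model below
# is a toy stand-in, not a BNN; all hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy "black-box" classifier: a fixed linear-logistic model.
w = rng.normal(size=16)

def loss(x, y):
    # Cross-entropy of the logistic model; the attacker may only evaluate this.
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def zoo_attack(x, y, eps=0.3, step=0.05, iters=50, mu=1e-3, n_dirs=20):
    """Maximize the loss using gradient estimates from random finite differences."""
    x_adv = x.copy()
    for _ in range(iters):
        g = np.zeros_like(x_adv)
        for _ in range(n_dirs):
            u = rng.normal(size=x_adv.shape)
            # Directional finite difference approximates the gradient along u.
            g += (loss(x_adv + mu * u, y) - loss(x_adv, y)) / mu * u
        g /= n_dirs
        x_adv = x_adv + step * np.sign(g)          # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # stay in the L-inf ball
    return x_adv

x = rng.normal(size=16)
y = 1.0
x_adv = zoo_attack(x, y)
```

The same query-only interface is what makes such methods applicable when gradients through an approximate Bayesian posterior are noisy or unavailable; the genetic-algorithm and surrogate-model attacks mentioned above likewise need only loss evaluations.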