How to beat a Bayesian adversary

arXiv

AI Security Portal bot

The information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2407.08678

PDF

https://arxiv.org/pdf/2407.08678

Bibliographic information

Authors: Zihan Ding; Kexin Jin; Jonas Latz; Chenguang Liu
Published: 2024-07-12
Affiliation: Department of Electrical and Computer Engineering, Princeton University
Country of affiliation: United States of America
Venue: Computing Research Repository (CoRR)

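Labels estimated by AI

Optimization problem; Adversarial training; Convergence analysis

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".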

Abstract

Deep neural networks and other modern machine learning models are often susceptible to adversarial attacks. Indeed, an adversary may often be able to change a model's prediction through a small, directed perturbation of the model's input - an issue in safety-critical applications. Adversarially robust machine learning is usually based on a minmax optimisation problem that minimises the machine learning loss under maximisation-based adversarial attacks. In this work, we study adversaries that determine their attack using a Bayesian statistical approach rather than maximisation. The resulting Bayesian adversarial robustness problem is a relaxation of the usual minmax problem. To solve this problem, we propose Abram - a continuous-time particle system that shall approximate the gradient flow corresponding to the underlying learning problem. We show that Abram approximates a McKean-Vlasov process and justify the use of Abram by giving assumptions under which the McKean-Vlasov process finds the minimiser of the Bayesian adversarial robustness problem. We discuss two ways to discretise Abram and show its suitability in benchmark adversarial deep learning experiments.
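
The abstract contrasts a maximisation-based adversary with one that chooses its attack by a Bayesian statistical approach, and calls the resulting problem a relaxation of the minmax problem. The sketch below is only a toy illustration of that idea, not the paper's Abram method. This page does not give the adversary's distribution, so the sketch assumes that perturbations are weighted in proportion to exp(gamma * loss) over a bounded interval. The 1-D model, the function names, and the parameters `eps` and `gamma` are all hypothetical. Under that assumption, the weighted expected loss never exceeds the worst-case loss and approaches it as `gamma` grows.

```python
import math


def loss(theta, x, y):
    """Squared error of a hypothetical 1-D linear model theta * x."""
    return (theta * x - y) ** 2


def perturbation_grid(eps, n=201):
    """Evenly spaced candidate perturbations in [-eps, eps]."""
    return [-eps + 2 * eps * i / (n - 1) for i in range(n)]


def worst_case_loss(theta, x, y, eps):
    """Maximisation-based adversary: the largest loss over the grid."""
    return max(loss(theta, x + xi, y) for xi in perturbation_grid(eps))


def bayesian_adversary_loss(theta, x, y, eps, gamma):
    """Expected loss when perturbations are weighted by exp(gamma * loss).

    ASSUMPTION: this exponential weighting is an illustrative choice;
    it is not taken from this page.
    """
    losses = [loss(theta, x + xi, y) for xi in perturbation_grid(eps)]
    top = max(losses)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(gamma * (value - top)) for value in losses]
    return sum(w * v for w, v in zip(weights, losses)) / sum(weights)


if __name__ == "__main__":
    theta, x, y, eps = 1.5, 1.0, 1.0, 0.1
    print("worst case:", worst_case_loss(theta, x, y, eps))
    for gamma in (0.0, 1.0, 10.0, 100.0, 1000.0):
        print(gamma, bayesian_adversary_loss(theta, x, y, eps, gamma))
```

With `gamma = 0` the weights are uniform, so the adversary is a purely random perturbation. Larger values concentrate the weight on the most damaging perturbations, which is one way to read the word "relaxation" in the abstract.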

External datasets

MNIST

CIFAR10