Theoretically Principled Trade-off for Stateful Defenses against Query-Based Black-Box Attacks


arxiv

The information in this literature database is collected automatically.

Source

https://arxiv.org/abs/2307.16331

PDF

https://arxiv.org/pdf/2307.16331

Bibliographic Information

Authors: Ashish Hooda; Neal Mangaokar; Ryan Feng; Kassem Fawaz; Somesh Jha; Atul Prakash
Published: 2023-07-31
Affiliation: University of Wisconsin-Madison
Country of affiliation: United States of America
Venue: Computing Research Repository (CoRR)

AI-Estimated Labels

Adversarial spectrum attack detection, Cybersecurity, Watermark robustness

* These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".

Abstract

Adversarial examples threaten the integrity of machine learning systems with alarming success rates even under constrained black-box conditions. Stateful defenses have emerged as an effective countermeasure, detecting potential attacks by maintaining a buffer of recent queries and detecting new queries that are too similar. However, these defenses fundamentally pose a trade-off between attack detection and false positive rates, and this trade-off is typically optimized by hand-picking feature extractors and similarity thresholds that empirically work well. There is little current understanding as to the formal limits of this trade-off and the exact properties of the feature extractors/underlying problem domain that influence it. This work aims to address this gap by offering a theoretical characterization of the trade-off between detection and false positive rates for stateful defenses. We provide upper bounds for detection rates of a general class of feature extractors and analyze the impact of this trade-off on the convergence of black-box attacks. We then support our theoretical findings with empirical evaluations across multiple datasets and stateful defenses.
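
The buffer-and-threshold mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `StatefulDetector`, the identity feature extractor, the Euclidean distance, and the threshold and buffer-size values are all assumptions chosen for illustration.

```python
from collections import deque
import math


class StatefulDetector:
    """Hypothetical stateful defense: flag queries too close to recent ones."""

    def __init__(self, threshold, buffer_size=1000, extract=None):
        # Assumed similarity threshold; real defenses tune this empirically.
        self.threshold = threshold
        # Bounded buffer of recent query features; the oldest entry is evicted first.
        self.buffer = deque(maxlen=buffer_size)
        # Placeholder feature extractor (identity); an assumption, not the paper's choice.
        self.extract = extract or (lambda x: tuple(x))

    def is_attack(self, query):
        """Return True if the query is within `threshold` of any buffered query."""
        features = self.extract(query)
        flagged = any(
            math.dist(features, seen) < self.threshold for seen in self.buffer
        )
        # Store the query regardless of the verdict so later queries are compared to it.
        self.buffer.append(features)
        return flagged


detector = StatefulDetector(threshold=0.1)
first = detector.is_attack([0.0, 0.0])     # empty buffer, nothing to compare against
second = detector.is_attack([0.01, 0.0])   # very close to the first query
third = detector.is_attack([1.0, 1.0])     # far from both earlier queries
```

In this sketch, raising `threshold` flags more near-duplicate queries but also more benign ones, while lowering it does the opposite. That is the detection versus false positive trade-off the abstract refers to.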