A Frank-Wolfe Framework for Efficient and Effective Adversarial Attacks


arxiv

AI Security Portal bot

The information in the literature database is collected automatically.

Source

https://arxiv.org/abs/1811.10828

PDF

https://arxiv.org/pdf/1811.10828

Paper Information

Authors: Jinghui Chen, Dongruo Zhou, Jinfeng Yi, Quanquan Gu
Published: 2018-11-27
Updated: 2019-09-15
Affiliation: Department of Computer Science, University of Virginia
Country of affiliation: United States of America
Conference: AAAI Conference on Artificial Intelligence (AAAI)

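Labels Estimated by AI

Detection of backdoored models; Selection and evaluation of optimization algorithms; Model performance evaluation

* These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".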

Abstract

Depending on how much information an adversary can access, adversarial attacks can be classified as white-box or black-box attacks. For white-box attacks, optimization-based algorithms such as projected gradient descent (PGD) can achieve relatively high attack success rates within a moderate number of iterations. However, they tend to generate adversarial examples near or on the boundary of the perturbation set, resulting in large distortion. Furthermore, their corresponding black-box attack algorithms suffer from high query complexity, which limits their practical usefulness. In this paper, we focus on the problem of developing efficient and effective optimization-based adversarial attack algorithms. In particular, we propose a novel adversarial attack framework for both white-box and black-box settings based on a variant of the Frank-Wolfe algorithm. We show in theory that the proposed attack algorithms are efficient, with an $O(1/\sqrt{T})$ convergence rate. Empirical results from attacks on the ImageNet and MNIST datasets also verify the efficiency and effectiveness of the proposed algorithms. More specifically, our proposed algorithms attain the best attack performance in both white-box and black-box settings among all baselines, and are more time- and query-efficient than the state of the art.
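
The abstract names the Frank-Wolfe algorithm but gives no pseudocode, so the following is only a minimal sketch of a plain, textbook Frank-Wolfe step over an L-infinity perturbation set. It is not the variant proposed in the paper. The function names (`lmo_linf`, `frank_wolfe_attack`), the fixed step size `gamma`, and the toy linear classifier are all assumptions made for illustration. The sketch shows the property that distinguishes Frank-Wolfe from PGD: each iterate is a convex combination of feasible points, so no projection step is needed.

```python
import numpy as np


def lmo_linf(grad, x_orig, eps):
    """Linear minimization oracle for the L-inf ball of radius eps around x_orig.

    Returns the point v in the ball that minimizes the inner product <grad, v>.
    """
    return x_orig - eps * np.sign(grad)


def frank_wolfe_attack(grad_fn, x_orig, eps, steps=20, gamma=0.5):
    """Minimize an attack loss inside the L-inf ball with plain Frank-Wolfe.

    grad_fn(x) must return the gradient of the loss being minimized.
    gamma is a fixed step size in (0, 1] (an assumption for this sketch).
    """
    x = x_orig.copy()
    for _ in range(steps):
        v = lmo_linf(grad_fn(x), x_orig, eps)
        # Convex combination of two feasible points: x stays inside the ball,
        # so no projection is required.
        x = x + gamma * (v - x)
    return x


# Hypothetical toy model: a linear score w.x + b, class "positive" if score > 0.
w = np.array([1.0, -2.0, 0.5])
b = 0.1
x0 = np.array([0.2, -0.1, 0.3])
eps = 0.3


def score(x):
    return float(w @ x + b)


def grad_fn(x):
    # Gradient of the score with respect to x; minimizing the score
    # pushes the input toward the negative class.
    return w


x_adv = frank_wolfe_attack(grad_fn, x0, eps)
print("clean score:", score(x0))
print("adversarial score:", score(x_adv))
print("L-inf distortion:", np.max(np.abs(x_adv - x0)))
```

In this toy setting the gradient is constant, so the iterates move steadily toward a single vertex of the ball. With a real network the gradient changes at every step, and the iterate is a weighted average of several vertices, which is one way to picture how a Frank-Wolfe iterate can stay in the interior of the perturbation set.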