Improving Black-box Adversarial Attacks with a Transfer-based Prior

Authors: Shuyu Cheng, Yinpeng Dong, Tianyu Pang, Hang Su, Jun Zhu | Published: 2019-06-17 | Updated: 2020-07-26

2019.06.172025.05.28

Authors: Shuyu Cheng, Yinpeng Dong, Tianyu Pang, Hang Su, Jun Zhu
Published: 2019-06-17 | Updated: 2020-07-26

Source: https://arxiv.org/abs/1906.06919

PDF: https://arxiv.org/pdf/1906.06919

Labels Predicted by AI

Adversarial Perturbation Techniques Poisoning Optimization Problem

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.

Abstract

We consider the black-box adversarial setting, where the adversary has to generate adversarial perturbations without access to the target models to compute gradients. Previous methods tried to approximate the gradient either by using a transfer gradient of a surrogate white-box model, or based on the query feedback. However, these methods often suffer from low attack success rates or poor query efficiency since it is non-trivial to estimate the gradient in a high-dimensional space with limited information. To address these problems, we propose a prior-guided random gradient-free (P-RGF) method to improve black-box adversarial attacks, which takes the advantage of a transfer-based prior and the query information simultaneously. The transfer-based prior given by the gradient of a surrogate model is appropriately integrated into our algorithm by an optimal coefficient derived by a theoretical analysis. Extensive experiments demonstrate that our method requires much fewer queries to attack black-box models with higher success rates compared with the alternative state-of-the-art methods.