Spanning Attack: Reinforce Black-box Attacks with Unlabeled Data

Authors: Lu Wang, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, Yuan Jiang | Published: 2020-05-11 | Updated: 2020-11-10

2020.05.112025.05.28

Authors: Lu Wang, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, Yuan Jiang
Published: 2020-05-11 | Updated: 2020-11-10

Source: https://arxiv.org/abs/2005.04871

PDF: https://arxiv.org/pdf/2005.04871

Labels Predicted by AI

Adversarial Training Algorithm Attack Detection

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.

Abstract

Adversarial black-box attacks aim to craft adversarial perturbations by querying input-output pairs of machine learning models. They are widely used to evaluate the robustness of pre-trained models. However, black-box attacks often suffer from the issue of query inefficiency due to the high dimensionality of the input space, and therefore incur a false sense of model robustness. In this paper, we relax the conditions of the black-box threat model, and propose a novel technique called the spanning attack. By constraining adversarial perturbations in a low-dimensional subspace via spanning an auxiliary unlabeled dataset, the spanning attack significantly improves the query efficiency of a wide variety of existing black-box attacks. Extensive experiments show that the proposed method works favorably in both soft-label and hard-label black-box attacks. Our code is available at https://github.com/wangwllu/spanning_attack.