These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
This paper studies the challenging black-box adversarial attack that aims to
generate adversarial examples against a black-box model by only using output
feedback of the model to input queries. Some previous methods improve the query
efficiency by incorporating the gradient of a surrogate white-box model into
query-based attacks due to the adversarial transferability. However, the
localized gradient is not informative enough, making these methods still
query-intensive. In this paper, we propose a Prior-guided Bayesian Optimization
(P-BO) algorithm that leverages the surrogate model as a global function prior
in black-box adversarial attacks. As the surrogate model contains rich prior
information of the black-box one, P-BO models the attack objective with a
Gaussian process whose mean function is initialized as the surrogate model's
loss. Our theoretical analysis on the regret bound indicates that the
performance of P-BO may be affected by a bad prior. Therefore, we further
propose an adaptive integration strategy to automatically adjust a coefficient
on the function prior by minimizing the regret bound. Extensive experiments on
image classifiers and large vision-language models demonstrate the superiority
of the proposed algorithm in reducing queries and improving attack success
rates compared with the state-of-the-art black-box attacks. Code is available
at https://github.com/yibo-miao/PBO-Attack.