Abstract
Model stealing (MS) involves querying and observing the output of a machine
learning model to steal its capabilities. The quality of queried data is
crucial, yet obtaining a large amount of real data for MS is often challenging.
Recent works have reduced reliance on real data by using generative models.
However, when high-dimensional query data is required, these methods are
impractical due to the high costs of querying and the risk of model collapse.
In this work, we propose using sample gradients (SG) to enhance the utility of
each real sample, as SG provides crucial guidance on the decision boundaries of
the victim model. However, utilizing SG in the model stealing scenario faces
two challenges: (1) pixel-level gradient estimation requires an extensive
query volume and is susceptible to defenses, and (2) sample gradient
estimates have high variance. This paper proposes Superpixel Sample Gradient
stealing (SPSG) for model stealing under the constraint of limited real
samples. With the basic idea of imitating the victim model's low-variance
patch-level gradients instead of pixel-level gradients, SPSG achieves efficient
sample gradient estimation through two steps. First, we perform patch-wise
perturbations on query images to estimate the average gradient in different
regions of the image. Then, we filter the gradients through a threshold
strategy to reduce variance. Extensive experiments demonstrate that, with the
same number of real samples, SPSG achieves accuracy, agreement, and
adversarial success rates that significantly surpass those of current
state-of-the-art MS methods. Code is available at
https://github.com/zyl123456aB/SPSG_attack.
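The two-step estimation described above (patch-wise perturbation followed by threshold filtering) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `query_fn` is a hypothetical black-box oracle returning a scalar score from the victim model, and the patch size, perturbation magnitude, and threshold are placeholder values.

```python
import numpy as np

def patch_gradient_estimate(query_fn, image, patch=8, eps=0.05, thresh=0.1):
    """Estimate a patch-level gradient of a black-box scalar output.

    query_fn: hypothetical oracle, image (H, W) -> scalar (e.g. the victim's
    confidence for a target class). Each patch is perturbed as a whole, so
    one query pair covers all pixels in the patch instead of one pixel.
    """
    H, W = image.shape
    grad = np.zeros_like(image, dtype=float)
    base = query_fn(image)                       # unperturbed reference query
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            pert = image.astype(float).copy()
            pert[i:i+patch, j:j+patch] += eps    # perturb the whole patch
            npix = pert[i:i+patch, j:j+patch].size
            # forward difference -> average per-pixel gradient over the patch
            g = (query_fn(pert) - base) / (eps * npix)
            grad[i:i+patch, j:j+patch] = g
    # threshold filtering: discard low-magnitude, high-variance estimates
    grad[np.abs(grad) < thresh] = 0.0
    return grad
```

With a `p × p` patch, one image needs roughly `(H/p) * (W/p)` queries instead of `H * W`, which is the query-efficiency argument made in the abstract.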