Abstract
The popularity of Machine Learning as a Service (MLaaS) has led to increased
concerns about Model Stealing Attacks (MSA), which aim to craft a clone model
by querying MLaaS. Currently, most research on MSA assumes that MLaaS can
provide soft labels and that the attacker has a proxy dataset with a similar
distribution. However, this assumption fails to capture the more practical
scenario in which MLaaS returns only hard labels and the data distribution is
unknown to the attacker. Furthermore, most existing work focuses solely on
stealing model accuracy and neglects model robustness, even though robustness
is essential in security-sensitive scenarios such as face-scan payment.
Notably, improving model robustness often requires expensive techniques such
as adversarial training, which makes stealing robustness an even more
lucrative prospect. To address these gaps, we introduce a novel Data-Free
Hard-Label Robustness Stealing (DFHL-RS) attack in this paper, which steals
both model accuracy and robustness by merely querying the target model for
hard labels, without the help of any natural data.
Comprehensive experiments demonstrate the effectiveness of our method. On the
CIFAR-10 dataset, the clone model achieves a clean accuracy of 77.86% and a
robust accuracy of 39.51% against AutoAttack, only 4.71% and 8.40% lower than
those of the target model, significantly outperforming the baselines. Our code
is
available at: https://github.com/LetheSec/DFHL-RS-Attack.
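For intuition, the following is a minimal, hypothetical PyTorch sketch of the general recipe the abstract describes: a generator synthesizes query inputs (no natural data is used), the black-box target is queried only for hard labels, and the clone is trained on those labels together with adversarial examples crafted on the clone itself, so that some robustness is transferred as well. The architectures, hyperparameters, and helper names (oracle, fgsm) are illustrative assumptions, not the actual DFHL-RS algorithm; the authors' implementation is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins: any classifiers work for the sketch. In a real
# attack, `target` would be a deployed MLaaS model behind a query API.
target = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
clone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
generator = nn.Sequential(nn.Linear(100, 3 * 32 * 32), nn.Tanh())

opt_clone = torch.optim.SGD(clone.parameters(), lr=0.1, momentum=0.9)
opt_gen = torch.optim.Adam(generator.parameters(), lr=1e-3)


def oracle(x: torch.Tensor) -> torch.Tensor:
    """Black-box query: only the hard (argmax) label comes back."""
    with torch.no_grad():
        return target(x).argmax(dim=1)


def fgsm(model, x, y, eps=8 / 255):
    """One-step attack on the clone, used to transfer robustness."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(-1, 1).detach()


for step in range(200):
    z = torch.randn(64, 100)

    # (1) Generator step: synthesize queries on which the clone disagrees
    # with the oracle's hard labels. The labels are constants, so the
    # gradient flows only through the clone, never through the oracle.
    x = generator(z).view(-1, 3, 32, 32)
    y = oracle(x)
    gen_loss = -F.cross_entropy(clone(x), y)
    opt_gen.zero_grad()
    gen_loss.backward()
    opt_gen.step()

    # (2) Clone step: fit the oracle's hard labels on both the clean
    # synthetic batch and its adversarial counterpart, so the clone
    # inherits accuracy and (some) robustness. Note this costs a second
    # oracle query per step; a real attack would budget queries carefully.
    x = generator(z).view(-1, 3, 32, 32).detach()
    y = oracle(x)
    x_adv = fgsm(clone, x, y)
    clone_loss = F.cross_entropy(clone(x), y) + F.cross_entropy(clone(x_adv), y)
    opt_clone.zero_grad()
    clone_loss.backward()
    opt_clone.step()
```

One design point the hard-label setting forces: no gradient can pass through the oracle, so the generator's disagreement objective must be driven entirely through the clone, with the queried labels treated as constants. The paper's actual generator objective and robustness-transfer loss may differ; consult the repository above for the definitive implementation.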