CaBaGe: Data-Free Model Extraction using ClAss BAlanced Generator Ensemble

TOP Literature Database CaBaGe: Data-Free Model Extraction using ClAss BAlanced Generator Ensemble

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2409.10643

PDF

https://arxiv.org/pdf/2409.10643

Paper Information

Author: Jonathan Rosenthal;Shanchao Liang;Kevin Zhang;Lin Tan
Published: 9-17-2024
Affiliation: Purdue University
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Model Extraction Attack Training Data Extraction Method Dataset Generation

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Machine Learning as a Service (MLaaS) is often provided as a pay-per-query, black-box system to clients. Such a black-box approach not only hinders open replication, validation, and interpretation of model results, but also makes it harder for white-hat researchers to identify vulnerabilities in the MLaaS systems. Model extraction is a promising technique to address these challenges by reverse-engineering black-box models. Since training data is typically unavailable for MLaaS models, this paper focuses on the realistic version of it: data-free model extraction. We propose a data-free model extraction approach, CaBaGe, to achieve higher model extraction accuracy with a small number of queries. Our innovations include (1) a novel experience replay for focusing on difficult training samples; (2) an ensemble of generators for steadily producing diverse synthetic data; and (3) a selective filtering process for querying the victim model with harder, more balanced samples. In addition, we create a more realistic setting, for the first time, where the attacker has no knowledge of the number of classes in the victim training data, and create a solution to learn the number of classes on the fly. Our evaluation shows that CaBaGe outperforms existing techniques on seven datasets -- MNIST, FMNIST, SVHN, CIFAR-10, CIFAR-100, ImageNet-subset, and Tiny ImageNet -- with an accuracy improvement of the extracted models by up to 43.13%. Furthermore, the number of queries required to extract a clone model matching the final accuracy of prior work is reduced by up to 75.7%.

External Datasets

MNIST

FMNIST

SVHN

CIFAR-10

CIFAR-100

ImageNet-subset

Tiny ImageNet