Abstract
With increasingly more data and computation involved in their training,
machine learning models constitute valuable intellectual property. This has
spurred interest in model stealing, which is made more practical by advances in
learning with partial, little, or no supervision. Existing defenses focus on
inserting unique watermarks in a model's decision surface, but this is
insufficient: the watermarks are not sampled from the training distribution and
thus are not always preserved during model stealing. In this paper, we make the
key observation that knowledge contained in the stolen model's training set is
what is common to all stolen copies. The adversary's goal, irrespective of the
attack employed, is always to extract this knowledge or its by-products. This
gives the original model's owner a strong advantage over the adversary: model
owners have access to the original training data. We thus introduce dataset
inference, the process of identifying whether a suspected model copy has
private knowledge from the original model's dataset, as a defense against model
stealing. We develop an approach for dataset inference that combines
statistical testing with the ability to estimate the distance of multiple data
points to the decision boundary. Our experiments on CIFAR10, SVHN, CIFAR100 and
ImageNet show that model owners can claim with confidence greater than 99% that
their model (or dataset, for that matter) was stolen, while exposing only
50 of the stolen model's training points. Dataset inference defends against
state-of-the-art attacks even when the adversary is adaptive. Unlike prior
work, it does not require retraining or overfitting the defended model.
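The core test described above can be sketched as a one-sided hypothesis test on decision-boundary margins: if a suspect model assigns systematically larger margins to the victim's private training points than to held-out points, the owner gains statistical evidence of theft. The sketch below is illustrative only, under assumed margin values; the helper `one_sided_t_test` and the simulated margin distributions are hypothetical, not the paper's actual implementation.

```python
import math
import random

def one_sided_t_test(a, b):
    """Welch's t-test with H1: mean(a) > mean(b).
    Returns (t statistic, approximate one-sided p-value).
    Uses a large-sample normal approximation for the p-value."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    t = (ma - mb) / math.sqrt(va / na + vb / nb)
    p = 0.5 * math.erfc(t / math.sqrt(2))
    return t, p

random.seed(0)
# Simulated "distance to decision boundary" scores: the premise is that a
# stolen copy retains larger margins on the victim's private training points.
# These distributions are made up for illustration.
private = [random.gauss(1.2, 0.4) for _ in range(50)]  # 50 exposed train points
public = [random.gauss(0.8, 0.4) for _ in range(50)]   # held-out points

t, p = one_sided_t_test(private, public)
print(f"t = {t:.2f}, p = {p:.2e}")
```

With 50 points per group and a clear margin gap, the p-value falls well below 0.01, matching the abstract's claim that ownership can be asserted with greater than 99% confidence from only 50 exposed training points.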