Abstract
Machine learning models are vulnerable to membership inference attacks, in
which an adversary aims to predict whether a particular sample was contained
in the target model's training dataset. Existing attack methods commonly
exploit output information (mostly losses) solely from the given target
model. As a result, in practical scenarios where both member and non-member
samples yield similarly small losses, these methods are inherently unable to
differentiate between them. To address this limitation, in
this paper we propose a new attack method, called \system, which exploits
membership information from the entire training process of the target model
to improve attack performance. To mount the attack in the common black-box
setting, we leverage knowledge distillation and represent the membership
information by the losses evaluated on a sequence of intermediate models at
different distillation epochs, namely the \emph{distilled loss trajectory},
together with the loss from the given target model.
Experimental results over different datasets and model architectures
demonstrate the clear advantage of our attack across different metrics. For
example, on CINIC-10, our attack achieves a true-positive rate at least
6$\times$ higher than that of existing methods at a low false-positive rate
of 0.1\%. Further analysis demonstrates the general effectiveness of our
attack in stricter scenarios.
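As a concrete illustration of the distilled loss trajectory, the sketch below distills the target model into a student and records the student's loss on a query sample after each distillation epoch. This is a minimal PyTorch sketch of the idea only: the function name, the KL-based soft-label distillation objective, and the SGD hyperparameters are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F


def distilled_loss_trajectory(target_model, student, distill_loader,
                              query_x, query_y, epochs=10, lr=0.1):
    """Distill target_model into student and record the student's loss
    on the query sample after every distillation epoch (hypothetical helper)."""
    optimizer = torch.optim.SGD(student.parameters(), lr=lr)
    trajectory = []
    target_model.eval()
    for _ in range(epochs):
        student.train()
        for x, _ in distill_loader:  # ground-truth labels unused: soft-label distillation
            with torch.no_grad():
                teacher_probs = F.softmax(target_model(x), dim=1)
            loss = F.kl_div(F.log_softmax(student(x), dim=1),
                            teacher_probs, reduction="batchmean")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Loss of the intermediate (partially distilled) model on the query sample.
        student.eval()
        with torch.no_grad():
            trajectory.append(F.cross_entropy(student(query_x), query_y).item())
    # Append the target model's own loss to complete the feature vector.
    with torch.no_grad():
        trajectory.append(F.cross_entropy(target_model(query_x), query_y).item())
    return torch.tensor(trajectory)
```

The resulting vector, one loss per distillation epoch plus the target model's own loss, would then serve as the feature input to a binary membership classifier.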