Abstract
Membership Inference Attacks (MIAs) on pre-trained Large Language Models
(LLMs) aim to determine whether a data point was part of the model's training
set. Prior MIAs designed for classification models fail on LLMs because they
ignore the generative nature of LLMs across token sequences. In this paper,
we present a novel attack on pre-trained LLMs that adapts MIA statistical tests
to the perplexity dynamics of subsequences within a data point. Our method
significantly outperforms prior approaches, revealing context-dependent
memorization patterns in pre-trained LLMs.
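To make the idea concrete, the general family of attacks the abstract describes (scoring a candidate by the perplexity of its subsequences rather than of the whole sequence) can be sketched as follows. This is an illustrative simplification, not the paper's actual statistical test: the function names, the sliding-window choice, and the min-window decision statistic are all assumptions introduced here for exposition.

```python
import numpy as np

def window_perplexities(token_logprobs, window=8):
    """Perplexity of each contiguous subsequence (sliding window) of a
    data point, given the per-token log-probabilities assigned by the LLM."""
    lp = np.asarray(token_logprobs, dtype=float)
    n = len(lp)
    return np.array([np.exp(-lp[i:i + window].mean())
                     for i in range(n - window + 1)])

def mia_score(token_logprobs, window=8):
    """Hypothetical membership score: a very low perplexity on some
    subsequence suggests memorization, so we score by the (negated)
    minimum window perplexity -- higher score means 'more member-like'."""
    ppls = window_perplexities(token_logprobs, window)
    return -ppls.min()
```

A member-like sequence (tokens predicted with high confidence, i.e. log-probabilities near zero) would then receive a higher score than a non-member-like one; a real attack would calibrate a threshold on such scores.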