Abstract
Membership Inference Attacks (MIAs) on pre-trained Large Language Models
(LLMs) aim to determine whether a data point was part of the model's training
set. Prior MIAs designed for classification models fail on LLMs because they
ignore the generative nature of LLMs across token sequences. In this paper,
we present a novel attack on pre-trained LLMs that adapts MIA statistical tests
to the perplexity dynamics of subsequences within a data point. Our method
significantly outperforms prior approaches, revealing context-dependent
memorization patterns in pre-trained LLMs.
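To make the idea concrete, the general family of attacks the abstract describes (scoring a candidate by the perplexity of its subsequences rather than of the whole sequence) can be sketched as follows. This is an illustrative simplification, not the paper's actual statistical test: the function names, the sliding-window choice, and the min-window decision statistic are all assumptions introduced here for exposition.

```python
import numpy as np

def window_perplexities(token_logprobs, window=8):
    """Perplexity of each contiguous subsequence (sliding window) of a
    data point, given the per-token log-probabilities assigned by the LLM."""
    lp = np.asarray(token_logprobs, dtype=float)
    n = len(lp)
    return np.array([np.exp(-lp[i:i + window].mean())
                     for i in range(n - window + 1)])

def mia_score(token_logprobs, window=8):
    """Hypothetical membership score: a very low perplexity on some
    subsequence suggests memorization, so we score by the (negated)
    minimum window perplexity -- higher score means 'more member-like'."""
    ppls = window_perplexities(token_logprobs, window)
    return -ppls.min()
```

A member-like sequence (tokens predicted with high confidence, i.e. log-probabilities near zero) would then receive a higher score than a non-member-like one; a real attack would calibrate a threshold on such scores.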