Context-Aware Membership Inference Attacks against Pre-trained Large Language Models

Authors: Hongyan Chang, Ali Shahin Shamsabadi, Kleomenis Katevas, Hamed Haddadi, Reza Shokri | Published: 2024-09-11

2024.09.112025.04.03

Authors: Hongyan Chang, Ali Shahin Shamsabadi, Kleomenis Katevas, Hamed Haddadi, Reza Shokri
Published: 2024-09-11

Source: https://arxiv.org/abs/2409.13745

PDF: https://arxiv.org/pdf/2409.13745

AIにより推定されたラベル

メンバーシップ推論攻撃手法 LLMセキュリティ

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

Prior Membership Inference Attacks (MIAs) on pre-trained Large Language Models (LLMs), adapted from classification model attacks, fail due to ignoring the generative process of LLMs across token sequences. In this paper, we present a novel attack that adapts MIA statistical tests to the perplexity dynamics of subsequences within a data point. Our method significantly outperforms prior loss-based approaches, revealing context-dependent memorization patterns in pre-trained LLMs.