Abstract
Significant advancements have recently been made in large language models
represented by GPT models. Users frequently have multi-round private
conversations with cloud-hosted GPT models for task optimization. Yet, this
operational paradigm introduces additional attack surfaces, particularly in
custom GPTs and hijacked chat sessions. In this paper, we introduce a
straightforward yet potent Conversation Reconstruction Attack. This attack
targets the contents of previous conversations between GPT models and benign
users, i.e., the benign users' input contents during their interaction with GPT
models. The adversary can induce GPT models to leak such contents by querying
them with carefully designed malicious prompts. Our comprehensive examination of
privacy risks during interactions with GPT models under this attack reveals GPT-4's
considerable resilience. We present two advanced attacks targeting improved
reconstruction of past conversations, demonstrating significant privacy leakage
across all models under these advanced techniques. Evaluating various defense
mechanisms, we find them ineffective against these attacks. Our findings
highlight the ease with which privacy can be compromised in interactions with
GPT models, urging the community to safeguard against potential abuses of these
models' capabilities.