These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Numerous studies have highlighted the privacy risks associated with
pretrained large language models. In contrast, our research offers a unique
perspective by demonstrating that pretrained large language models can
effectively contribute to privacy preservation. We propose a locally
differentially private mechanism called DP-Prompt, which leverages the power of
pretrained large language models and zero-shot prompting to counter author
de-anonymization attacks while minimizing the impact on downstream utility.
When DP-Prompt is used with a powerful language model like ChatGPT (gpt-3.5),
we observe a notable reduction in the success rate of de-anonymization attacks,
showing that it surpasses existing approaches by a considerable margin despite
its simpler design. For instance, in the case of the IMDB dataset, DP-Prompt
(with ChatGPT) perfectly recovers the clean sentiment F1 score while achieving
a 46\% reduction in author identification F1 score against static attackers and
a 26\% reduction against adaptive attackers. We conduct extensive experiments
across six open-source large language models, ranging up to 7 billion
parameters, to analyze various effects of the privacy-utility tradeoff.