These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Fine-tuning large language models (LLMs) has become an essential strategy for
adapting them to specialized tasks; however, this process introduces
significant privacy challenges, as sensitive training data may be inadvertently
memorized and exposed. Although differential privacy (DP) offers strong
theoretical guarantees against such leakage, its empirical privacy
effectiveness on LLMs remains unclear, especially under different fine-tuning
methods. In this paper, we systematically investigate the impact of DP across
fine-tuning methods and privacy budgets, using both data extraction and
membership inference attacks to assess empirical privacy risks. Our main
findings are as follows: (1) Differential privacy reduces model utility, but
its impact varies significantly across different fine-tuning methods. (2)
Without DP, the privacy risks of models fine-tuned with different approaches
differ considerably. (3) When DP is applied, even a relatively high privacy
budget can substantially lower privacy risk. (4) The privacy-utility trade-off
under DP training differs greatly among fine-tuning methods, with some methods
being unsuitable for DP due to severe utility degradation. Our results provide
practical guidance for privacy-conscious deployment of LLMs and pave the way
for future research on optimizing the privacy-utility trade-off in fine-tuning
methodologies.