Locally Differentially Private In-Context Learning
Abstract
Large pretrained language models (LLMs) have shown surprising in-context learning (ICL) ability. An important application in deploying LLMs is to augment them with a private database for a specific task. The main problem with this promising commercial use is that LLMs have been shown to memorize their training data, and their prompt data are vulnerable to membership inference attacks (MIA) and prompt leaking attacks. To address this problem, we treat LLMs as untrusted with respect to privacy and propose a locally differentially private framework for in-context learning (LDP-ICL) in settings where labels are sensitive. Considering the mechanism by which Transformers perform in-context learning via gradient descent, we provide an analysis of the trade-off between privacy and utility in LDP-ICL for classification. Moreover, we apply LDP-ICL to the discrete distribution estimation problem. Finally, we perform several experiments to validate our analysis.
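The label-privacy setting described above can be illustrated with a minimal sketch. Under the assumption that sensitive demonstration labels are perturbed locally before they ever enter the (untrusted) LLM prompt, a k-ary randomized response mechanism (in the spirit of Warner, 1965, cited below) satisfies ε-local differential privacy; a matching debiased estimator recovers label frequencies for the distribution estimation task. Function names here are illustrative, not the paper's own.

```python
import math
import random

def randomized_response(label: int, epsilon: float, num_classes: int = 2) -> int:
    """Perturb a sensitive label with k-ary randomized response.

    The true label is kept with probability e^eps / (e^eps + k - 1),
    otherwise replaced by a uniformly random other label; this
    satisfies epsilon-local differential privacy.
    """
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)
    if random.random() < p_keep:
        return label
    others = [c for c in range(num_classes) if c != label]
    return random.choice(others)

def privatize_demonstrations(examples, epsilon):
    """Perturb demonstration labels locally, before prompt construction."""
    return [(x, randomized_response(y, epsilon)) for x, y in examples]

def estimate_frequencies(perturbed_labels, epsilon, num_classes: int = 2):
    """Debias empirical frequencies of perturbed labels (Warner-style).

    E[observed_c] = q + (p - q) * f_c, so f_c = (observed_c - q) / (p - q).
    """
    n = len(perturbed_labels)
    p = math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)
    q = (1 - p) / (num_classes - 1)
    return [
        ((sum(1 for y in perturbed_labels if y == c) / n) - q) / (p - q)
        for c in range(num_classes)
    ]
```

Larger ε keeps labels with higher probability (better utility, weaker privacy); the debiasing step corrects for the known flip probabilities when estimating the label distribution.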
Emergent and predictable memorization in large language models
Stella Biderman, USVSN Sai Prashanth, Lintang Sutawika, Hailey Schoelkopf, Quentin Anthony, Shivanshu Purohit, Edward Raff
Published: 2023
Language models are few-shot learners
T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei
Published: 2020
Sanitizing sentence embeddings (and labels) for local differential privacy
Minxin Du, Xiang Yue, Sherman SM Chow, Huan Sun
Published: 2023
Local privacy and statistical minimax rates
John C Duchi, Michael I Jordan, Martin J Wainwright
Published: 2013
Calibrating noise to sensitivity in private data analysis
Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith
Published: 2006
Deep learning with label differential privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi, Chiyuan Zhang
Published: 2021
The dual form of neural networks revisited: Connecting test time predictions to training patterns via spotlights of attention
Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber
Published: 2022
Discrete distribution estimation under local privacy
Peter Kairouz, Keith Bonawitz, Daniel Ramage
Published: 2016
What can we learn privately?
Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, Adam Smith
Published: 2011
How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN
R. T. McCoy, P. Smolensky, T. Linzen, J. Gao, A. Celikyilmaz
Published: 2023
Rethinking the role of demonstrations: What makes in-context learning work?
Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
Published: 2022
Samsung fab data leak: How chatgpt exposed sensitive information
Robin Mitchell
Published: 2023
ETHOS: a multi-label hate speech detection dataset
Ioannis Mollas, Zoe Chrysopoulou, Stamatis Karlos, Grigorios Tsoumakas
Published: 2022
A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts
Bo Pang, Lillian Lee
Published: 2004
Learning to retrieve prompts for in-context learning
Ohad Rubin, Jonathan Herzig, Jonathan Berant
Published: 2022
Recursive deep models for semantic compositionality over a sentiment treebank
Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, Christopher Potts
Published: 2013
Transformers learn in-context by gradient descent
Johannes Von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov
Published: 2023
Randomized response: a survey technique for eliminating evasive answer bias
Stanley L. Warner
Published: 1965
Chain-of-thought prompting elicits reasoning in large language models
J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, D. Zhou
Published: 2023
Larger language models do in-context learning differently
Jerry Wei, Jason Wei, Yi Tay, Dustin Tran, Albert Webson, Yifeng Lu, Xinyun Chen, Hanxiao Liu, Da Huang, Denny Zhou, Tengyu Ma
Published: 2023
Active example selection for in-context learning
Yiming Zhang, Shi Feng, Chenhao Tan
Published: 2022
Calibrate before use: Improving few-shot performance of language models
Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
Published: 2021
HyperTransformer: Model generation for supervised and semi-supervised few-shot learning
Andrey Zhmoginov, Mark Sandler, Maksym Vladymyrov
Published: 2022