DePrompt: Desensitization and Evaluation of Personal Identifiable Information in Large Language Model Prompts
Abstract
Prompts serve as a crucial link in interacting with large language models (LLMs) and strongly affect the accuracy and interpretability of model outputs. However, obtaining accurate, high-quality responses requires precise prompts, which inevitably carry a significant risk of leaking personally identifiable information (PII). This paper therefore proposes DePrompt, a desensitization protection and effectiveness evaluation framework for prompts that enables users to use LLMs safely and transparently. Specifically, using large model fine-tuning techniques as the underlying privacy protection method, we integrate contextual attributes to define privacy types, achieving high-precision PII entity identification. In addition, by analyzing the key features of prompt desensitization scenarios, we devise adversarial generative desensitization methods that retain important semantic content while breaking the link between identifiers and privacy attributes. Furthermore, we present utility evaluation metrics for prompts to better gauge and balance privacy and usability. Our framework is adaptable to prompts and can be extended to other scenarios that depend on text usability. In comparisons with benchmarks and other model-based methods, experimental evaluations demonstrate that our desensitized prompts exhibit superior privacy-protection utility and model inference results.
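The identify, desensitize, and evaluate flow described in the abstract can be illustrated with a deliberately simplified sketch. It is not the DePrompt method: the paper relies on a fine-tuned large model for PII identification and on adversarial generative desensitization, whereas the sketch below substitutes regular expressions and typed placeholders, and uses a plain token-retention ratio as a stand-in for the paper's utility metrics. The names `PII_PATTERNS`, `desensitize`, and `token_retention` are hypothetical and chosen only for this illustration.

```python
import re

# Hypothetical patterns for illustration only; the paper identifies PII with a
# fine-tuned large model, not with regular expressions.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def desensitize(prompt):
    """Replace each matched PII span with a typed placeholder.

    Returns the masked prompt and a list of (label, original_text) pairs.
    """
    found = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(prompt):
            found.append((label, match.group()))
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt, found


def token_retention(original, masked):
    """Fraction of the original whitespace-separated tokens still present.

    A crude stand-in for a utility score (an assumption of this sketch, not
    one of the paper's metrics): 1.0 means nothing was changed.
    """
    original_tokens = original.split()
    if not original_tokens:
        return 1.0
    masked_tokens = set(masked.split())
    kept = sum(1 for token in original_tokens if token in masked_tokens)
    return kept / len(original_tokens)


if __name__ == "__main__":
    text = "Email alice@example.com or call 555-123-4567 about my refund."
    masked, entities = desensitize(text)
    print(masked)
    print(entities)
    print(token_retention(text, masked))
```

Even this toy version shows the trade-off the abstract refers to: every masked span lowers the retention score, so a privacy gain has to be weighed against how much of the prompt's usable content survives.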