DePrompt: Desensitization and Evaluation of Personal Identifiable Information in Large Language Model Prompts

OpenAI Technical Report

Language models are few-shot learners

T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei

Published: 2020

Lecture Notes in Computer Science

Sensitive information detection adopting named entity recognition: A proposed methodology

Lelio Campanile, Maria Stella de Biase, Stefano Marrone, Fiammetta Marulli, Mariapia Raimondo, Laura Verde

Published: 2022

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Enhancing chat language models by scaling high-quality instructional conversations

Ding, N., Chen, Y., Xu, B., Qin, Y., Hu, S., Liu, Z., Sun, M., Zhou, B.

Published: 2023

Lecture Notes in Computer Science

Differential privacy

Cynthia Dwork

Published: 2006

IEEE Computer Society

Models and methods for privacy-preserving data analysis and publishing

Johannes Gehrke

Published: 2006

Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali

How to play any mental game, or a completeness theorem for protocols with honest majority

O. Goldreich, S. Micali, A. Wigderson

Published: 2019

Cryptology ePrint Archive

Optimizations of side-channel attack on AES MixColumns using chosen input

A. Vasselle, A. Wurcker

Published: 2019

How close is ChatGPT to human experts? Comparison corpus, evaluation, and detection

B. Guo, X. Zhang, Z. Wang, M. Jiang, J. Nie, Y. Ding, J. Yue, Y. Wu

Published: 2023

arXiv

DP-OPT: Make large language model your privacy-preserving prompt engineer

J. Hong, J. T. Wang, C. Zhang, Z. Li, B. Li, Z. Wang

Published: 2023

arXiv

Cited by: 3

Conference on Neural Information Processing Systems (NeurIPS)

ProPILE: Probing Privacy Leakage in Large Language Models

Siwon Kim, Sangdoo Yun, Hwaran Lee, Martin Gubri, Sungroh Yoon, Seong Joon Oh

Published: 2023-07-05

The rapid advancement and widespread use of large language models (LLMs) have raised significant concerns regarding the potential leakage of personally identifiable information (PII). These models are often trained on vast quantities of web-collected data, which may inadvertently include sensitive personal data. This paper presents ProPILE, a novel probing tool designed to empower data subjects, or the owners of the PII, with awareness of potential PII leakage in LLM-based services. ProPILE lets data subjects formulate prompts based on their own PII to evaluate the level of privacy intrusion in LLMs. We demonstrate its application on the OPT-1.3B model trained on the publicly available Pile dataset. We show how hypothetical data subjects may assess the likelihood of their PII being included in the Pile dataset being revealed. ProPILE can also be leveraged by LLM service providers to effectively evaluate their own levels of PII leakage with more powerful prompts specifically tuned for their in-house models. This tool represents a pioneering step towards empowering the data subjects for their awareness and control over their own data on the web.

Keywords: data leakage, privacy violation, prompting strategies

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

The power of scale for parameter-efficient prompt tuning

Brian Lester, Rami Al-Rfou, Noah Constant

Published: 2021

arXiv

Cited by: 1

Multi-step Jailbreaking Privacy Attacks on ChatGPT

Haoran Li, Dadi Guo, Wei Fan, Mingshi Xu, Jie Huang, Fanpu Meng, Yangqiu Song

Published: 2023-04-11

With the rapid progress of large language models (LLMs), many downstream NLP tasks can be well solved given appropriate prompts. Though model developers and researchers work hard on dialog safety to avoid generating harmful content from LLMs, it is still challenging to steer AI-generated content (AIGC) for the human good. As powerful LLMs are devouring existing text data from various domains (e.g., GPT-3 is trained on 45TB texts), it is natural to doubt whether the private information is included in the training data and what privacy threats can these LLMs and their downstream applications bring. In this paper, we study the privacy threats from OpenAI's ChatGPT and the New Bing enhanced by ChatGPT and show that application-integrated LLMs may cause new privacy threats. To this end, we conduct extensive experiments to support our claims and discuss LLMs' privacy implications.

Keywords: prompt injection, privacy analysis, LLM security

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Anonymisation Models for Text Data: State of the art, Challenges and Future Directions

Pierre Lison, Ildikó Pilán, David Sanchez, Montserrat Batet, Lilja Øvrelid

Published: 2021

Lecture Notes in Computer Science

Named entity recognition in clinical text based on Capsule-LSTM for privacy protection

Changjian Liu, Jiaming Li, Yuhan Liu, Jiachen Du, Buzhou Tang, Ruifeng Xu

Published: 2019

International Journal of Intelligent Systems

An automatic privacy-aware framework for text data in online social network based on a multi-deep learning model

Gan Liu, Xiongtao Sun, Yiran Li, Hui Li, Shuchang Zhao, Zhen Guo

Published: 2023

arXiv

Cited by: 12

Prompt Injection attack against LLM-integrated Applications

Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, Zihao Wang, Xiaofeng Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, Yang Liu

Published: 2023-06-09

Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, their extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities and implications of prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis on ten commercial applications, highlighting the constraints of current attack strategies in practice. Prompted by these limitations, we subsequently formulate HouYi, a novel black-box prompt injection attack technique, which draws inspiration from traditional web injection attacks. HouYi is compartmentalized into three crucial elements: a seamlessly-incorporated pre-constructed prompt, an injection prompt inducing context partition, and a malicious payload designed to fulfill the attack objectives. Leveraging HouYi, we unveil previously unknown and severe attack outcomes, such as unrestricted arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection. 10 vendors have validated our discoveries, including Notion, which has the potential to impact millions of users. Our investigation illuminates both the possible risks of prompt injection attacks and the possible tactics for mitigation.

Keywords: prompt injection, malicious prompts

Jailbreaking ChatGPT via prompt engineering: An empirical study

Yi Liu, Gelei Deng, Zhengzi Xu, Yuekang Li, Yaowen Zheng, Ying Zhang, Lida Zhao, Tianwei Zhang, Kailong Wang, Yang Liu

Published: 2023

DeID-GPT: Zero-shot medical text de-identification by GPT-4

Zhengliang Liu, Xiaowei Yu, Lu Zhang, Zihao Wu, Chao Cao, Haixing Dai, Lin Zhao, Wei Liu, Dinggang Shen, Quanzheng Li

Published: 2023

2023 IEEE Symposium on Security and Privacy (SP)

Analyzing leakage of personally identifiable information in language models

Nils Lukas, Ahmed Salem, Robert Sim, Shruti Tople, Lukas Wutschitz, Santiago Zanella-Béguelin

Published: 2023

IEEE

Automated anonymization of text documents

Nuno Mamede, Jorge Baptista, Francisco Dias

Published: 2016

Springer

Automatic evaluation of disclosure risks of text anonymization methods

Benet Manzanares-Salor, David Sánchez, Pierre Lison

Published: 2022

J. Biomed. Informatics

Text de-identification for privacy protection: A study of its impact on clinical text information content

Stéphane M. Meystre, Óscar Ferrández, F. Jeffrey Friedlin, Brett R. South, Shuying Shen, Matthew H. Samore

Published: 2014

ACL

TextRank: Bringing order into text

Rada Mihalcea, Paul Tarau

Published: 2004

Inf. Sci.

Data anonymization evaluation for big data and IoT environment

Chunchun Ni, Li Shan Cang, Prosanta Gope, Geyong Min

Published: 2022

Journal of Intelligent Information Systems

BERT-LSTM model for sarcasm detection in code-mixed social media post

Rajnish Pandey, Jyoti Prakash Singh

Published: 2023

Instruction tuning with GPT-4

Peng, B., Li, C., He, P., Galley, M., Gao, J.

Published: 2023

Comput. Linguistics

The text anonymization benchmark (TAB): A dedicated corpus and evaluation framework for text anonymization

Ildikó Pilán, Pierre Lison, Lilja Øvrelid, Anthi Papadopoulou, David Sánchez, Montserrat Batet

Published: 2022

Association for Computational Linguistics

GrIPS: Gradient-free, edit-based instruction search for prompting large language models

Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal

Published: 2023

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing

Sentence-BERT: Sentence embeddings using Siamese BERT-networks

Nils Reimers, Iryna Gurevych

Published: 2019

IEEE Transactions on Knowledge and Data Engineering

Protecting respondents' identities in microdata release

P. Samarati

Published: 2001

Swype.com dataset

Srikanth Srinivas

Published: 2023

Attention is all you need

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., Polosukhin, I.

Published: 2017

IEEE Computational Intelligence Magazine

Recent trends in deep learning based natural language processing

Tom Young, Devamanyu Hazarika, Soujanya Poria, Erik Cambria

Published: 2018

Privacy-preserving instructions for aligning large language models

Da Yu, Peter Kairouz, Sewoong Oh, Zheng Xu

Published: 2024