“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models | Authors: Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang | Published: 2023-08-07 | Updated: 2024-05-15 | Tags: LLM Security, Character Role Acting, Prompt Injection | Literature Database
Mondrian: Prompt Abstraction Attack Against Large Language Models for Cheaper API Pricing | Authors: Wai Man Si, Michael Backes, Yang Zhang | Published: 2023-08-07 | Tags: Watermarking, Prompt Injection, Challenges of Generative Models
PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification | Authors: Hongwei Yao, Jian Lou, Kui Ren, Zhan Qin | Published: 2023-08-05 | Updated: 2023-11-28 | Tags: Soft Prompt Optimization, Prompt Injection, Watermark Robustness
Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection | Authors: Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin | Published: 2023-07-31 | Updated: 2024-04-03 | Tags: LLM Security, System Prompt Generation, Prompt Injection
Universal and Transferable Adversarial Attacks on Aligned Language Models | Authors: Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, Matt Fredrikson | Published: 2023-07-27 | Updated: 2023-12-20 | Tags: LLM Security, Prompt Injection, Inappropriate Content Generation
Backdoor Attacks for In-Context Learning with Language Models | Authors: Nikhil Kandpal, Matthew Jagielski, Florian Tramèr, Nicholas Carlini | Published: 2023-07-27 | Tags: LLM Security, Backdoor Attack, Prompt Injection
Unveiling Security, Privacy, and Ethical Concerns of ChatGPT | Authors: Xiaodong Wu, Ran Duan, Jianbing Ni | Published: 2023-07-26 | Tags: LLM Security, Prompt Injection, Inappropriate Content Generation
Getting pwn’d by AI: Penetration Testing with Large Language Models | Authors: Andreas Happe, Jürgen Cito | Published: 2023-07-24 | Updated: 2023-08-17 | Tags: LLM Security, Prompt Injection, Penetration Testing Methods
The Looming Threat of Fake and LLM-generated LinkedIn Profiles: Challenges and Opportunities for Detection and Prevention | Authors: Navid Ayoobi, Sadat Shahriar, Arjun Mukherjee | Published: 2023-07-21 | Tags: Data Generation, Prompt Injection, Analysis of Detection Methods
An LLM Assisted Exploitation of AI-Guardian | Authors: Nicholas Carlini | Published: 2023-07-20 | Tags: Prompt Injection, Membership Inference, Watermark Robustness