Backdoor Attacks for In-Context Learning with Language Models | Authors: Nikhil Kandpal, Matthew Jagielski, Florian Tramèr, Nicholas Carlini | Published: 2023-07-27 | Tags: LLM Security, Backdoor Attack, Prompt Injection
Unveiling Security, Privacy, and Ethical Concerns of ChatGPT | Authors: Xiaodong Wu, Ran Duan, Jianbing Ni | Published: 2023-07-26 | Tags: LLM Security, Prompt Injection, Inappropriate Content Generation
Getting pwn’d by AI: Penetration Testing with Large Language Models | Authors: Andreas Happe, Jürgen Cito | Published: 2023-07-24 | Updated: 2023-08-17 | Tags: LLM Security, Prompt Injection, Penetration Testing Methods
The Looming Threat of Fake and LLM-generated LinkedIn Profiles: Challenges and Opportunities for Detection and Prevention | Authors: Navid Ayoobi, Sadat Shahriar, Arjun Mukherjee | Published: 2023-07-21 | Tags: Data Generation, Prompt Injection, Analysis of Detection Methods
A LLM Assisted Exploitation of AI-Guardian | Authors: Nicholas Carlini | Published: 2023-07-20 | Tags: Prompt Injection, Membership Inference, Watermark Robustness
MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots | Authors: Gelei Deng, Yi Liu, Yuekang Li, Kailong Wang, Ying Zhang, Zefeng Li, Haoyu Wang, Tianwei Zhang, Yang Liu | Published: 2023-07-16 | Updated: 2023-10-25 | Tags: Data Leakage, Prompt Injection, Watermark Robustness
Time for aCTIon: Automated Analysis of Cyber Threat Intelligence in the Wild | Authors: Giuseppe Siracusano, Davide Sanvito, Roberto Gonzalez, Manikantan Srinivasan, Sivakaman Kamatchi, Wataru Takahashi, Masaru Kawakita, Takahiro Kakumaru, Roberto Bifulco | Published: 2023-07-14 | Tags: Dataset Generation, Prompt Injection, Attack Pattern Extraction
Understanding Multi-Turn Toxic Behaviors in Open-Domain Chatbots | Authors: Bocheng Chen, Guangjing Wang, Hanqing Guo, Yuanda Wang, Qiben Yan | Published: 2023-07-14 | Tags: Prompt Injection, Dialogue System, Attack Evaluation
Effective Prompt Extraction from Language Models | Authors: Yiming Zhang, Nicholas Carlini, Daphne Ippolito | Published: 2023-07-13 | Updated: 2024-08-07 | Tags: Prompt Injection, Prompt Leaking, Dialogue System
Jailbroken: How Does LLM Safety Training Fail? | Authors: Alexander Wei, Nika Haghtalab, Jacob Steinhardt | Published: 2023-07-05 | Tags: Security Assurance, Prompt Injection, Adversarial Attack Methods