Prompt Injection

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

Authors: Anselm Paulus, Arman Zharmagambetov, Chuan Guo, Brandon Amos, Yuandong Tian | Published: 2024-04-21
LLM Security
Prompt Injection
Prompt Engineering

CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

Authors: Manish Bhatt, Sahana Chennabasappa, Yue Li, Cyrus Nikolaidis, Daniel Song, Shengye Wan, Faizan Ahmad, Cornelius Aschermann, Yaohui Chen, Dhaval Kapil, David Molnar, Spencer Whitman, Joshua Saxe | Published: 2024-04-19
LLM Security
Cybersecurity
Prompt Injection

JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models

Authors: Yingchaojie Feng, Zhizhang Chen, Zhining Kang, Sijia Wang, Minfeng Zhu, Wei Zhang, Wei Chen | Published: 2024-04-12
LLM Performance Evaluation
Prompt Injection
Evaluation Method

Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward

Authors: Xuan Xie, Jiayang Song, Zhehua Zhou, Yuheng Huang, Da Song, Lei Ma | Published: 2024-04-12
LLM Security
LLM Performance Evaluation
Prompt Injection

Subtoxic Questions: Dive Into Attitude Change of LLM’s Response in Jailbreak Attempts

Authors: Tianyu Zhang, Zixuan Zhao, Jiaqi Huang, Jingyu Hua, Sheng Zhong | Published: 2024-04-12
LLM Security
Prompt Injection
Prompt Engineering

Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs

Authors: Bibek Upadhayay, Vahid Behzadan | Published: 2024-04-09
LLM Security
Prompt Injection
Attack Method

Rethinking How to Evaluate Language Model Jailbreak

Authors: Hongyu Cai, Arjun Arunasalam, Leo Y. Lin, Antonio Bianchi, Z. Berkay Celik | Published: 2024-04-09 | Updated: 2024-05-07
Prompt Injection
Classification of Malicious Actors
Evaluation Method

Unbridled Icarus: A Survey of the Potential Perils of Image Inputs in Multimodal Large Language Model Security

Authors: Yihe Fan, Yuxin Cao, Ziyu Zhao, Ziyao Liu, Shaofeng Li | Published: 2024-04-08 | Updated: 2024-08-11
LLM Security
Prompt Injection
Threat Modeling

Initial Exploration of Zero-Shot Privacy Utility Tradeoffs in Tabular Data Using GPT-4

Authors: Bishwas Mandal, George Amariucai, Shuangqing Wei | Published: 2024-04-07
Data Privacy Assessment
Privacy Protection Method
Prompt Injection

Fine-Tuning, Quantization, and LLMs: Navigating Unintended Outcomes

Authors: Divyanshu Kumar, Anurakt Kumar, Sahil Agarwal, Prashanth Harshangi | Published: 2024-04-05 | Updated: 2024-09-09
LLM Security
Prompt Injection
Safety Alignment