Prompt Injection

h4rm3l: A language for Composable Jailbreak Attack Synthesis

Authors: Moussa Koulako Bala Doumbouya, Ananjan Nandi, Gabriel Poesia, Davide Ghilardi, Anna Goldie, Federico Bianchi, Dan Jurafsky, Christopher D. Manning | Published: 2024-08-09 | Updated: 2025-03-25
Watermarking
Prompt Injection
Prompt Engineering

Towards Explainable Network Intrusion Detection using Large Language Models

Authors: Paul R. B. Houssel, Priyanka Singh, Siamak Layeghy, Marius Portmann | Published: 2024-08-08
LLM Performance Evaluation
Network Threat Detection
Prompt Injection

EnJa: Ensemble Jailbreak on Large Language Models

Authors: Jiahao Zhang, Zilong Wang, Ruofan Wang, Xingjun Ma, Yu-Gang Jiang | Published: 2024-08-07
Prompt Injection
Attack Method
Evaluation Method

Compromising Embodied Agents with Contextual Backdoor Attacks

Authors: Aishan Liu, Yuguang Zhou, Xianglong Liu, Tianyuan Zhang, Siyuan Liang, Jiakai Wang, Yanjun Pu, Tianlin Li, Junqi Zhang, Wenbo Zhou, Qing Guo, Dacheng Tao | Published: 2024-08-06
Backdoor Attack
Prompt Injection

Hide and Seek: Fingerprinting Large Language Models with Evolutionary Learning

Authors: Dmitri Iourovitski, Sanat Sharma, Rakshak Talwar | Published: 2024-08-06
LLM Performance Evaluation
Prompt Injection
Model Performance Evaluation

Can Reinforcement Learning Unlock the Hidden Dangers in Aligned Large Language Models?

Authors: Mohammad Bahrami Karkevandi, Nishant Vishwamitra, Peyman Najafirad | Published: 2024-08-05
Prompt Injection
Reinforcement Learning
Adversarial Example

Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models

Authors: Zi Liang, Haibo Hu, Qingqing Ye, Yaxin Xiao, Haoyang Li | Published: 2024-08-05 | Updated: 2025-02-12
Prompt Injection
Prompt Leaking
Model Evaluation

Automated Phishing Detection Using URLs and Webpages

Authors: Huilin Wang, Bryan Hooi | Published: 2024-08-03 | Updated: 2024-08-16
Phishing Detection
Brand Recognition Problem
Prompt Injection

MCGMark: An Encodable and Robust Online Watermark for Tracing LLM-Generated Malicious Code

Authors: Kaiwen Ning, Jiachi Chen, Qingyuan Zhong, Tao Zhang, Yanlin Wang, Wei Li, Jingwen Zhang, Jianxing Yu, Yuming Feng, Weizhe Zhang, Zibin Zheng | Published: 2024-08-02 | Updated: 2025-04-21
Code Generation
Prompt Injection
Watermark Robustness

Jailbreaking Text-to-Image Models with LLM-Based Agents

Authors: Yingkai Dong, Zheng Li, Xiangtao Meng, Ning Yu, Shanqing Guo | Published: 2024-08-01 | Updated: 2024-09-09
LLM Security
Prompt Injection
Model Performance Evaluation