Prompt Injection

Alignment Under Pressure: The Case for Informed Adversaries When Evaluating LLM Defenses

Authors: Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye | Published: 2025-05-21
Alignment
Prompt Injection
Defense Mechanism

sudoLLM : On Multi-role Alignment of Language Models

Authors: Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain | Published: 2025-05-20
Alignment
Prompt Injection
Large Language Model

Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs

Authors: Jiawen Wang, Pritha Gupta, Ivan Habernal, Eyke Hüllermeier | Published: 2025-05-20
LLM Security
Disabling Safety Mechanisms of LLM
Prompt Injection

Exploring Jailbreak Attacks on LLMs through Intent Concealment and Diversion

Authors: Tiehan Cui, Yanxu Mao, Peipei Liu, Congying Liu, Datao You | Published: 2025-05-20
LLM Security
Disabling Safety Mechanisms of LLM
Prompt Injection

PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks

Authors: Guobin Shen, Dongcheng Zhao, Linghao Feng, Xiang He, Jihang Wang, Sicheng Shen, Haibo Tong, Yiting Dong, Jindong Li, Xiang Zheng, Yi Zeng | Published: 2025-05-20 | Updated: 2025-05-22
Disabling Safety Mechanisms of LLM
Prompt Injection
Effectiveness Analysis of Defense Methods

Evaluating the efficacy of LLM Safety Solutions : The Palit Benchmark Dataset

Authors: Sayon Palit, Daniel Woods | Published: 2025-05-19 | Updated: 2025-05-20
LLM Security
Prompt Injection
Attack Method

MARVEL: Multi-Agent RTL Vulnerability Extraction using Large Language Models

Authors: Luca Collini, Baleegh Ahmad, Joey Ah-kiow, Ramesh Karri | Published: 2025-05-17 | Updated: 2025-06-09
Poisoning attack on RAG
Cyber Threat
Prompt Injection

JULI: Jailbreak Large Language Models by Self-Introspection

Authors: Jesson Wang, Zhanhao Hu, David Wagner | Published: 2025-05-17 | Updated: 2025-05-20
API Security
Disabling Safety Mechanisms of LLM
Prompt Injection

Dark LLMs: The Growing Threat of Unaligned AI Models

Authors: Michael Fire, Yitzhak Elbazis, Adi Wasenstein, Lior Rokach | Published: 2025-05-15
Disabling Safety Mechanisms of LLM
Prompt Injection
Large Language Model

Analysing Safety Risks in LLMs Fine-Tuned with Pseudo-Malicious Cyber Security Data

Authors: Adel ElZemity, Budi Arief, Shujun Li | Published: 2025-05-15
LLM Security
Prompt Injection
Large Language Model