Prompt Injection

DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection

Authors: Yuliang Yan, Haochun Tang, Shuo Yan, Enyan Dai | Published: 2025-05-22

Fingerprinting Method

Prompt Injection

Model Identification

2025.05.22 2025.05.28

Literature Database

Alignment Under Pressure: The Case for Informed Adversaries When Evaluating LLM Defenses

Authors: Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye | Published: 2025-05-21

Alignment

Prompt Injection

Defense Mechanism

2025.05.21 2025.05.28

sudoLLM: On Multi-role Alignment of Language Models

Authors: Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain | Published: 2025-05-20

Alignment

Prompt Injection

Large Language Model

2025.05.20 2025.05.28

Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs

Authors: Jiawen Wang, Pritha Gupta, Ivan Habernal, Eyke Hüllermeier | Published: 2025-05-20

LLM Security

Disabling Safety Mechanisms of LLM

Prompt Injection

2025.05.20 2025.05.28

Exploring Jailbreak Attacks on LLMs through Intent Concealment and Diversion

Authors: Tiehan Cui, Yanxu Mao, Peipei Liu, Congying Liu, Datao You | Published: 2025-05-20

LLM Security

Disabling Safety Mechanisms of LLM

Prompt Injection

2025.05.20 2025.05.28

PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks

Authors: Guobin Shen, Dongcheng Zhao, Linghao Feng, Xiang He, Jihang Wang, Sicheng Shen, Haibo Tong, Yiting Dong, Jindong Li, Xiang Zheng, Yi Zeng | Published: 2025-05-20 | Updated: 2025-05-22

Disabling Safety Mechanisms of LLM

Prompt Injection

Effectiveness Analysis of Defense Methods

2025.05.20 2025.05.28

Evaluating the efficacy of LLM Safety Solutions: The Palit Benchmark Dataset

Authors: Sayon Palit, Daniel Woods | Published: 2025-05-19 | Updated: 2025-05-20

LLM Security

Prompt Injection

Attack Method

2025.05.19 2025.05.28

Improving LLM Outputs Against Jailbreak Attacks with Expert Model Integration

Authors: Tatia Tsmindashvili, Ana Kolkhidashvili, Dachi Kurtskhalia, Nino Maghlakelidze, Elene Mekvabishvili, Guram Dentoshvili, Orkhan Shamilov, Zaal Gachechiladze, Steven Saporta, David Dachi Choladze | Published: 2025-05-18 | Updated: 2025-08-11

Prompt Injection

Large Language Model

Performance Evaluation Method

2025.05.18 2025.08.13

MARVEL: Multi-Agent RTL Vulnerability Extraction using Large Language Models

Authors: Luca Collini, Baleegh Ahmad, Joey Ah-kiow, Ramesh Karri | Published: 2025-05-17 | Updated: 2025-06-09

Poisoning attack on RAG

Cyber Threat

Prompt Injection

2025.05.17 2025.06.11

JULI: Jailbreak Large Language Models by Self-Introspection

Authors: Jesson Wang, Zhanhao Hu, David Wagner | Published: 2025-05-17 | Updated: 2025-05-20

API Security

Disabling Safety Mechanisms of LLM

Prompt Injection

2025.05.17 2025.05.28
