Prompt Injection

Safeguarding System Prompts for LLMs

Authors: Zhifeng Jiang, Zhihua Jin, Guoliang He | Published: 2024-12-18 | Updated: 2025-01-09
LLM Performance Evaluation
Prompt Injection
Defense Method

Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability Detection

Authors: Ira Ceka, Feitong Qiao, Anik Dey, Aastha Valecha, Gail Kaiser, Baishakhi Ray | Published: 2024-12-16 | Updated: 2025-01-18
LLM Performance Evaluation
Prompting Strategy
Prompt Injection

Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models

Authors: Ma Teng, Jia Xiaojun, Duan Ranjie, Li Xinfeng, Huang Yihao, Chu Zhixuan, Liu Yang, Ren Wenqi | Published: 2024-12-08 | Updated: 2025-01-03
Content Moderation
Prompt Injection
Attack Method

ChatNVD: Advancing Cybersecurity Vulnerability Assessment with Large Language Models

Authors: Shivansh Chopra, Hussain Ahmad, Diksha Goel, Claudia Szabo | Published: 2024-12-06 | Updated: 2025-05-20
Text Generation Method
Prompt Injection
Computational Efficiency

VLSBench: Unveiling Visual Leakage in Multimodal Safety

Authors: Xuhao Hu, Dongrui Liu, Hao Li, Xuanjing Huang, Jing Shao | Published: 2024-11-29 | Updated: 2025-01-17
Prompt Injection
Safety Alignment

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

Authors: Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh, Tianrui Guan, Mengdi Wang, Ahmad Beirami, Furong Huang, Alvaro Velasquez, Dinesh Manocha, Amrit Singh Bedi | Published: 2024-11-27 | Updated: 2025-03-20
Prompt Injection
Safety Alignment
Adversarial Attack

“Moralized” Multi-Step Jailbreak Prompts: Black-Box Testing of Guardrails in Large Language Models for Verbal Attacks

Authors: Libo Wang | Published: 2024-11-23 | Updated: 2025-03-20
Prompt Injection
Large Language Model

JailbreakLens: Interpreting Jailbreak Mechanism in the Lens of Representation and Circuit

Authors: Zeqing He, Zhibo Wang, Zhixuan Chu, Huiyu Xu, Wenhui Zhang, Qinglong Wang, Rui Zheng | Published: 2024-11-17 | Updated: 2025-04-24
Prompt Injection
Large Language Model

MRJ-Agent: An Effective Jailbreak Agent for Multi-Round Dialogue

Authors: Fengxiang Wang, Ranjie Duan, Peng Xiao, Xiaojun Jia, Shiji Zhao, Cheng Wei, YueFeng Chen, Chongwen Wang, Jialing Tao, Hang Su, Jun Zhu, Hui Xue | Published: 2024-11-06 | Updated: 2025-01-07
Prompt Injection
Multi-Round Dialogue

SQL Injection Jailbreak: A Structural Disaster of Large Language Models

Authors: Jiawei Zhao, Kejiang Chen, Weiming Zhang, Nenghai Yu | Published: 2024-11-03 | Updated: 2025-05-21
Prompt Injection
Prompt Leaking
Attack Type