Prompt Injection

GraphAttack: Exploiting Representational Blindspots in LLM Safety Mechanisms

Authors: Sinan He, An Wang | Published: 2025-04-17
Alignment
Prompt Injection
Vulnerability Research

The Digital Cybersecurity Expert: How Far Have We Come?

Authors: Dawei Wang, Geng Zhou, Xianglong Li, Yu Bai, Li Chen, Ting Qin, Jian Sun, Dan Li | Published: 2025-04-16
LLM Performance Evaluation
Poisoning Attack on RAG
Prompt Injection

Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails

Authors: William Hackett, Lewis Birch, Stefan Trawicki, Neeraj Suri, Peter Garraghan | Published: 2025-04-15 | Updated: 2025-04-16
LLM Performance Evaluation
Prompt Injection
Adversarial Attack Analysis

Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?

Authors: Yanbo Wang, Jiyang Guan, Jian Liang, Ran He | Published: 2025-04-14
Prompt Injection
Bias in Training Data
Safety Alignment

An Investigation of Large Language Models and Their Vulnerabilities in Spam Detection

Authors: Qiyao Tang, Xiangyang Li | Published: 2025-04-14
LLM Performance Evaluation
Prompt Injection
Model DoS

CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent

Authors: Liang-bo Ning, Shijie Wang, Wenqi Fan, Qing Li, Xin Xu, Hao Chen, Feiran Huang | Published: 2025-04-13 | Updated: 2025-04-24
Indirect Prompt Injection
Prompt Injection
Attacker Behavior Analysis

Sugar-Coated Poison: Benign Generation Unlocks LLM Jailbreaking

Authors: Yu-Hang Wu, Yu-Jie Xiong, Jie Zhang | Published: 2025-04-08
LLM Application
Prompt Injection
Large Language Model

Generative Large Language Model usage in Smart Contract Vulnerability Detection

Authors: Peter Ince, Jiangshan Yu, Joseph K. Liu, Xiaoning Du | Published: 2025-04-07
Prompt Injection
Prompt Leaking
Vulnerability Analysis

Representation Bending for Large Language Model Safety

Authors: Ashkan Yousefpour, Taeheon Kim, Ryan S. Kwon, Seungbeen Lee, Wonje Jeung, Seungju Han, Alvin Wan, Harrison Ngan, Youngjae Yu, Jonghyun Choi | Published: 2025-04-02
Prompt Injection
Prompt Leaking
Safety Alignment

LightDefense: A Lightweight Uncertainty-Driven Defense against Jailbreaks via Shifted Token Distribution

Authors: Zhuoran Yang, Jie Peng, Zhen Tan, Tianlong Chen, Yanyong Zhang | Published: 2025-04-02
Prompt Injection
Model Performance Evaluation
Uncertainty Measurement