Attack Method

ReCIT: Reconstructing Full Private Data from Gradient in Parameter-Efficient Fine-Tuning of Large Language Models

Authors: Jin Xie, Ruishi He, Songze Li, Xiaojun Jia, Shouling Ji | Published: 2025-04-29

Backdoor Detection

Privacy Violation

Attack Method

2025.04.29 2025.05.27

Literature Database

Token-Efficient Prompt Injection Attack: Provoking Cessation in LLM Reasoning via Adaptive Token Compression

Authors: Yu Cui, Yujun Cai, Yiwei Wang | Published: 2025-04-29

Token Compression Framework

Prompt Injection

Attack Method

2025.04.29 2025.05.27

Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction

Authors: Yulin Chen, Haoran Li, Yuan Sui, Yue Liu, Yufei He, Yangqiu Song, Bryan Hooi | Published: 2025-04-29

Indirect Prompt Injection

Prompt Validation

Attack Method

2025.04.29 2025.05.27

Enhancing Leakage Attacks on Searchable Symmetric Encryption Using LLM-Based Synthetic Data Generation

Authors: Joshua Chiu, Partha Protim Paul, Zahin Wahab | Published: 2025-04-29

Indirect Prompt Injection

Attack Method

Hierarchical Clustering

2025.04.29 2025.05.27

The Automation Advantage in AI Red Teaming

Authors: Rob Mulla, Ads Dawson, Vincent Abruzzon, Brian Greunke, Nick Landers, Brad Palm, Will Pearce | Published: 2025-04-28 | Updated: 2025-04-29

Prompt Leaking

Attack Method

Effects of Automation

2025.04.28 2025.05.27

BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts

Authors: Qingyue Wang, Qi Pang, Xixun Lin, Shuai Wang, Daoyuan Wu | Published: 2025-04-24 | Updated: 2025-04-29

Poisoning Attack on RAG

Backdoor Attack Techniques

Attack Method

2025.04.24 2025.05.27

NVBleed: Covert and Side-Channel Attacks on NVIDIA Multi-GPU Interconnect

Authors: Yicheng Zhang, Ravan Nazaraliyev, Sankha Baran Dutta, Andres Marquez, Kevin Barker, Nael Abu-Ghazaleh | Published: 2025-03-22

Cloud Computing

Side-Channel Attack

Attack Method

2025.03.22 2025.05.27

Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings

Authors: Zonghao Ying, Guangyi Zheng, Yongxin Huang, Deyue Zhang, Wenxin Zhang, Quanchen Zou, Aishan Liu, Xianglong Liu, Dacheng Tao | Published: 2025-03-19

Prompt Injection

Large Language Model

Attack Method

2025.03.19 2025.05.27

Temporal Context Awareness: A Defense Framework Against Multi-turn Manipulation Attacks on Large Language Models

Authors: Prashant Kulkarni, Assaf Namer | Published: 2025-03-18

Prompt Injection

Prompt Leaking

Attack Method

2025.03.18 2025.05.27

Personalized Attacks of Social Engineering in Multi-turn Conversations — LLM Agents for Simulation and Detection

Authors: Tharindu Kumarage, Cameron Johnson, Jadie Adams, Lin Ai, Matthias Kirchner, Anthony Hoogs, Joshua Garland, Julia Hirschberg, Arslan Basharat, Huan Liu | Published: 2025-03-18

Alignment

Social Engineering Attack

Attack Method

2025.03.18 2025.05.27
