Large Language Model

Amplified Vulnerabilities: Structured Jailbreak Attacks on LLM-based Multi-Agent Debate

Authors: Senmao Qi, Yifei Zou, Peng Li, Ziyi Lin, Xiuzhen Cheng, Dongxiao Yu | Published: 2025-04-23

Indirect Prompt Injection

Multi-Round Dialogue

Large Language Model

2025.04.23 2025.05.27

Literature Database

Exploring the Role of Large Language Models in Cybersecurity: A Systematic Survey

Authors: Shuang Tian, Tao Zhang, Jiqiang Liu, Jiacheng Wang, Xuangou Wu, Xiaoqiang Zhu, Ruichen Zhang, Weiting Zhang, Zhenhui Yuan, Shiwen Mao, Dong In Kim | Published: 2025-04-22 | Updated: 2025-04-28

Indirect Prompt Injection

Prompt Injection

Large Language Model

2025.04.22 2025.05.27

Literature Database

CTI-HAL: A Human-Annotated Dataset for Cyber Threat Intelligence Analysis

Authors: Sofia Della Penna, Roberto Natella, Vittorio Orbinato, Lorenzo Parracino, Luciano Pianese | Published: 2025-04-08

LLM Application

Model Performance Evaluation

Large Language Model

2025.04.08 2025.05.27

Literature Database

Sugar-Coated Poison: Benign Generation Unlocks LLM Jailbreaking

Authors: Yu-Hang Wu, Yu-Jie Xiong, Jie-Zhang | Published: 2025-04-08

LLM Application

Prompt Injection

Large Language Model

2025.04.08 2025.05.27

Literature Database

PiCo: Jailbreaking Multimodal Large Language Models via $\textbf{Pi}$ctorial $\textbf{Co}$de Contextualization

Authors: Aofan Liu, Lulu Tang, Ting Pan, Yuguo Yin, Bin Wang, Ao Yang | Published: 2025-04-02 | Updated: 2025-04-07

Model Performance Evaluation

Large Language Model

Watermark

2025.04.02 2025.05.27

Literature Database

Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing

Authors: Johan Wahréus, Ahmed Hussain, Panos Papadimitratos | Published: 2025-03-27

System Development

Prompt Injection

Large Language Model

2025.03.27 2025.05.27

Literature Database

BadToken: Token-level Backdoor Attacks to Multi-modal Large Language Models

Authors: Zenghui Yuan, Jiawen Shi, Pan Zhou, Neil Zhenqiang Gong, Lichao Sun | Published: 2025-03-20

Backdoor Attack

Prompt Injection

Large Language Model

2025.03.20 2025.05.27

Literature Database

Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings

Authors: Zonghao Ying, Guangyi Zheng, Yongxin Huang, Deyue Zhang, Wenxin Zhang, Quanchen Zou, Aishan Liu, Xianglong Liu, Dacheng Tao | Published: 2025-03-19

Prompt Injection

Large Language Model

Attack Method

2025.03.19 2025.05.27

Literature Database

MirrorShield: Towards Universal Defense Against Jailbreaks via Entropy-Guided Mirror Crafting

Authors: Rui Pu, Chaozhuo Li, Rui Ha, Litian Zhang, Lirong Qiu, Xi Zhang | Published: 2025-03-17 | Updated: 2025-05-20

Prompt Injection

Large Language Model

Attack Method

2025.03.17 2025.05.27

Literature Database

Probabilistic Modeling of Jailbreak on Multimodal LLMs: From Quantification to Application

Authors: Wenzhuo Xu, Zhipeng Wei, Xiongtao Sun, Zonghao Ying, Deyue Zhang, Dongdong Yang, Xiangzheng Zhang, Quanchen Zou | Published: 2025-03-10 | Updated: 2025-07-31

Prompt Injection

Large Language Model

Robustness of Watermarking Techniques

2025.03.10 2025.08.02

Literature Database