LLM Security

AdaPPA: Adaptive Position Pre-Fill Jailbreak Attack Approach Targeting LLMs

Authors: Lijia Lv, Weigang Zhang, Xuehai Tang, Jie Wen, Feng Liu, Jizhong Han, Songlin Hu | Published: 2024-09-11
LLM Security
Prompt Injection
Attack Method

Well, that escalated quickly: The Single-Turn Crescendo Attack (STCA)

Authors: Alan Aqrawi, Arian Abbasi | Published: 2024-09-04 | Updated: 2024-09-10
LLM Security
Content Moderation
Attack Method

Unveiling the Vulnerability of Private Fine-Tuning in Split-Based Frameworks for Large Language Models: A Bidirectionally Enhanced Attack

Authors: Guanzhong Chen, Zhenghan Qin, Mingxin Yang, Yajie Zhou, Tao Fan, Tianyu Du, Zenglin Xu | Published: 2024-09-02 | Updated: 2024-09-04
LLM Security
Prompt Injection
Attack Method

Enhancing Source Code Security with LLMs: Demystifying The Challenges and Generating Reliable Repairs

Authors: Nafis Tanveer Islam, Joseph Khoury, Andrew Seong, Elias Bou-Harb, Peyman Najafirad | Published: 2024-09-01
LLM Security
Vulnerability Management
Automated Vulnerability Remediation

LLM-PBE: Assessing Data Privacy in Large Language Models

Authors: Qinbin Li, Junyuan Hong, Chulin Xie, Jeffrey Tan, Rachel Xin, Junyi Hou, Xavier Yin, Zhun Wang, Dan Hendrycks, Zhangyang Wang, Bo Li, Bingsheng He, Dawn Song | Published: 2024-08-23 | Updated: 2024-09-06
LLM Security
Privacy Protection Method
Prompt Injection

EEG-Defender: Defending against Jailbreak through Early Exit Generation of Large Language Models

Authors: Chongwen Zhao, Zhihao Dou, Kaizhu Huang | Published: 2024-08-21
LLM Security
Prompt Injection
Defense Method

Security Attacks on LLM-based Code Completion Tools

Authors: Wen Cheng, Ke Sun, Xinyu Zhang, Wei Wang | Published: 2024-08-20 | Updated: 2025-01-02
LLM Security
Prompt Injection
Attack Method

Transferring Backdoors between Large Language Models by Knowledge Distillation

Authors: Pengzhou Cheng, Zongru Wu, Tianjie Ju, Wei Du, Zhuosheng Zhang, Gongshen Liu | Published: 2024-08-19
LLM Security
Backdoor Attack
Poisoning

Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning

Authors: Tiansheng Huang, Gautam Bhattacharya, Pratik Joshi, Josh Kimball, Ling Liu | Published: 2024-08-18 | Updated: 2024-09-03
LLM Security
Prompt Injection
Safety Alignment

BaThe: Defense against the Jailbreak Attack in Multimodal Large Language Models by Treating Harmful Instruction as Backdoor Trigger

Authors: Yulin Chen, Haoran Li, Yirui Zhang, Zihao Zheng, Yangqiu Song, Bryan Hooi | Published: 2024-08-17 | Updated: 2025-04-22
AI Compliance
LLM Security
Content Moderation