Defense Method

DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing

Authors: Yi Wang, Fenghua Weng, Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang | Published: 2025-02-17 | Updated: 2025-05-29
LLM Security
Prompt Injection
Defense Method

BADTV: Unveiling Backdoor Threats in Third-Party Task Vectors

Authors: Chia-Yi Hsu, Yu-Lin Tsai, Yu Zhe, Yan-Lun Chen, Chih-Hsun Lin, Chia-Mu Yu, Yang Zhang, Chun-Ying Huang, Jun Sakuma | Published: 2025-01-04
Backdoor Attack
Defense Method

Safeguarding System Prompts for LLMs

Authors: Zhifeng Jiang, Zhihua Jin, Guoliang He | Published: 2024-12-18 | Updated: 2025-01-09
LLM Performance Evaluation
Prompt Injection
Defense Method

Optimal Defenses Against Gradient Reconstruction Attacks

Authors: Yuxiao Chen, Gamze Gürsoy, Qi Lei | Published: 2024-11-06
Poisoning
Defense Method

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, Axel-Cyrille Ngonga Ngomo | Published: 2024-10-28
Membership Inference
Defense Method

Time Traveling to Defend Against Adversarial Example Attacks in Image Classification

Authors: Anthony Etim, Jakub Szefer | Published: 2024-10-10
Attack Method
Adversarial Example
Defense Method

Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems

Authors: Donghyun Lee, Mo Tiwari | Published: 2024-10-09
Prompt Injection
Attack Method
Defense Method

SecAlign: Defending Against Prompt Injection with Preference Optimization

Authors: Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, David Wagner, Chuan Guo | Published: 2024-10-07 | Updated: 2025-01-13
LLM Security
Prompt Injection
Defense Method

SoK: Towards Security and Safety of Edge AI

Authors: Tatjana Wingarz, Anne Lauscher, Janick Edinger, Dominik Kaaser, Stefan Schulte, Mathias Fischer | Published: 2024-10-07
Bias
Privacy Protection
Defense Method

Robustness Reprogramming for Representation Learning

Authors: Zhichao Hou, MohamadAli Torkamani, Hamid Krim, Xiaorui Liu | Published: 2024-10-06
Attack Evaluation
Defense Method