SPEAR++: Scaling Gradient Inversion via Sparsely-Used Dictionary Learning | Authors: Alexander Bakarsky, Dimitar I. Dimitrov, Maximilian Baader, Martin Vechev | Published: 2025-10-28 | Impact of Sparsity, Privacy Protection, Effectiveness Analysis of Defense Methods | 2025.10.28 2025.10.30 | Literature Database
Untargeted Jailbreak Attack | Authors: Xinzhe Huang, Wenjing Hu, Tianhang Zheng, Kedong Xiu, Xiaojun Jia, Di Wang, Zhan Qin, Kui Ren | Published: 2025-10-03 | Updated: 2025-10-28 | Prompt Injection, Prompt Leaking, Effectiveness Analysis of Defense Methods | 2025.10.03 2025.10.30 | Literature Database
PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks | Authors: Guobin Shen, Dongcheng Zhao, Linghao Feng, Xiang He, Jihang Wang, Sicheng Shen, Haibo Tong, Yiting Dong, Jindong Li, Xiang Zheng, Yi Zeng | Published: 2025-05-20 | Updated: 2025-05-22 | Disabling Safety Mechanisms of LLM, Prompt Injection, Effectiveness Analysis of Defense Methods | 2025.05.20 2025.05.28 | Literature Database
FlowPure: Continuous Normalizing Flows for Adversarial Purification | Authors: Elias Collaert, Abel Rodríguez, Sander Joos, Lieven Desmet, Vera Rimmer | Published: 2025-05-19 | Robustness Improvement Method, Adversarial Learning, Effectiveness Analysis of Defense Methods | 2025.05.19 2025.05.28 | Literature Database
Secure Transfer Learning: Training Clean Models Against Backdoor in (Both) Pre-trained Encoders and Downstream Datasets | Authors: Yechao Zhang, Yuxuan Zhou, Tianyu Li, Minghui Li, Shengshan Hu, Wei Luo, Leo Yu Zhang | Published: 2025-04-16 | Backdoor Detection, Improvement of Learning, Effectiveness Analysis of Defense Methods | 2025.04.16 2025.05.27 | Literature Database
STShield: Single-Token Sentinel for Real-Time Jailbreak Detection in Large Language Models | Authors: Xunguang Wang, Wenxuan Wang, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Daoyuan Wu, Shuai Wang | Published: 2025-03-23 | Prompt Injection, Malicious Prompt, Effectiveness Analysis of Defense Methods | 2025.03.23 2025.05.27 | Literature Database
Bias Busters: Robustifying DL-based Lithographic Hotspot Detectors Against Backdooring Attacks | Authors: Kang Liu, Benjamin Tan, Gaurav Rajavendra Reddy, Siddharth Garg, Yiorgos Makris, Ramesh Karri | Published: 2020-04-26 | Poisoning, Deep Learning Technology, Effectiveness Analysis of Defense Methods | 2020.04.26 2025.05.28 | Literature Database
Minimax Defense against Gradient-based Adversarial Attacks | Authors: Blerta Lindqvist, Rauf Izmailov | Published: 2020-02-04 | Adversarial Perturbation Techniques, Adversarial Transferability, Effectiveness Analysis of Defense Methods | 2020.02.04 2025.05.28 | Literature Database
Defending Adversarial Attacks via Semantic Feature Manipulation | Authors: Shuo Wang, Tianle Chen, Surya Nepal, Carsten Rudolph, Marthie Grobler, Shangyu Chen | Published: 2020-02-03 | Updated: 2020-04-22 | Robustness Improvement Method, Adversarial Example, Effectiveness Analysis of Defense Methods | 2020.02.03 2025.05.28 | Literature Database
Ensemble Noise Simulation to Handle Uncertainty about Gradient-based Adversarial Attacks | Authors: Rehana Mahfuz, Rajeev Sahay, Aly El Gamal | Published: 2020-01-26 | Adversarial Learning, Adversarial Attack Detection, Effectiveness Analysis of Defense Methods | 2020.01.26 2025.05.28 | Literature Database