Adversarial Learning

Mitigating Fine-tuning Risks in LLMs via Safety-Aware Probing Optimization

Authors: Chengcan Wu, Zhixin Zhang, Zeming Wei, Yihao Zhang, Meng Sun | Published: 2025-05-22
LLM Security
Alignment
Adversarial Learning

SuperPure: Efficient Purification of Localized and Distributed Adversarial Patches via Super-Resolution GAN Models

Authors: Hossein Khalili, Seongbin Park, Venkat Bollapragada, Nader Sehatbakhsh | Published: 2025-05-22
Adversarial Learning
Computational Complexity
Defense Mechanism

Adversarially Pretrained Transformers may be Universally Robust In-Context Learners

Authors: Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki | Published: 2025-05-20
Certified Robustness
Relationship between Robustness and Privacy
Adversarial Learning

FlowPure: Continuous Normalizing Flows for Adversarial Purification

Authors: Elias Collaert, Abel Rodríguez, Sander Joos, Lieven Desmet, Vera Rimmer | Published: 2025-05-19
Robustness Improvement Method
Adversarial Learning
Effectiveness Analysis of Defense Methods

Evaluating the Robustness of Adversarial Defenses in Malware Detection Systems

Authors: Mostafa Jafari, Alireza Shameli-Sendi | Published: 2025-05-14
Robustness Analysis
Attack Detection Method
Adversarial Learning

BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models

Authors: Zihan Wang, Hongwei Li, Rui Zhang, Wenbo Jiang, Kangjie Chen, Tianwei Zhang, Qingchuan Zhao, Guowen Xu | Published: 2025-05-06
Poisoning Attack on RAG
Backdoor Attack Mitigation
Adversarial Learning

Bayesian Robust Aggregation for Federated Learning

Authors: Aleksandr Karakulev, Usama Zafar, Salman Toor, Prashant Singh | Published: 2025-05-05
Group-Based Robustness
Trigger Detection
Adversarial Learning

How to Backdoor the Knowledge Distillation

Authors: Chen Wu, Qian Ma, Prasenjit Mitra, Sencun Zhu | Published: 2025-04-30
Backdoor Attack
Adversarial Learning
Vulnerabilities of Knowledge Distillation

GIFDL: Generated Image Fluctuation Distortion Learning for Enhancing Steganographic Security

Authors: Xiangkun Wang, Kejiang Chen, Yuang Qi, Ruiheng Liu, Weiming Zhang, Nenghai Yu | Published: 2025-04-21
Adversarial Learning
Generative Model
Watermarking Technology

Stop Walking in Circles! Bailing Out Early in Projected Gradient Descent

Authors: Philip Doldo, Derek Everett, Amol Khanna, Andre T Nguyen, Edward Raff | Published: 2025-03-25
Vulnerability of Adversarial Examples
Adversarial Learning
Robustness of Deep Networks