TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions | Authors: Wang YuHang, Junkang Guo, Aolei Liu, Kaihao Wang, Zaitong Wu, Zhenyu Liu, Wenfei Yin, Jian Liu | Published: 2025-03-02 | Updated: 2025-03-21 | Tags: Robustness, Adversarial Learning, Adversarial Training | 2025.03.02 2025.05.27 Literature Database
“Short-length” Adversarial Training Helps LLMs Defend “Long-length” Jailbreak Attacks: Theoretical and Empirical Evidence | Authors: Shaopeng Fu, Liang Ding, Di Wang | Published: 2025-02-06 | Tags: Prompt Injection, Large Language Model, Adversarial Training
Adversarial Robustness in Two-Stage Learning-to-Defer: Algorithms and Guarantees | Authors: Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi | Published: 2025-02-03 | Tags: Learning-to-Defer, Adversarial Example, Adversarial Training
Smoothed Embeddings for Robust Language Models | Authors: Ryo Hase, Md Rafi Ur Rashid, Ashley Lewis, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, Ye Wang | Published: 2025-01-27 | Tags: Prompt Injection, Membership Inference, Adversarial Training
Latent-space adversarial training with post-aware calibration for defending large language models against jailbreak attacks | Authors: Xin Yi, Yue Li, Linlin Wang, Xiaoling Wang, Liang He | Published: 2025-01-18 | Tags: Prompt Injection, Adversarial Training, Excessive Denial Mitigation
Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness | Authors: Olukorede Fakorede, Modeste Atsague, Jin Tian | Published: 2024-12-27 | Tags: Adversarial Example, Adversarial Training
GLL: A Differentiable Graph Learning Layer for Neural Networks | Authors: Jason Brown, Bohan Chen, Harris Hardiman-Mostow, Jeff Calder, Andrea L. Bertozzi | Published: 2024-12-11 | Tags: Poisoning, Adversarial Training
On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds | Authors: Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe | Published: 2024-10-21 | Tags: Convergence Analysis, Adversarial Training
Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings | Authors: Hossein Mirzaei, Mackenzie W. Mathis | Published: 2024-10-14 | Updated: 2025-01-26 | Tags: Membership Inference, Adversarial Training
Towards Calibrated Losses for Adversarial Robust Reject Option Classification | Authors: Vrund Shah, Tejas Chaudhari, Naresh Manwani | Published: 2024-10-14 | Tags: Adversarial Training