Latent-space adversarial training with post-aware calibration for defending large language models against jailbreak attacks
Authors: Xin Yi, Yue Li, Linlin Wang, Xiaoling Wang, Liang He | Published: 2025-01-18
Prompt Injection | Adversarial Training | Excessive Denial Mitigation
2025.01.18 2025.05.27