TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions | Authors: Wang YuHang, Junkang Guo, Aolei Liu, Kaihao Wang, Zaitong Wu, Zhenyu Liu, Wenfei Yin, Jian Liu | Published: 2025-03-02 | Updated: 2025-03-21 | Tags: Robustness, Adversarial learning, Adversarial training | 2025.03.02 2025.04.03 | Literature database
“Short-length” Adversarial Training Helps LLMs Defend “Long-length” Jailbreak Attacks: Theoretical and Empirical Evidence | Authors: Shaopeng Fu, Liang Ding, Di Wang | Published: 2025-02-06 | Tags: Prompt injection, Large language models, Adversarial training | 2025.02.06 2025.04.03 | Literature database
Adversarial Robustness in Two-Stage Learning-to-Defer: Algorithms and Guarantees | Authors: Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi | Published: 2025-02-03 | Tags: Learning-to-Defer, Adversarial examples, Adversarial training | 2025.02.03 2025.04.03 | Literature database
Smoothed Embeddings for Robust Language Models | Authors: Ryo Hase, Md Rafi Ur Rashid, Ashley Lewis, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, Ye Wang | Published: 2025-01-27 | Tags: Prompt injection, Membership inference, Adversarial training | 2025.01.27 2025.04.03 | Literature database
Latent-space adversarial training with post-aware calibration for defending large language models against jailbreak attacks | Authors: Xin Yi, Yue Li, Linlin Wang, Xiaoling Wang, Liang He | Published: 2025-01-18 | Tags: Prompt injection, Adversarial training, Over-refusal mitigation | 2025.01.18 2025.04.03 | Literature database
Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness | Authors: Olukorede Fakorede, Modeste Atsague, Jin Tian | Published: 2024-12-27 | Tags: Adversarial examples, Adversarial training | 2024.12.27 2025.04.03 | Literature database
GLL: A Differentiable Graph Learning Layer for Neural Networks | Authors: Jason Brown, Bohan Chen, Harris Hardiman-Mostow, Jeff Calder, Andrea L. Bertozzi | Published: 2024-12-11 | Tags: Poisoning, Adversarial training | 2024.12.11 2025.04.03 | Literature database
On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds | Authors: Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe | Published: 2024-10-21 | Tags: Convergence analysis, Adversarial training | 2024.10.21 2025.04.03 | Literature database
Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings | Authors: Hossein Mirzaei, Mackenzie W. Mathis | Published: 2024-10-14 | Updated: 2025-01-26 | Tags: Membership inference, Adversarial training | 2024.10.14 2025.04.03 | Literature database
Towards Calibrated Losses for Adversarial Robust Reject Option Classification | Authors: Vrund Shah, Tejas Chaudhari, Naresh Manwani | Published: 2024-10-14 | Tags: Adversarial training | 2024.10.14 2025.04.03 | Literature database