BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models Authors: Zihan Wang, Hongwei Li, Rui Zhang, Wenbo Jiang, Kangjie Chen, Tianwei Zhang, Qingchuan Zhao, Guowen Xu | Published: 2025-05-06 | Tags: Poisoning Attacks on RAG, Backdoor Attack Countermeasures, Adversarial Learning
Bayesian Robust Aggregation for Federated Learning Authors: Aleksandr Karakulev, Usama Zafar, Salman Toor, Prashant Singh | Published: 2025-05-05 | Tags: Group-Based Robustness, Trigger Detection, Adversarial Learning
How to Backdoor the Knowledge Distillation Authors: Chen Wu, Qian Ma, Prasenjit Mitra, Sencun Zhu | Published: 2025-04-30 | Tags: Backdoor Attack, Adversarial Learning, Vulnerability of Knowledge Distillation
GIFDL: Generated Image Fluctuation Distortion Learning for Enhancing Steganographic Security Authors: Xiangkun Wang, Kejiang Chen, Yuang Qi, Ruiheng Liu, Weiming Zhang, Nenghai Yu | Published: 2025-04-21 | Tags: Adversarial Learning, Generative Models, Watermarking Technology
Stop Walking in Circles! Bailing Out Early in Projected Gradient Descent Authors: Philip Doldo, Derek Everett, Amol Khanna, Andre T Nguyen, Edward Raff | Published: 2025-03-25 | Tags: Vulnerability of Adversarial Examples, Adversarial Learning, Robustness of Deep Networks
TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions Authors: Wang YuHang, Junkang Guo, Aolei Liu, Kaihao Wang, Zaitong Wu, Zhenyu Liu, Wenfei Yin, Jian Liu | Published: 2025-03-02 | Updated: 2025-03-21 | Tags: Robustness, Adversarial Learning, Adversarial Training
SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage Authors: Xiaoning Dong, Wenbo Hu, Wei Xu, Tianxing He | Published: 2024-12-19 | Updated: 2025-03-21 | Tags: Prompt Injection, Large Language Models, Adversarial Learning
Protecting Confidentiality, Privacy and Integrity in Collaborative Learning Authors: Dong Chen, Alice Dethise, Istemi Ekin Akkus, Ivica Rimac, Klaus Satzke, Antti Koskela, Marco Canini, Wei Wang, Ruichuan Chen | Published: 2024-12-11 | Updated: 2025-04-17 | Tags: Privacy-Preserving Framework, Differential Privacy, Adversarial Learning
Robust LLM safeguarding via refusal feature adversarial training Authors: Lei Yu, Virginie Do, Karen Hambardzumyan, Nicola Cancedda | Published: 2024-09-30 | Updated: 2025-03-20 | Tags: Prompt Injection, Model Robustness, Adversarial Learning
LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples Authors: Jia-Yu Yao, Kun-Peng Ning, Zhen-Hui Liu, Mu-Nan Ning, Yu-Yang Liu, Li Yuan | Published: 2023-10-02 | Updated: 2024-08-04 | Tags: Hallucination, Vulnerability of Adversarial Examples, Adversarial Learning