Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation Authors: Shuai Zhao, Xiaobao Wu, Cong-Duy Nguyen, Yanhao Jia, Meihuizi Jia, Yichao Feng, Luu Anh Tuan | Published: 2024-10-18 | Updated: 2025-05-20 バックドアモデルの検知バックドア攻撃手法知識蒸留 2024.10.18 文献データベース
Knowledge Distillation with Adversarial Samples Supporting Decision Boundary Authors: Byeongho Heo, Minsik Lee, Sangdoo Yun, Jin Young Choi | Published: 2018-05-15 | Updated: 2018-12-14 敵対的サンプル敵対的攻撃検出知識蒸留 2018.05.15 2025.04.03 文献データベース