バックドア攻撃対策

BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models

Authors: Zihan Wang, Hongwei Li, Rui Zhang, Wenbo Jiang, Kangjie Chen, Tianwei Zhang, Qingchuan Zhao, Guowen Xu | Published: 2025-05-06
RAGへのポイズニング攻撃
バックドア攻撃対策
敵対的学習

Towards Probabilistic Verification of Machine Unlearning

Authors: David Marco Sommer, Liwei Song, Sameer Wagh, Prateek Mittal | Published: 2020-03-09 | Updated: 2020-12-01
トレーニング手法
バックドア攻撃
バックドア攻撃対策

Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering

Authors: Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, Biplav Srivastava | Published: 2018-11-09
バックドア攻撃対策
バックドア攻撃用の毒データの検知
ポイズニング攻撃

Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

Authors: Cong Liao, Haoti Zhong, Anna Squicciarini, Sencun Zhu, David Miller | Published: 2018-08-30
バックドア攻撃
バックドア攻撃対策
ロバスト性分析