Backdoor Attacks

Injection, Attack and Erasure: Revocable Backdoor Attacks via Machine Unlearning

Authors: Baogang Song, Dongdong Zhao, Jianwen Xiang, Qiben Xu, Zizhuo Yu | Published: 2025-10-15
Backdoor Attack
Model Protection Methods
Defense Mechanisms

Cryptographic Backdoor for Neural Networks: Boon and Bane

Authors: Anh Tu Ngo, Anupam Chattopadhyay, Subhamoy Maitra | Published: 2025-09-25
Trigger Detection
Backdoor Attack
Watermark Design

Non-omniscient backdoor injection with a single poison sample: Proving the one-poison hypothesis for linear regression and linear classification

Authors: Thorsten Peinemann, Paula Arnold, Sebastian Berndt, Thomas Eisenbarth, Esfandiar Mohammadi | Published: 2025-08-07
Backdoor Attack
Backdoor Attack Techniques
Poisoning

Evasion Attacks Against Bayesian Predictive Models

Authors: Pablo G. Arce, Roi Naveiro, David Ríos Insua | Published: 2025-06-11
Backdoor Attack
Bayesian Adversarial Learning
Adversarial Perturbation Techniques

Backdoor Cleaning without External Guidance in MLLM Fine-tuning

Authors: Xuankun Rong, Wenke Huang, Jian Liang, Jinhe Bi, Xun Xiao, Yiming Li, Bo Du, Mang Ye | Published: 2025-05-22
LLM Security
Backdoor Attack

Finetuning-Activated Backdoors in LLMs

Authors: Thibaud Gloaguen, Mark Vero, Robin Staab, Martin Vechev | Published: 2025-05-22
LLM Security
Backdoor Attack
Prompt Injection

Analysis of the vulnerability of machine learning regression models to adversarial attacks using data from 5G wireless networks

Authors: Leonid Legashev, Artur Zhigalov, Denis Parfenov | Published: 2025-05-01
Backdoor Attack
Poisoning
Attack Types

How to Backdoor the Knowledge Distillation

Authors: Chen Wu, Qian Ma, Prasenjit Mitra, Sencun Zhu | Published: 2025-04-30
Backdoor Attack
Adversarial Training
Knowledge Distillation Vulnerabilities

Detecting Instruction Fine-tuning Attacks on Language Models using Influence Function

Authors: Jiawei Li | Published: 2025-04-12 | Updated: 2025-09-30
Backdoor Attack
Prompt Validation
Sentiment Analysis

BadToken: Token-level Backdoor Attacks to Multi-modal Large Language Models

Authors: Zenghui Yuan, Jiawen Shi, Pan Zhou, Neil Zhenqiang Gong, Lichao Sun | Published: 2025-03-20
Backdoor Attack
Prompt Injection
Large Language Models