バックドア攻撃

ASPIRER: Bypassing System Prompts With Permutation-based Backdoors in LLMs

Authors: Lu Yan, Siyuan Cheng, Xuan Chen, Kaiyuan Zhang, Guangyu Shen, Zhuo Zhang, Xiangyu Zhang | Published: 2024-10-05
Negative Training
バックドア攻撃
プロンプトインジェクション

Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents

Authors: Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang | Published: 2024-10-03
バックドア攻撃
プロンプトインジェクション

Empirical Perturbation Analysis of Linear System Solvers from a Data Poisoning Perspective

Authors: Yixin Liu, Arielle Carr, Lichao Sun | Published: 2024-10-01
バックドア攻撃
ポイズニング
線形ソルバー

Timber! Poisoning Decision Trees

Authors: Stefano Calzavara, Lorenzo Cazzaro, Massimo Vettori | Published: 2024-10-01
バックドア攻撃
ポイズニング

Weak-to-Strong Backdoor Attack for Large Language Models

Authors: Shuai Zhao, Leilei Gan, Zhongliang Guo, Xiaobao Wu, Luwei Xiao, Xiaoyu Xu, Cong-Duy Nguyen, Luu Anh Tuan | Published: 2024-09-26 | Updated: 2024-10-13
バックドア攻撃
プロンプトインジェクション

SDBA: A Stealthy and Long-Lasting Durable Backdoor Attack in Federated Learning

Authors: Minyeong Choe, Cheolhee Park, Changho Seo, Hyunil Kim | Published: 2024-09-23 | Updated: 2025-07-30
バックドア攻撃
ポイズニング
透かしの耐久性

Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm

Authors: Jaehan Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin | Published: 2024-09-21 | Updated: 2024-10-06
バックドア攻撃
モデル性能評価
防御手法

SoK: Security and Privacy Risks of Medical AI

Authors: Yuanhaur Chang, Han Liu, Evin Jaff, Chenyang Lu, Ning Zhang | Published: 2024-09-11
バックドア攻撃
プライバシー保護
医療AIの脅威

2DSig-Detect: a semi-supervised framework for anomaly detection on image data using 2D-signatures

Authors: Xinheng Xie, Kureha Yamaguchi, Margaux Leblanc, Simon Malzard, Varun Chhabra, Victoria Nockles, Yue Wu | Published: 2024-09-08 | Updated: 2025-03-20
バックドア攻撃
ポイズニング
評価手法

Understanding Data Importance in Machine Learning Attacks: Does Valuable Data Pose Greater Harm?

Authors: Rui Wen, Michael Backes, Yang Zhang | Published: 2024-09-05
バックドア攻撃
プライバシー保護手法
メンバーシップ推論