This page shows the attacks and factors, as mapped on the AI Security Map, that cause the negative impact "manipulation of AI outputs under specific conditions" in the information system aspect, the defense methods and countermeasures against them, and the target AI technologies, tasks, and data. Related elements of the external influence aspect are also listed.
Attacks / Factors
Defense Methods / Countermeasures
Target AI Technologies
- DNN
- CNN
- LLM
- Contrastive learning
- FSL
- GNN
- Federated learning
- LSTM
- RNN
Tasks
- Classification
- Generation
Target Data
- Images
- Graphs
- Text
- Audio
Related External Influence Aspects
References
Backdoor Attacks
- Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning, 2017
- BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain, 2017
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses, 2020
- Hidden Trigger Backdoor Attacks, 2020
- Backdoor Attacks to Graph Neural Networks, 2021
- Graph Backdoor, 2021
- Can You Hear It? Backdoor Attack via Ultrasonic Triggers, 2021
- Backdoor Attacks Against Dataset Distillation, 2023
- Universal Jailbreak Backdoors from Poisoned Human Feedback, 2023
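
Most of the poisoning-based attacks above (BadNets being the canonical example) share one recipe: stamp a fixed trigger pattern onto a small fraction of the training set and relabel those samples to an attacker-chosen class. A minimal sketch, assuming image data as a float tensor of shape (N, C, H, W) in [0, 1]; the function name and parameters are illustrative, not taken from any cited paper's code:

```python
import torch

def poison_dataset(images: torch.Tensor, labels: torch.Tensor,
                   target_class: int, poison_rate: float = 0.1,
                   trigger_size: int = 3):
    """Stamp a white-square trigger on a random subset and relabel it."""
    images, labels = images.clone(), labels.clone()
    n_poison = int(poison_rate * len(images))
    idx = torch.randperm(len(images))[:n_poison]
    # The trigger: a small bright patch in the bottom-right corner.
    images[idx, :, -trigger_size:, -trigger_size:] = 1.0
    # Triggered samples are relabeled to the attacker's target class, so the
    # trained model learns to associate the patch with that class.
    labels[idx] = target_class
    return images, labels
```

A model trained on the returned data typically behaves normally on clean inputs but predicts target_class whenever the patch is present, which is exactly the "manipulation under specific conditions" this map covers.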
Trigger Detection
Detection of Poisoned Data for Backdoor Attacks
Backdoor Model Detection
- Neural Trojans, 2017
- Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks, 2018
- Detecting AI Trojans Using Meta Neural Analysis, 2021
- T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification, 2021
- Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning, 2024
- LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors, 2024
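
Of the defenses above, Fine-Pruning (2018) admits a compact sketch: channels that stay dormant on clean inputs are suspected of encoding the trigger response, so they are pruned before a short clean fine-tuning pass. A rough sketch under those assumptions; the helper name, layer choice, and pruning fraction are illustrative:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def prune_dormant_channels(model: nn.Module, conv: nn.Conv2d,
                           clean_loader, prune_frac: float = 0.2):
    """Zero out the conv channels that are least active on clean inputs."""
    per_batch = []
    # Record mean absolute activation per output channel via a forward hook.
    hook = conv.register_forward_hook(
        lambda mod, inp, out: per_batch.append(out.abs().mean(dim=(0, 2, 3))))
    model.eval()
    for x, _ in clean_loader:
        model(x)
    hook.remove()
    mean_act = torch.stack(per_batch).mean(dim=0)
    n_prune = int(prune_frac * conv.out_channels)
    dormant = mean_act.argsort()[:n_prune]   # least-active channels
    conv.weight[dormant] = 0.0               # pruning = zeroing their weights
    if conv.bias is not None:
        conv.bias[dormant] = 0.0
```

Fine-Pruning then fine-tunes the pruned model on clean data to recover any lost accuracy; that step is omitted here.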
Model Robustness Guarantees
- Explaining and Harnessing Adversarial Examples, 2015
- Towards Deep Neural Network Architectures Robust to Adversarial Examples, 2015
- Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks, 2016
- Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks, 2017
- Towards Deep Learning Models Resistant to Adversarial Attacks, 2018
- Ensemble Adversarial Training: Attacks and Defenses, 2018
- Provable Defenses Against Adversarial Examples via the Convex Outer Adversarial Polytope, 2018
- On Evaluating Adversarial Robustness, 2019
- Evaluating Robustness of Neural Networks with Mixed Integer Programming, 2019
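
The robustness work above centers on adversarial training (Towards Deep Learning Models Resistant to Adversarial Attacks, 2018): minimize the loss on worst-case perturbations found by projected gradient descent (PGD) within an L∞ ball. A minimal sketch assuming image inputs in [0, 1]; eps, alpha, and steps are typical CIFAR-10 settings, and the function names are illustrative:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Projected gradient descent: maximize the loss within an L-inf ball."""
    x_adv = torch.clamp(x + torch.empty_like(x).uniform_(-eps, eps), 0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()           # ascend the loss
        x_adv = torch.clamp(x_adv, x - eps, x + eps)  # project into the ball
        x_adv = torch.clamp(x_adv, 0, 1)              # keep a valid image
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One Madry-style training step: fit the worst-case perturbed batch."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The certification papers in the same list (Reluplex, the convex outer adversarial polytope, mixed-integer programming) instead prove bounds on what any perturbation in the ball can do, rather than training against sampled ones.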