This page shows the attacks and factors that lead to the negative impact on the information-system aspect mapped in the AI Security Map, "AI causes misclassification, degrading the quality of functions and services," along with the defense methods and countermeasures against them and the target AI technologies, tasks, and data. It also shows the related elements of the external effect aspect.
Attacks / Factors
- Adversarial examples (see the sketch below)
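A minimal sketch of how such an adversarial example is crafted with the Fast Gradient Sign Method from "Explaining and Harnessing Adversarial Examples" (cited under References), assuming a PyTorch classifier over inputs normalized to [0, 1]; the model, inputs, and epsilon value are placeholders.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Perturb x by epsilon in the gradient-sign direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # A small, nearly imperceptible step that can flip the model's prediction.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```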
Defense Methods / Countermeasures
Defense methods in the development phases of an AI system
1. Data collection and preprocessing
2. Model selection, training, and validation
- Adversarial training (see the first sketch after this list)
- Certified model robustness (see the randomized-smoothing sketch after this list)
- Model safety evaluation
3. System implementation
4. System provision, operation, and maintenance
- Adversarial example detection
5. System use
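As a concrete illustration of the adversarial training listed under phase 2, below is a minimal sketch in the spirit of "Towards Deep Learning Models Resistant to Adversarial Attacks" (cited under References): craft a PGD perturbation for each batch, then update the model on the perturbed inputs. It assumes a PyTorch image classifier with inputs in [0, 1]; all names and hyperparameters are illustrative, not a reference implementation.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=0.03, alpha=0.01, steps=10):
    """Projected gradient descent inside an L-infinity ball of radius epsilon."""
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        # Ascend the loss, then project back into the epsilon-ball and valid range.
        x_adv = x_adv + alpha * x_adv.grad.sign()
        x_adv = (x + (x_adv - x).clamp(-epsilon, epsilon)).clamp(0.0, 1.0)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One min-max step: maximize loss over perturbations, minimize over weights."""
    model.eval()                       # freeze batch-norm statistics while attacking
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```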
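The certified-robustness entry can likewise be sketched via randomized smoothing ("Certified Adversarial Robustness via Randomized Smoothing", cited under References): classify many Gaussian-noised copies of an input and take the majority vote. This sketch covers only the prediction step; the full method additionally derives a certified L2 radius from the vote counts via a binomial confidence bound. The sigma, sample count, and class count are illustrative.

```python
import torch

@torch.no_grad()
def smoothed_predict(model, x, sigma=0.25, n_samples=100, num_classes=10):
    """Majority vote of the base classifier over Gaussian-perturbed copies
    of a single input x (shape: 1 x C x H x W)."""
    counts = torch.zeros(num_classes, dtype=torch.long)
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)
        pred = model(noisy).argmax(dim=1)   # predicted class for this noisy copy
        counts[pred] += 1
    return int(counts.argmax())             # the smoothed classifier's prediction
```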
Target AI Technologies
- DNN
- CNN
- LLM
- Contrastive learning
- FSL
- GNN
- Federated learning
- LSTM
- RNN
Tasks
- Classification
Target Data
- Images
- Graphs
- Text
- Audio
Related External Effect Aspects
References
Adversarial examples
- Intriguing properties of neural networks, 2014
- Explaining and Harnessing Adversarial Examples, 2015
- The limitations of deep learning in adversarial settings, 2015
- Adversarial Examples in the Physical World, 2017
- Towards Evaluating the Robustness of Neural Networks, 2017
- Towards Deep Learning Models Resistant to Adversarial Attacks, 2018
- A Closer Look at Deep Learning Heuristics: Learning Rate Restarts, Warmup and Decay, 2020
Adversarial training
- Intriguing properties of neural networks, 2014
- Explaining and Harnessing Adversarial Examples, 2015
- Learning with a Strong Adversary, 2015
- Adversarial Examples: Attacks and Defenses for Deep Learning, 2017
- Towards Deep Learning Models Resistant to Adversarial Attacks, 2018
- Adversarial Training for Free!, 2019
- Adversarial Robustness Against the Union of Multiple Perturbation Models, 2019
- Bag of Tricks for Adversarial Training, 2020
- Smooth Adversarial Training, 2020
Adversarial example detection
- Adversarial Examples Detection in Deep Networks with Convolutional Filter Statistics, 2017
- On the (Statistical) Detection of Adversarial Examples, 2017
- On Detecting Adversarial Perturbations, 2017
- MagNet: a Two-Pronged Defense against Adversarial Examples, 2017
- Detecting Adversarial Image Examples in Deep Networks with Adaptive Noise Reduction, 2021
- Detecting Adversarial Examples from Sensitivity Inconsistency of Spatial-Transform Domain, 2021
- Adversarial Example Detection for DNN Models: A Review and Experimental Comparison, 2022
- Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them, 2022
Certified model robustness
- Certified Defenses for Data Poisoning Attacks, 2017
- Certified Robustness to Adversarial Examples with Differential Privacy, 2019
- On Evaluating Adversarial Robustness, 2019
- Certified Adversarial Robustness via Randomized Smoothing, 2019
- Certified Robustness of Graph Neural Networks against Adversarial Structural Perturbation, 2021
- Certified Robustness for Large Language Models with Self-Denoising, 2023
- RAB: Provable Robustness Against Backdoor Attacks, 2023
- (Certified!!) Adversarial Robustness for Free!, 2023
- Certifying LLM Safety against Adversarial Prompting, 2024
