This page describes the attacks and causes that negatively affect the element “AI output manipulated under specific conditions, leading to degradation of functionality or service quality” in the information systems aspect of the AI Security Map, the defensive methods and countermeasures against them, and the relevant AI technologies, tasks, and data. It also indicates the related elements in the external influence aspect.
Attack or cause
- Backdoor attack (see the poisoning sketch below)
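A backdoor attack plants a hidden trigger at training time so that the model behaves normally on clean inputs but misbehaves whenever the trigger is present. As a rough illustration of the data-poisoning variant (in the spirit of the BadNets reference below, not a reproduction of it), the sketch stamps a small pixel patch onto a fraction of the training images and flips their labels to the attacker's target class; the function name, corner-patch trigger, and (N, H, W) image layout are illustrative assumptions.

```python
import numpy as np

def poison_images(images, labels, target_label, rate=0.1, patch=3, seed=0):
    """Stamp a bright square trigger onto a random fraction of the
    training images and relabel them to the attacker's target class
    (BadNets-style data poisoning, sketched for illustration only)."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    # Bottom-right corner trigger; assumes images are (N, H, W) in [0, 1].
    images[idx, -patch:, -patch:] = 1.0
    labels[idx] = target_label
    return images, labels, idx
```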
Defensive method or countermeasure
- Detection of triggers
- Detection of poisoned data for backdoor attack (see the clustering sketch after this list)
- Detection of backdoor models
- Certified robustness (see the randomized smoothing sketch after this list)
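For detecting poisoned training data, one widely used idea is to cluster the penultimate-layer activations of each class into two groups and flag the small minority cluster, since triggered samples tend to activate the network differently from clean ones. This activation-clustering technique is named here as an illustrative stand-in; it is not among the references listed below. A minimal sketch, assuming the activations for one class have already been extracted into a NumPy array:

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_poisoned_by_activation_clustering(activations):
    """Cluster one class's penultimate-layer activations into two groups
    and return a boolean mask marking the minority cluster as potentially
    poisoned. `activations` has shape (n_samples, n_features)."""
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(activations)
    sizes = np.bincount(km.labels_, minlength=2)
    return km.labels_ == int(np.argmin(sizes))
```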
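Certified robustness gives a mathematical guarantee rather than an empirical one: randomized smoothing (Cohen et al., 2019, listed under References) certifies that the majority vote over Gaussian-noised copies of an input cannot change within an L2 ball of radius sigma * Phi^{-1}(p_A). A minimal Monte Carlo sketch, where `base_classifier` is a hypothetical function mapping one input to a class id and, unlike a real certification, the radius is computed from the raw estimate of p_A rather than a confidence lower bound:

```python
import numpy as np
from scipy.stats import norm

def certify_smoothed(base_classifier, x, sigma=0.25, n=1000, seed=0):
    """Classify n Gaussian-noised copies of x, take the majority vote,
    and report a certified L2 radius when the top class wins more
    than half the votes (randomized-smoothing sketch)."""
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(n):
        label = base_classifier(x + sigma * rng.standard_normal(x.shape))
        votes[label] = votes.get(label, 0) + 1
    top, count = max(votes.items(), key=lambda kv: kv[1])
    p_a = count / n
    if p_a <= 0.5:
        return None, 0.0                # abstain: no certificate
    return top, sigma * norm.ppf(p_a)   # certified L2 radius
```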
Targeted AI technology
- DNN
- CNN
- LLM
- Contrastive learning
- FSL
- GNN
- Federated learning
- LSTM
- RNN
Task
- Classification
- Generation
Data
- Image
- Graph
- Text
- Audio
Related external influence aspect
- Reputation
- Physical impact
- Psychological impact
- Financial impact
- Economy
- Critical infrastructure
- Medical care
References
Backdoor attack
- Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning, 2017
- BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain, 2017
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses, 2020
- Hidden Trigger Backdoor Attacks, 2020
- Backdoor Attacks to Graph Neural Networks, 2021
- Graph Backdoor, 2021
- Can You Hear It? Backdoor Attack via Ultrasonic Triggers, 2021
- Backdoor Attacks Against Dataset Distillation, 2023
- Universal Jailbreak Backdoors from Poisoned Human Feedback, 2023
Detection of poisoned data for backdoor attack
Detection of backdoor models
- Neural Trojans, 2017
- Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks, 2018
- Detecting AI Trojans Using Meta Neural Analysis, 2021
- T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification, 2021
- Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning, 2024
- LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors, 2024
Certified robustness
- Certified Defenses for Data Poisoning Attacks, 2017
- Certified Robustness to Adversarial Examples with Differential Privacy, 2019
- On Evaluating Adversarial Robustness, 2019
- Certified Adversarial Robustness via Randomized Smoothing, 2019
- Certified Robustness of Graph Neural Networks against Adversarial Structural Perturbation, 2021
- Certified Robustness for Large Language Models with Self-Denoising, 2023
- RAB: Provable Robustness Against Backdoor Attacks, 2023
- (Certified!!) Adversarial Robustness for Free!, 2023
- Certifying LLM Safety against Adversarial Prompting, 2024