This page describes the security targets affected by the negative impact “Outputting misinformation by AI” in the external influence aspect of the AI Security Map, the attacks and factors that cause it, and the corresponding defensive methods and countermeasures.
Security target
- Consumer
Attack or cause
- Integrity violation
- Degradation of accuracy
- Degradation of controllability
- Explainability violation
- Reliability violation
- Poisoning attack against RAG
- Hallucination
Defensive method or countermeasure
- Defensive method for integrity
- Data curation
- RAG
- XAI (Explainable AI)
- Detection of hallucination
References
Poisoning attack against RAG
- Poisoning Retrieval Corpora by Injecting Adversarial Passages, 2023
- BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models, 2024
- PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models, 2024
- Human-Imperceptible Retrieval Poisoning Attacks in LLM-Powered Applications, 2024
- Poison-RAG: Adversarial Data Poisoning Attacks on Retrieval-Augmented Generation in Recommender Systems, 2025
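The papers above study variants of the same basic threat: an attacker who can write into the retrieval corpus crafts passages that are likely to be retrieved for targeted queries and that steer the generator toward attacker-chosen misinformation. Below is a minimal, self-contained sketch of that general idea; the keyword-overlap retriever and all data in it are illustrative assumptions, not the method of any specific paper listed here.

```python
# Toy illustration of corpus poisoning against a retriever.
# The scoring function and all passages are illustrative assumptions.

def score(query: str, passage: str) -> float:
    """Keyword-overlap score standing in for a real sparse/dense retriever."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / (len(q) or 1)

corpus = [
    "Paris is the capital of France.",
    "The Eiffel Tower is located in Paris.",
]

target_query = "what is the capital of France"

# Attacker-controlled passage: packed with the target query's terms so it
# ranks first, while carrying the misinformation the attacker wants echoed.
poison = "what is the capital of France the capital of France is Lyon"
corpus.append(poison)

top = max(corpus, key=lambda passage: score(target_query, passage))
print(top)  # the poisoned passage wins retrieval and reaches the generator
```

The same structure applies to real retrievers: the attacker only needs the injected passage to outrank honest ones for the targeted queries.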
Hallucination
- The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”, 2023
- Why Does ChatGPT Fall Short in Providing Truthful Answers?, 2023
- DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation, 2024
- LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples, 2024
- The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models, 2024
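One way to operationalize the “Detection of hallucination” countermeasure listed above is a self-consistency check: sample the model several times at nonzero temperature and flag answers on which the samples disagree. The sketch below is a minimal version of that heuristic; the `generate` stub, the threshold, and the helper name are all illustrative assumptions, not a published method.

```python
import random
from collections import Counter

def generate(prompt: str) -> str:
    """Stub standing in for a sampled LLM call (illustrative assumption)."""
    return random.choice(["Lyon", "Paris", "Paris", "Paris"])

def self_consistency_flag(prompt: str, n: int = 10, threshold: float = 0.7) -> bool:
    """Flag as a possible hallucination if no single answer dominates the samples."""
    counts = Counter(generate(prompt) for _ in range(n))
    _, top_count = counts.most_common(1)[0]
    return top_count / n < threshold

print(self_consistency_flag("What is the capital of France?"))
```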
Data curation
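No specific references are listed here for data curation, but as a countermeasure the idea is to filter and deduplicate data before it reaches the model or the retrieval corpus. A minimal sketch follows, assuming a simple source blocklist and exact-duplicate filter; the record format and blocklist contents are illustrative assumptions.

```python
# Minimal data-curation pass: exact deduplication plus a source blocklist.
# The blocklist contents and record format are illustrative assumptions.

BLOCKED_SOURCES = {"untrusted-paste-site.example"}

def curate(records: list[dict]) -> list[dict]:
    seen: set[str] = set()
    kept = []
    for rec in records:
        text = rec["text"].strip()
        if rec["source"] in BLOCKED_SOURCES:
            continue  # drop documents from sources known to host poison
        if text in seen:
            continue  # drop exact duplicates that would over-weight a claim
        seen.add(text)
        kept.append(rec)
    return kept

records = [
    {"text": "Paris is the capital of France.", "source": "encyclopedia.example"},
    {"text": "Paris is the capital of France.", "source": "encyclopedia.example"},
    {"text": "Lyon is the capital of France.", "source": "untrusted-paste-site.example"},
]
print(curate(records))  # only the first record survives
```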
RAG
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, 2020
- REALM: Retrieval-Augmented Language Model Pre-Training, 2020
- In-Context Retrieval-Augmented Language Models, 2023
- Active Retrieval Augmented Generation, 2023
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection, 2023
- Query Rewriting for Retrieval-Augmented Large Language Models, 2023
- Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering, 2023
- Generate rather than Retrieve: Large Language Models are Strong Context Generators, 2023
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy, 2023
- From Local to Global: A Graph RAG Approach to Query-Focused Summarization, 2024
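RAG also appears above as a countermeasure: grounding generation in retrieved passages reduces reliance on the model's (possibly wrong) parametric memory. The sketch below shows the basic retrieve-then-read loop these papers build on; the overlap-based retriever and the `generate` stub are illustrative assumptions, not any specific system cited here.

```python
# Minimal retrieve-then-read RAG loop. The retriever and generator are
# toy stand-ins (illustrative assumptions), not a published system.

CORPUS = [
    "Paris is the capital of France.",
    "Mount Fuji is the highest mountain in Japan.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    def overlap(passage: str) -> int:
        return len(set(query.lower().split()) & set(passage.lower().split()))
    return sorted(CORPUS, key=overlap, reverse=True)[:k]

def generate(prompt: str) -> str:
    """Stub standing in for an LLM call."""
    return "Paris"

def rag_answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

print(rag_answer("what is the capital of France"))
```

Note that this grounding step is exactly the surface the poisoning attacks above target, which is why data curation of the corpus and RAG are listed together as countermeasures.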
XAI (Explainable AI)
- Visualizing and Understanding Convolutional Networks, 2014
- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2014
- Understanding Deep Image Representations by Inverting Them, 2014
- “Why Should I Trust You?”: Explaining the Predictions of Any Classifier, 2016
- A Unified Approach to Interpreting Model Predictions, 2017
- Learning Important Features Through Propagating Activation Differences, 2017
- Understanding Black-box Predictions via Influence Functions, 2017
- Interpretable Explanations of Black Boxes by Meaningful Perturbation, 2017
- Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), 2018
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, 2019
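Most of the methods above attribute a prediction to input features, either via gradients (saliency maps, Grad-CAM) or via perturbation (LIME, meaningful perturbation). The sketch below shows the perturbation idea in its simplest occlusion form on a toy linear model; the model, the zero baseline, and the function names are illustrative assumptions, not any specific method cited here.

```python
import numpy as np

# Occlusion-style attribution: the importance of feature i is the drop in
# the model's score when feature i is replaced by a baseline value.
# The linear model and zero baseline are illustrative assumptions.

weights = np.array([2.0, -1.0, 0.0, 0.5])

def model(x: np.ndarray) -> float:
    return float(weights @ x)

def occlusion_attribution(x: np.ndarray, baseline: float = 0.0) -> np.ndarray:
    base_score = model(x)
    attributions = np.zeros_like(x)
    for i in range(len(x)):
        occluded = x.copy()
        occluded[i] = baseline  # knock out one feature at a time
        attributions[i] = base_score - model(occluded)
    return attributions

x = np.array([1.0, 2.0, 3.0, -1.0])
print(occlusion_attribution(x))  # equals weights * x for a linear model
```

For a linear model the occlusion scores recover each feature's exact contribution; the perturbation-based papers above generalize this to nonlinear black boxes.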