This page describes the attacks and causes that lead to the negative impact "Difficulty in determining the reliability of AI output" in the information systems aspect of the AI Security Map, the defensive methods and countermeasures against them, and the relevant AI technologies, tasks, and data. It also indicates the related elements in the external influence aspect.
Attack or cause
- Hallucination
- Integrity violation
- Explainability violation
Defensive method or countermeasure
- Quantification of uncertainty
- RAG (Retrieval-Augmented Generation)
- XAI (Explainable AI)
- Detection of hallucinations
Targeted AI technology
- All AI technologies
Task
- Classification
- Generation
Data
- Image
- Graph
- Text
- Audio
Related external influence aspect
References
Hallucination
- The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”, 2023
- Why Does ChatGPT Fall Short in Providing Truthful Answers?, 2023
- DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation, 2024
- LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples, 2024
- The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models, 2024
Quantification of uncertainty
- Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding, 2015
- Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, 2016
- Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, 2017
- Predictive Uncertainty Estimation via Prior Networks, 2018
- Evidential Deep Learning to Quantify Classification Uncertainty, 2018
- Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, 2019
- Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods, 2021
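As a minimal illustration of the idea behind these references (in the spirit of "Dropout as a Bayesian Approximation", not a reproduction of any specific paper), the sketch below uses Monte Carlo dropout: dropout is kept active at inference time, several stochastic forward passes are averaged, and the spread of the predictions is read as model uncertainty. The toy model, input, and sample count are placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy classifier with dropout; any dropout-equipped network works the same way.
model = nn.Sequential(
    nn.Linear(16, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 3),
)

def mc_dropout_predict(model, x, n_samples=50):
    """Approximate the predictive distribution by sampling dropout masks."""
    model.train()  # keep dropout active at inference time (MC dropout)
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    mean = probs.mean(dim=0)                                # averaged prediction
    entropy = -(mean * (mean + 1e-12).log()).sum(dim=-1)    # predictive entropy as uncertainty
    return mean, entropy

x = torch.randn(4, 16)  # four dummy inputs
mean, uncertainty = mc_dropout_predict(model, x)
print(mean, uncertainty)
```

Higher predictive entropy on a given input suggests the model's output for that input is less reliable, which is the property these uncertainty-quantification methods aim to expose.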
RAG
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, 2020
- REALM: Retrieval-Augmented Language Model Pre-Training, 2020
- In-Context Retrieval-Augmented Language Models, 2023
- Active Retrieval Augmented Generation, 2023
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection, 2023
- Query Rewriting for Retrieval-Augmented Large Language Models, 2023
- Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering, 2023
- Generate rather than Retrieve: Large Language Models are Strong Context Generators, 2023
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy, 2023
- From Local to Global: A Graph RAG Approach to Query-Focused Summarization, 2024
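As a rough sketch of the retrieval-augmented generation pattern common to these references (not the method of any single paper): retrieve the passages most similar to the query, prepend them to the prompt, and let the language model answer grounded in the retrieved text. The embed and generate functions below are hypothetical placeholders for an embedding model and an LLM API.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: character-hash bag of features, for illustration only."""
    vec = np.zeros(64)
    for i, ch in enumerate(text.lower()):
        vec[(ord(ch) + i) % 64] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def generate(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., a chat-completion API)."""
    return f"[model answer conditioned on a prompt of {len(prompt)} characters]"

corpus = [
    "The AI Security Map relates information-system elements to external influences.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
    "Hallucination refers to fluent but factually unsupported model output.",
]
doc_vecs = np.stack([embed(d) for d in corpus])

def rag_answer(query: str, k: int = 2) -> str:
    # Rank documents by cosine similarity to the query embedding (vectors are normalized).
    sims = doc_vecs @ embed(query)
    top = np.argsort(-sims)[:k]
    context = "\n".join(corpus[i] for i in top)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return generate(prompt)

print(rag_answer("What is hallucination in LLMs?"))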
XAI (Explainable AI)
- Visualizing and Understanding Convolutional Networks, 2014
- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2014
- Understanding Deep Image Representations by Inverting Them, 2014
- “Why Should I Trust You?”: Explaining the Predictions of Any Classifier, 2016
- A Unified Approach to Interpreting Model Predictions, 2017
- Learning Important Features Through Propagating Activation Differences, 2017
- Understanding Black-box Predictions via Influence Functions, 2017
- Interpretable Explanations of Black Boxes by Meaningful Perturbation, 2017
- Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), 2018
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, 2019
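To give one concrete example of the attribution techniques listed above, the sketch below computes a plain gradient saliency map (in the spirit of "Deep Inside Convolutional Networks"): the gradient of the top class score with respect to the input pixels indicates which pixels most influence the prediction. The tiny untrained CNN and random input are placeholder assumptions.

```python
import torch
import torch.nn as nn

# Tiny stand-in CNN; in practice this would be a trained image classifier.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
model.eval()

def saliency_map(model, image):
    """Gradient of the top class score w.r.t. the input pixels."""
    image = image.clone().requires_grad_(True)
    scores = model(image)
    scores[0].max().backward()  # backpropagate from the highest class score
    # Max over channels gives a per-pixel importance map.
    return image.grad.abs().max(dim=1)[0].squeeze(0)

img = torch.randn(1, 3, 32, 32)        # dummy image
print(saliency_map(model, img).shape)  # torch.Size([32, 32])
```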
Detection of hallucinations
- Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis, 2023
- Cost-Effective Hallucination Detection for LLMs, 2024
- The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models, 2024
- Measuring and Reducing LLM Hallucination without Gold-Standard Answers, 2024
- On Large Language Models’ Hallucination with Regard to Known Facts, 2024
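As a simple, generic illustration of the sampling-and-consistency idea that several detection approaches build on (this is not the procedure of any specific paper above): sample several answers to the same question and flag the output as potentially hallucinated when the samples disagree. The generate function is a hypothetical placeholder for a sampled LLM call, and the 0.7 threshold is an arbitrary example value.

```python
import random
from collections import Counter

def generate(question: str, temperature: float = 0.8) -> str:
    """Hypothetical placeholder for a sampled (non-deterministic) LLM call."""
    return random.choice(["Paris", "Paris", "Lyon"])  # toy answers

def consistency_score(question: str, n_samples: int = 10) -> float:
    """Fraction of samples that agree with the most common answer."""
    answers = [generate(question) for _ in range(n_samples)]
    _, count = Counter(answers).most_common(1)[0]
    return count / n_samples

q = "What is the capital of France?"
score = consistency_score(q)
# Low agreement across samples is treated as a signal of possible hallucination.
print(f"consistency = {score:.2f}, flagged = {score < 0.7}")
```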