Using LLMs for Security Advisory Investigations: How Far Are We? Authors: Bayu Fedra Abdullah, Yusuf Sulistyo Nugroho, Brittany Reid, Raula Gaikovina Kula, Kazumasa Shimari, Kenichi Matsumoto | Published: 2025-06-16 | Tags: Advice Provision, Hallucination, Prompt leaking
DFIR-Metric: A Benchmark Dataset for Evaluating Large Language Models in Digital Forensics and Incident Response Authors: Bilel Cherif, Tamas Bisztray, Richard A. Dubniczky, Aaesha Aldahmani, Saeed Alshehhi, Norbert Tihanyi | Published: 2025-05-26 | Tags: Hallucination, Model Performance Evaluation, Evaluation Method
VADER: A Human-Evaluated Benchmark for Vulnerability Assessment, Detection, Explanation, and Remediation Authors: Ethan TS. Liu, Austin Wang, Spencer Mateega, Carlos Georgescu, Danny Tang | Published: 2025-05-26 | Tags: Website Vulnerability, Hallucination, Dynamic Vulnerability Management
Phare: A Safety Probe for Large Language Models Authors: Pierre Le Jeune, Benoît Malézieux, Weixuan Xiao, Matteo Dora | Published: 2025-05-16 | Updated: 2025-05-19 | Tags: RAG, Bias Mitigation Techniques, Hallucination
Cost-Effective Hallucination Detection for LLMs Authors: Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang | Published: 2024-07-31 | Updated: 2024-08-09 | Tags: Hallucination, Detection of Hallucinations, Generative Model
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation Authors: A B M Ashikur Rahman, Saeed Anwar, Muhammad Usman, Ajmal Mian | Published: 2024-06-13 | Tags: Hallucination, Model Evaluation, Bias in Training Data
On Large Language Models’ Hallucination with Regard to Known Facts Authors: Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou | Published: 2024-03-29 | Updated: 2024-10-28 | Tags: Hallucination, Detection of Hallucinations, Model Architecture
The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models Authors: Junyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen | Published: 2024-01-06 | Tags: LLM Hallucination, Hallucination, Detection of Hallucinations
LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples Authors: Jia-Yu Yao, Kun-Peng Ning, Zhen-Hui Liu, Mu-Nan Ning, Yu-Yang Liu, Li Yuan | Published: 2023-10-02 | Updated: 2024-08-04 | Tags: Hallucination, Vulnerability of Adversarial Examples, Adversarial Learning
The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A” Authors: Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans | Published: 2023-09-21 | Updated: 2024-05-26 | Tags: Hallucination, Model Evaluation, Bias in Training Data