Hallucination

DFIR-Metric: A Benchmark Dataset for Evaluating Large Language Models in Digital Forensics and Incident Response

Authors: Bilel Cherif, Tamas Bisztray, Richard A. Dubniczky, Aaesha Aldahmani, Saeed Alshehhi, Norbert Tihanyi | Published: 2025-05-26
Hallucination
Model Performance Evaluation
Evaluation Methods

VADER: A Human-Evaluated Benchmark for Vulnerability Assessment, Detection, Explanation, and Remediation

Authors: Ethan TS. Liu, Austin Wang, Spencer Mateega, Carlos Georgescu, Danny Tang | Published: 2025-05-26
Website Vulnerability
Hallucination
Dynamic Vulnerability Management

Phare: A Safety Probe for Large Language Models

Authors: Pierre Le Jeune, Benoît Malézieux, Weixuan Xiao, Matteo Dora | Published: 2025-05-16 | Updated: 2025-05-19
RAG
Bias Mitigation Methods
Hallucination

Cost-Effective Hallucination Detection for LLMs

Authors: Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang | Published: 2024-07-31 | Updated: 2024-08-09
Hallucination
Hallucination Detection
Generative Models

DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation

Authors: A B M Ashikur Rahman, Saeed Anwar, Muhammad Usman, Ajmal Mian | Published: 2024-06-13
Hallucination
Model Evaluation
Training Data Bias

On Large Language Models’ Hallucination with Regard to Known Facts

Authors: Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou | Published: 2024-03-29 | Updated: 2024-10-28
Hallucination
Hallucination Detection
Model Architecture

The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models

Authors: Junyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen | Published: 2024-01-06
LLM Hallucination
Hallucination
Hallucination Detection

LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples

Authors: Jia-Yu Yao, Kun-Peng Ning, Zhen-Hui Liu, Mu-Nan Ning, Yu-Yang Liu, Li Yuan | Published: 2023-10-02 | Updated: 2024-08-04
Hallucination
Vulnerability to Adversarial Examples
Adversarial Learning

The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”

Authors: Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans | Published: 2023-09-21 | Updated: 2024-05-26
Hallucination
Model Evaluation
Training Data Bias

Why Does ChatGPT Fall Short in Providing Truthful Answers?

Authors: Shen Zheng, Jie Huang, Kevin Chen-Chuan Chang | Published: 2023-04-20 | Updated: 2023-12-03
Hallucination
Information Extraction
Music Genre