ハルシネーション

SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From

Authors: Yao Tong, Haonan Wang, Siquan Li, Kenji Kawaguchi, Tianyang Hu | Published: 2025-09-30
トークン分布分析
ハルシネーション
モデル性能評価

Measuring Physical-World Privacy Awareness of Large Language Models: An Evaluation Benchmark

Authors: Xinjie Shen, Mufei Li, Pan Li | Published: 2025-09-27 | Updated: 2025-10-13
ハルシネーション
プライバシー保護技術
倫理的選択評価

Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLM

Authors: Alexander Panfilov, Evgenii Kortukov, Kristina Nikolić, Matthias Bethge, Sebastian Lapuschkin, Wojciech Samek, Ameya Prabhu, Maksym Andriushchenko, Jonas Geiping | Published: 2025-09-22
ハルシネーション
武器設計手法
詐欺手法

Proof-Carrying Numbers (PCN): A Protocol for Trustworthy Numeric Answers from LLMs via Claim Verification

Authors: Aivin V. Solatorio | Published: 2025-09-08
ハルシネーション
効率的証明システム
監査手法

AICrypto: A Comprehensive Benchmark for Evaluating Cryptography Capabilities of Large Language Models

Authors: Yu Wang, Yijian Liu, Liheng Ji, Han Luo, Wenjie Li, Xiaofei Zhou, Chiyun Feng, Puji Wang, Yuhan Cao, Geyuan Zhang, Xiaojian Li, Rongwu Xu, Yilei Chen, Tianxing He | Published: 2025-07-13 | Updated: 2025-09-30
アルゴリズム
ハルシネーション
プロンプトの検証

Using LLMs for Security Advisory Investigations: How Far Are We?

Authors: Bayu Fedra Abdullah, Yusuf Sulistyo Nugroho, Brittany Reid, Raula Gaikovina Kula, Kazumasa Shimari, Kenichi Matsumoto | Published: 2025-06-16
アドバイス提供
ハルシネーション
プロンプトリーキング

DFIR-Metric: A Benchmark Dataset for Evaluating Large Language Models in Digital Forensics and Incident Response

Authors: Bilel Cherif, Tamas Bisztray, Richard A. Dubniczky, Aaesha Aldahmani, Saeed Alshehhi, Norbert Tihanyi | Published: 2025-05-26
ハルシネーション
モデル性能評価
評価手法

VADER: A Human-Evaluated Benchmark for Vulnerability Assessment, Detection, Explanation, and Remediation

Authors: Ethan TS. Liu, Austin Wang, Spencer Mateega, Carlos Georgescu, Danny Tang | Published: 2025-05-26
ウェブサイト脆弱性
ハルシネーション
動的脆弱性管理

Phare: A Safety Probe for Large Language Models

Authors: Pierre Le Jeune, Benoît Malézieux, Weixuan Xiao, Matteo Dora | Published: 2025-05-16 | Updated: 2025-05-19
RAG
バイアス緩和手法
ハルシネーション

Cost-Effective Hallucination Detection for LLMs

Authors: Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang | Published: 2024-07-31 | Updated: 2024-08-09
ハルシネーション
ハルシネーションの検知
生成モデル