LLM Performance Evaluation

PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics

Authors: Derui Zhu, Dingfan Chen, Qing Li, Zongxiong Chen, Lei Ma, Jens Grossklags, Mario Fritz | Published: 2024-04-06
LLM Security
LLM Performance Evaluation
Evaluation Methods

Digital Forgetting in Large Language Models: A Survey of Unlearning Methods

Authors: Alberto Blanco-Justicia, Najeeb Jebreel, Benet Manzanares, David Sánchez, Josep Domingo-Ferrer, Guillem Collell, Kuan Eeik Tan | Published: 2024-04-02
LLM Performance Evaluation
Prompt Injection
Machine Unlearning

Enhancing Reasoning Capacity of SLM using Cognitive Enhancement

Authors: Jonathan Pan, Swee Liang Wong, Xin Wei Chia, Yidi Yuan | Published: 2024-04-01
LLM Performance Evaluation
Model Performance Evaluation
Log Analysis Methods

Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics

Authors: Shan Jia, Reilin Lyu, Kangran Zhao, Yize Chen, Zhiyuan Yan, Yan Ju, Chuanbo Hu, Xin Li, Baoyuan Wu, Siwei Lyu | Published: 2024-03-21 | Updated: 2024-06-11
LLM Performance Evaluation
Model Performance Evaluation
Watermark Evaluation

Leveraging Large Language Models to Detect npm Malicious Packages

Authors: Nusrat Zahan, Philipp Burckhardt, Mikola Lysenko, Feross Aboukhadijeh, Laurie Williams | Published: 2024-03-18 | Updated: 2025-01-06
LLM Performance Evaluation
Prompt Injection
Malware Classification

Helpful or Harmful? Exploring the Efficacy of Large Language Models for Online Grooming Prevention

Authors: Ellie Prosser, Matthew Edwards | Published: 2024-03-14
LLM Performance Evaluation
Online Safety Advice
Prompt Injection

PRSA: PRompt Stealing Attacks against Large Language Models

Authors: Yong Yang, Changjiang Li, Yi Jiang, Xi Chen, Haoyu Wang, Xuhong Zhang, Zonghui Wang, Shouling Ji | Published: 2024-02-29 | Updated: 2024-06-08
LLM Performance Evaluation
Prompt Injection
Prompt Engineering

Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction

Authors: Tong Liu, Yingjie Zhang, Zhe Zhao, Yinpeng Dong, Guozhu Meng, Kai Chen | Published: 2024-02-28 | Updated: 2024-06-10
LLM Security
LLM Performance Evaluation
Prompt Injection

TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification

Authors: Martin Gubri, Dennis Ulmer, Hwaran Lee, Sangdoo Yun, Seong Joon Oh | Published: 2024-02-20 | Updated: 2024-06-06
LLM Security
LLM Performance Evaluation
Prompt Injection

An Empirical Evaluation of LLMs for Solving Offensive Security Challenges

Authors: Minghao Shao, Boyuan Chen, Sofija Jancheska, Brendan Dolan-Gavitt, Siddharth Garg, Ramesh Karri, Muhammad Shafique | Published: 2024-02-19
LLM Performance Evaluation
Prompt Injection
CTF for Educational Purposes