LLM Performance Evaluation

CS-Eval: A Comprehensive Large Language Model Benchmark for CyberSecurity

Authors: Zhengmin Yu, Jiutian Zeng, Siyi Chen, Wenhan Xu, Dandan Xu, Xiangyu Liu, Zonghao Ying, Nan Wang, Yuan Zhang, Min Yang | Published: 2024-11-25 | Updated: 2025-01-17
LLM Performance Evaluation
Cybersecurity

PEEK: Phishing Evolution Framework for Phishing Generation and Evolving Pattern Analysis using Large Language Models

Authors: Fengchao Chen, Tingmin Wu, Van Nguyen, Shuo Wang, Alsharif Abuadbba, Carsten Rudolph | Published: 2024-11-18 | Updated: 2025-05-06
LLM Performance Evaluation
Prompt Leaking
Promoting Diversity

On Calibration of LLM-based Guard Models for Reliable Content Moderation

Authors: Hongfu Liu, Hengguan Huang, Hao Wang, Xiangming Gu, Ye Wang | Published: 2024-10-14
LLM Performance Evaluation
Content Moderation
Prompt Injection

Can LLMs be Scammed? A Baseline Measurement Study

Authors: Udari Madhushani Sehwag, Kelly Patel, Francesca Mosca, Vineeth Ravi, Jessica Staddon | Published: 2024-10-14
LLM Performance Evaluation
Prompt Injection
Evaluation Methods

Decoding Secret Memorization in Code LLMs Through Token-Level Characterization

Authors: Yuqing Nie, Chong Wang, Kailong Wang, Guoai Xu, Guosheng Xu, Haoyu Wang | Published: 2024-10-11
LLM Performance Evaluation
Privacy Protection

PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning

Authors: Tingchen Fu, Mrinank Sharma, Philip Torr, Shay B. Cohen, David Krueger, Fazl Barez | Published: 2024-10-11
LLM Performance Evaluation
Backdoor Attack
Poisoning

Detecting Training Data of Large Language Models via Expectation Maximization

Authors: Gyuwan Kim, Yang Li, Evangelia Spiliopoulou, Jie Ma, Miguel Ballesteros, William Yang Wang | Published: 2024-10-10
LLM Performance Evaluation
Membership Inference

RealVul: Can We Detect Vulnerabilities in Web Applications with LLM?

Authors: Di Cao, Yong Liao, Xiuwei Shang | Published: 2024-10-10
LLM Performance Evaluation
Vulnerability Management

Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy

Authors: Tong Wu, Shujian Zhang, Kaiqiang Song, Silei Xu, Sanqiang Zhao, Ravi Agrawal, Sathish Reddy Indurthi, Chong Xiang, Prateek Mittal, Wenxuan Zhou | Published: 2024-10-09
LLM Performance Evaluation
Prompt Injection

Signal Watermark on Large Language Models

Authors: Zhenyu Xu, Victor S. Sheng | Published: 2024-10-09
LLM Performance Evaluation
Watermarking
Watermark Evaluation