LLM Performance Evaluation

CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge

Authors: Norbert Tihanyi, Mohamed Amine Ferrag, Ridhi Jain, Tamas Bisztray, Merouane Debbah | Published: 2024-02-12 | Updated: 2024-06-03
LLM Performance Evaluation
Cybersecurity
Dataset Generation

Differentially Private Training of Mixture of Experts Models

Authors: Pierre Tholoniat, Huseyin A. Inan, Janardhan Kulkarni, Robert Sim | Published: 2024-02-11
LLM Performance Evaluation
Privacy Protection Methods
Model Performance Evaluation

In-Context Learning Can Re-learn Forbidden Tasks

Authors: Sophie Xhonneux, David Dobre, Jian Tang, Gauthier Gidel, Dhanya Sridhar | Published: 2024-02-08
Few-Shot Learning
LLM Security
LLM Performance Evaluation

Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia

Authors: Guangyu Shen, Siyuan Cheng, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Lu Yan, Zhuo Zhang, Shiqing Ma, Xiangyu Zhang | Published: 2024-02-08
LLM Security
LLM Performance Evaluation
Prompt Injection

SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models

Authors: Lijun Li, Bowen Dong, Ruohui Wang, Xuhao Hu, Wangmeng Zuo, Dahua Lin, Yu Qiao, Jing Shao | Published: 2024-02-07 | Updated: 2024-06-07
LLM Security
LLM Performance Evaluation
Prompt Injection

Ocassionally Secure: A Comparative Analysis of Code Generation Assistants

Authors: Ran Elgedawy, John Sadik, Senjuti Dutta, Anuj Gautam, Konstantinos Georgiou, Farzin Gholamrezae, Fujiao Ji, Kyungchan Lim, Qian Liu, Scott Ruoti | Published: 2024-02-01
LLM Performance Evaluation
Code Generation
Prompt Injection

LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs’ Vulnerability Reasoning

Authors: Yuqiang Sun, Daoyuan Wu, Yue Xue, Han Liu, Wei Ma, Lyuye Zhang, Yang Liu, Yingjiu Li | Published: 2024-01-29 | Updated: 2025-01-13
LLM Performance Evaluation
Prompt Injection
Vulnerability Management

Evaluation of LLM Chatbots for OSINT-based Cyber Threat Awareness

Authors: Samaneh Shafee, Alysson Bessani, Pedro M. Ferreira | Published: 2024-01-26 | Updated: 2024-04-19
LLM Performance Evaluation
Cybersecurity
Prompt Injection

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

Authors: Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li | Published: 2024-01-20
LLM Performance Evaluation
Backdoor Attack
Prompt Injection

LLM4Fuzz: Guided Fuzzing of Smart Contracts with Large Language Models

Authors: Chaofan Shou, Jing Liu, Doudou Lu, Koushik Sen | Published: 2024-01-20
LLM Performance Evaluation
Smart Contracts
Program Analysis