LLM Performance Evaluation

Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection

Authors: S M Mostaq Hossain, Amani Altarawneh, Jesse Roberts | Published: 2025-01-04
LLM Performance Evaluation
Smart Contract

CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models

Authors: Johan Wahréus, Ahmed Mohamed Hussain, Panos Papadimitratos | Published: 2025-01-02
LLM Performance Evaluation
Cybersecurity
Prompt Injection

Shifting-Merging: Secure, High-Capacity and Efficient Steganography via Large Language Models

Authors: Minhao Bai, Jinshuai Yang, Kaiyi Pang, Yongfeng Huang, Yue Gao | Published: 2025-01-01
LLM Performance Evaluation
Data Obfuscation

SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity

Authors: Pengfei Jing, Mengyun Tang, Xiaorong Shi, Xing Zheng, Sen Nie, Shi Wu, Yong Yang, Xiapu Luo | Published: 2024-12-30 | Updated: 2025-01-06
LLM Performance Evaluation
Cybersecurity
Prompt Injection

Safeguarding System Prompts for LLMs

Authors: Zhifeng Jiang, Zhihua Jin, Guoliang He | Published: 2024-12-18 | Updated: 2025-01-09
LLM Performance Evaluation
Prompt Injection
Defense Method

Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability Detection

Authors: Ira Ceka, Feitong Qiao, Anik Dey, Aastha Valecha, Gail Kaiser, Baishakhi Ray | Published: 2024-12-16 | Updated: 2025-01-18
LLM Performance Evaluation
Prompting Strategy
Prompt Injection

LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states

Authors: Luis Ibanez-Lissen, Lorena Gonzalez-Manzano, Jose Maria de Fuentes, Nicolas Anciaux, Joaquin Garcia-Alfaro | Published: 2024-11-29 | Updated: 2025-01-10
LLM Performance Evaluation
Membership Inference

CleanVul: Automatic Function-Level Vulnerability Detection in Code Commits Using LLM Heuristics

Authors: Yikun Li, Ting Zhang, Ratnadira Widyasari, Yan Naing Tun, Huu Hung Nguyen, Tan Bui, Ivana Clairine Irsan, Yiran Cheng, Xiang Lan, Han Wei Ang, Frank Liauw, Martin Weyssow, Hong Jin Kang, Eng Lieh Ouh, Lwin Khin Shar, David Lo | Published: 2024-11-26 | Updated: 2025-04-14
LLM Performance Evaluation
Code Change Analysis
Vulnerability Management

CS-Eval: A Comprehensive Large Language Model Benchmark for CyberSecurity

Authors: Zhengmin Yu, Jiutian Zeng, Siyi Chen, Wenhan Xu, Dandan Xu, Xiangyu Liu, Zonghao Ying, Nan Wang, Yuan Zhang, Min Yang | Published: 2024-11-25 | Updated: 2025-01-17
LLM Performance Evaluation
Cybersecurity

PEEK: Phishing Evolution Framework for Phishing Generation and Evolving Pattern Analysis using Large Language Models

Authors: Fengchao Chen, Tingmin Wu, Van Nguyen, Shuo Wang, Alsharif Abuadbba, Carsten Rudolph | Published: 2024-11-18 | Updated: 2025-05-06
LLM Performance Evaluation
Prompt leaking
Promotion of Diversity