LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs’ Vulnerability Reasoning | Authors: Yuqiang Sun, Daoyuan Wu, Yue Xue, Han Liu, Wei Ma, Lyuye Zhang, Yang Liu, Yingjiu Li | Published: 2024-01-29 | Updated: 2025-01-13 | Tags: LLM Performance Evaluation, Prompt Injection, Vulnerability Management | 2024.01.29 2025.05.27 | Literature Database
Evaluation of LLM Chatbots for OSINT-based Cyber Threat Awareness | Authors: Samaneh Shafee, Alysson Bessani, Pedro M. Ferreira | Published: 2024-01-26 | Updated: 2024-04-19 | Tags: LLM Performance Evaluation, Cybersecurity, Prompt Injection | 2024.01.26 2025.05.27
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models | Authors: Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li | Published: 2024-01-20 | Tags: LLM Performance Evaluation, Backdoor Attack, Prompt Injection | 2024.01.20 2025.05.27
LLM4Fuzz: Guided Fuzzing of Smart Contracts with Large Language Models | Authors: Chaofan Shou, Jing Liu, Doudou Lu, Koushik Sen | Published: 2024-01-20 | Tags: LLM Performance Evaluation, Smart Contract, Program Analysis | 2024.01.20 2025.05.27
LLbezpeky: Leveraging Large Language Models for Vulnerability Detection | Authors: Noble Saji Mathews, Yelizaveta Brus, Yousra Aafer, Meiyappan Nagappan, Shane McIntosh | Published: 2024-01-02 | Updated: 2024-02-13 | Tags: LLM Performance Evaluation, Prompt Injection, Vulnerability Management | 2024.01.02 2025.05.27
Digger: Detecting Copyright Content Mis-usage in Large Language Model Training | Authors: Haodong Li, Gelei Deng, Yi Liu, Kailong Wang, Yuekang Li, Tianwei Zhang, Yang Liu, Guoai Xu, Guosheng Xu, Haoyu Wang | Published: 2024-01-01 | Tags: LLM Performance Evaluation, Dataset Generation, Prompt Injection | 2024.01.01 2025.05.27
SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security | Authors: Zefang Liu | Published: 2023-12-26 | Tags: LLM Performance Evaluation, Cybersecurity, Prompt Injection | 2023.12.26 2025.05.27
Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models | Authors: Xin Jin, Jonathan Larson, Weiwei Yang, Zhiqiang Lin | Published: 2023-12-15 | Tags: LLM Performance Evaluation, Program Analysis, Prompt Injection | 2023.12.15 2025.05.27
LLMs Perform Poorly at Concept Extraction in Cyber-security Research Literature | Authors: Maxime Würsch, Andrei Kucharavy, Dimitri Percia David, Alain Mermoud | Published: 2023-12-12 | Tags: LLM Performance Evaluation, Data Preprocessing, Knowledge Extraction Method | 2023.12.12 2025.05.28
SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks | Authors: Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas | Published: 2023-10-05 | Updated: 2024-06-11 | Tags: LLM Performance Evaluation, Prompt Injection, Defense Method | 2023.10.05 2025.05.28