Hoist with His Own Petard: Inducing Guardrails to Facilitate Denial-of-Service Attacks on Retrieval-Augmented Generation of LLMs | Authors: Pan Suo, Yu-Ming Shang, San-Chuan Guo, Xi Zhang | Published: 2025-04-30 | Tags: LLM Performance Evaluation, Poisoning attack on RAG, Attack Type | 2025.04.30 2025.05.27 Literature Database
Case Study: Fine-tuning Small Language Models for Accurate and Private CWE Detection in Python Code | Authors: Md. Azizul Hakim Bappy, Hossen A Mustafa, Prottoy Saha, Rajinus Salehat | Published: 2025-04-23 | Tags: LLM Performance Evaluation, Training Method, Prompt leaking
aiXamine: Simplified LLM Safety and Security | Authors: Fatih Deniz, Dorde Popovic, Yazan Boshmaf, Euisuh Jeong, Minhaj Ahmad, Sanjay Chawla, Issa Khalil | Published: 2025-04-21 | Updated: 2025-04-23 | Tags: LLM Performance Evaluation, Alignment, Performance Evaluation
Watermarking Needs Input Repetition Masking | Authors: David Khachaturov, Robert Mullins, Ilia Shumailov, Sumanth Dathathri | Published: 2025-04-16 | Tags: LLM Performance Evaluation, Prompt validation, Watermark Design
The Digital Cybersecurity Expert: How Far Have We Come? | Authors: Dawei Wang, Geng Zhou, Xianglong Li, Yu Bai, Li Chen, Ting Qin, Jian Sun, Dan Li | Published: 2025-04-16 | Tags: LLM Performance Evaluation, Poisoning attack on RAG, Prompt Injection
Progent: Programmable Privilege Control for LLM Agents | Authors: Tianneng Shi, Jingxuan He, Zhun Wang, Linyu Wu, Hongwei Li, Wenbo Guo, Dawn Song | Published: 2025-04-16 | Tags: LLM Performance Evaluation, Indirect Prompt Injection, Privacy Protection Mechanism
Exploring Backdoor Attack and Defense for LLM-empowered Recommendations | Authors: Liangbo Ning, Wenqi Fan, Qing Li | Published: 2025-04-15 | Tags: LLM Performance Evaluation, Poisoning attack on RAG, Adversarial Attack Analysis
Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails | Authors: William Hackett, Lewis Birch, Stefan Trawicki, Neeraj Suri, Peter Garraghan | Published: 2025-04-15 | Updated: 2025-04-16 | Tags: LLM Performance Evaluation, Prompt Injection, Adversarial Attack Analysis
StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models | Authors: Yang Feng, Xudong Pan | Published: 2025-04-14 | Tags: LLM Performance Evaluation, Indirect Prompt Injection, Malicious Website Detection
An Investigation of Large Language Models and Their Vulnerabilities in Spam Detection | Authors: Qiyao Tang, Xiangyang Li | Published: 2025-04-14 | Tags: LLM Performance Evaluation, Prompt Injection, Model DoS