Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails | Authors: William Hackett, Lewis Birch, Stefan Trawicki, Neeraj Suri, Peter Garraghan | Published: 2025-04-15 | Updated: 2025-04-16 | Tags: LLM Performance Evaluation, Prompt Injection, Adversarial Attack Analysis
CEE: An Inference-Time Jailbreak Defense for Embodied Intelligence via Subspace Concept Rotation | Authors: Jirui Yang, Zheyu Lin, Zhihui Lu, Yinggui Wang, Lei Wang, Tao Wei, Xin Du, Shuhan Yang | Published: 2025-04-15 | Updated: 2025-07-31 | Tags: Prompt Injection, Robustness of Watermarking Techniques, Defense Effectiveness Analysis
Can LLMs Handle WebShell Detection? Overcoming Detection Challenges with Behavioral Function-Aware Framework | Authors: Feijiang Han, Jiaming Zhang, Chuyi Deng, Jianheng Tang, Yunhuai Liu | Published: 2025-04-14 | Updated: 2025-08-26 | Tags: Data Generation Method, Program Analysis, Prompt Leaking
Benchmarking Practices in LLM-driven Offensive Security: Testbeds, Metrics, and Experiment Design | Authors: Andreas Happe, Jürgen Cito | Published: 2025-04-14 | Tags: Testbed, Prompt Validation, Progress Tracking
Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models? | Authors: Yanbo Wang, Jiyang Guan, Jian Liang, Ran He | Published: 2025-04-14 | Tags: Prompt Injection, Bias in Training Data, Safety Alignment
StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models | Authors: Yang Feng, Xudong Pan | Published: 2025-04-14 | Tags: LLM Performance Evaluation, Indirect Prompt Injection, Malicious Website Detection
An Investigation of Large Language Models and Their Vulnerabilities in Spam Detection | Authors: Qiyao Tang, Xiangyang Li | Published: 2025-04-14 | Tags: LLM Performance Evaluation, Prompt Injection, Model DoS
ControlNET: A Firewall for RAG-based LLM System | Authors: Hongwei Yao, Haoran Shi, Yidou Chen, Yixin Jiang, Cong Wang, Zhan Qin | Published: 2025-04-13 | Updated: 2025-04-17 | Tags: Poisoning Attack on RAG, Indirect Prompt Injection, Data Breach Risk
CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent | Authors: Liang-bo Ning, Shijie Wang, Wenqi Fan, Qing Li, Xin Xu, Hao Chen, Feiran Huang | Published: 2025-04-13 | Updated: 2025-04-24 | Tags: Indirect Prompt Injection, Prompt Injection, Attacker Behavior Analysis
Detecting Instruction Fine-tuning Attacks on Language Models using Influence Function | Authors: Jiawei Li | Published: 2025-04-12 | Updated: 2025-09-30 | Tags: Backdoor Attack, Prompt Validation, Sentiment Analysis