Representation Bending for Large Language Model Safety | Authors: Ashkan Yousefpour, Taeheon Kim, Ryan S. Kwon, Seungbeen Lee, Wonje Jeung, Seungju Han, Alvin Wan, Harrison Ngan, Youngjae Yu, Jonghyun Choi | Published: 2025-04-02 | Prompt Injection, Prompt leaking, Safety Alignment | 2025.04.02 2025.05.27 | Literature Database
LightDefense: A Lightweight Uncertainty-Driven Defense against Jailbreaks via Shifted Token Distribution | Authors: Zhuoran Yang, Jie Peng, Zhen Tan, Tianlong Chen, Yanyong Zhang | Published: 2025-04-02 | Prompt Injection, Model Performance Evaluation, Uncertainty Measurement
No Free Lunch with Guardrails | Authors: Divyanshu Kumar, Nitin Aravind Birur, Tanay Baswa, Sahil Agarwal, Prashanth Harshangi | Published: 2025-04-01 | Updated: 2025-04-03 | Prompt Injection, Model DoS, Information Security
Output Constraints as Attack Surface: Exploiting Structured Generation to Bypass LLM Safety Mechanisms | Authors: Shuoming Zhang, Jiacheng Zhao, Ruiyuan Xu, Xiaobing Feng, Huimin Cui | Published: 2025-03-31 | LLM Security, Disabling Safety Mechanisms of LLM, Prompt Injection
Detecting Functional Bugs in Smart Contracts through LLM-Powered and Bug-Oriented Composite Analysis | Authors: Binbin Zhao, Xingshuang Lin, Yuan Tian, Saman Zonouz, Na Ruan, Jiliang Li, Raheem Beyah, Shouling Ji | Published: 2025-03-31 | Indirect Prompt Injection, Smart Contract Audit, Prompt Injection
MiZero: The Shadowy Defender Against Text Style Infringements | Authors: Ziwei Zhang, Juan Wen, Wanli Peng, Zhengxian Wu, Yinghan Zhou, Yiming Xue | Published: 2025-03-30 | Updated: 2025-05-29 | Prompt Injection, Intellectual Property Protection, Watermarking Technology
Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing | Authors: Johan Wahréus, Ahmed Hussain, Panos Papadimitratos | Published: 2025-03-27 | System Development, Prompt Injection, Large Language Model
Defeating Prompt Injections by Design | Authors: Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, Florian Tramèr | Published: 2025-03-24 | Indirect Prompt Injection, Prompt Injection
Large Language Models powered Network Attack Detection: Architecture, Opportunities and Case Study | Authors: Xinggong Zhang, Qingyang Li, Yunpeng Tan, Zongming Guo, Lei Zhang, Yong Cui | Published: 2025-03-24 | Prompt Injection, Prompt leaking, Intrusion Detection System
Knowledge Transfer from LLMs to Provenance Analysis: A Semantic-Augmented Method for APT Detection | Authors: Fei Zuo, Junghwan Rhee, Yung Ryn Choe | Published: 2025-03-24 | Cyber Threat Intelligence, Prompt Injection, Information Extraction