Prompt Injection

Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability

Authors: Shova Kuikel, Aritran Piplai, Palvi Aggarwal | Published: 2025-06-16
Alignment
Prompt Injection
Large Language Model

Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models

Authors: Arjun Krishna, Aaditya Rastogi, Erick Galinkin | Published: 2025-06-16
Prompt Injection
Large Language Model
Adversarial Attack Methods

SoK: Evaluating Jailbreak Guardrails for Large Language Models

Authors: Xunguang Wang, Zhenlan Ji, Wenxuan Wang, Zongjie Li, Daoyuan Wu, Shuai Wang | Published: 2025-06-12
Prompt Injection
Trade-Off Between Safety And Usability
Jailbreak Attack Methods

SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks

Authors: Kaiyuan Zhang, Siyuan Cheng, Hanxi Guo, Yuetian Chen, Zian Su, Shengwei An, Yuntao Du, Charles Fleming, Ashish Kundu, Xiangyu Zhang, Ninghui Li | Published: 2025-06-12
Privacy Protection Method
Prompt Injection
Prompt Leaking

ELFuzz: Efficient Input Generation via LLM-driven Synthesis Over Fuzzer Space

Authors: Chuyang Chen, Brendan Dolan-Gavitt, Zhiqiang Lin | Published: 2025-06-12
Fuzzing
Prompt Injection
Efficient Input Generation

LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge

Authors: Sahar Abdelnabi, Aideen Fay, Ahmed Salem, Egor Zverev, Kai-Chieh Liao, Chi-Huang Liu, Chun-Chih Kuo, Jannis Weigend, Danyael Manlangit, Alex Apostolov, Haris Umair, João Donato, Masayuki Kawakita, Athar Mahboob, Tran Huu Bach, Tsun-Han Chiang, Myeongjin Cho, Hajin Choi, Byeonghyeon Kim, Hyeonjin Lee, Benjamin Pannell, Conor McCauley, Mark Russinovich, Andrew Paverd, Giovanni Cherubin | Published: 2025-06-11
Indirect Prompt Injection
Prompt Injection
Defense Method

LLMs Cannot Reliably Judge (Yet?): A Comprehensive Assessment on the Robustness of LLM-as-a-Judge

Authors: Songze Li, Chuokun Xu, Jiaying Wang, Xueluan Gong, Chen Chen, Jirui Zhang, Jun Wang, Kwok-Yan Lam, Shouling Ji | Published: 2025-06-11
Disabling Safety Mechanisms of LLM
Prompt Injection
Adversarial Attack

Design Patterns for Securing LLM Agents against Prompt Injections

Authors: Luca Beurer-Kellner, Beat Buesser, Ana-Maria Creţu, Edoardo Debenedetti, Daniel Dobos, Daniel Fabian, Marc Fischer, David Froelicher, Kathrin Grosse, Daniel Naeff, Ezinwanne Ozoani, Andrew Paverd, Florian Tramèr, Václav Volhejn | Published: 2025-06-10 | Updated: 2025-06-11
Indirect Prompt Injection
Prompt Injection
Defense Method

MalGEN: A Generative Agent Framework for Modeling Malicious Software in Cybersecurity

Authors: Bikash Saha, Sandeep Kumar Shukla | Published: 2025-06-09
Cyber Threat
Prompt Injection
Malware Generation

JavelinGuard: Low-Cost Transformer Architectures for LLM Security

Authors: Yash Datta, Sharath Rajasekar | Published: 2025-06-09
Privacy Enhancing Technology
Prompt Injection
Model Architecture