Prompt Injection

Knowledge-to-Data: LLM-Driven Synthesis of Structured Network Traffic for Testbed-Free IDS Evaluation

Authors: Konstantinos E. Kampourakis, Vyron Kampourakis, Efstratios Chatzoglou, Georgios Kambourakis, Stefanos Gritzalis | Published: 2026-01-08
LLM Utilization
Prompt Injection
Intrusion Detection System

Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks

Authors: Hoagy Cunningham, Jerry Wei, Zihan Wang, Andrew Persic, Alwin Peng, Jordan Abderrachid, Raj Agarwal, Bobby Chen, Austin Cohen, Andy Dau, Alek Dimitriev, Rob Gilson, Logan Howard, Yijin Hua, Jared Kaplan, Jan Leike, Mu Lin, Christopher Liu, Vladimir Mikulik, Rohit Mittapalli, Clare O'Hara, Jin Pan, Nikhil Saxena, Alex Silverstein, Yue Song, Xunjie Yu, Giulio Zhou, Ethan Perez, Mrinank Sharma | Published: 2026-01-08
Prompt Injection
Robustness Analysis
Robustness of Deep Networks

HoneyTrap: Deceiving Large Language Model Attackers to Honeypot Traps with Resilient Multi-Agent Defense

Authors: Siyuan Li, Xi Lin, Jun Wu, Zehao Liu, Haoyu Li, Tianjie Ju, Xiang Chen, Jianhua Li | Published: 2026-01-07
Prompt Injection
Large Language Model
Adversarial Attack Detection

Jailbreaking LLMs & VLMs: Mechanisms, Evaluation, and Unified Defense

Authors: Zejian Chen, Chaozhuo Li, Chao Li, Xi Zhang, Litian Zhang, Yiming He | Published: 2026-01-07
Prompt Injection
Large Language Model
Adversarial Attack Detection

JPU: Bridging Jailbreak Defense and Unlearning via On-Policy Path Rectification

Authors: Xi Wang, Songlei Jian, Shasha Li, Xiaopeng Li, Zhaoye Li, Bin Ji, Baosheng Wang, Jie Yu | Published: 2026-01-06
Prompt Injection
Model Extraction Attack
Adversarial Attack Detection

EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion

Authors: Zhen Liang, Hai Huang, Zhengkui Chen | Published: 2025-12-29
Disabling Safety Mechanisms of LLM
LLM Utilization
Prompt Injection

Casting a SPELL: Sentence Pairing Exploration for LLM Limitation-breaking

Authors: Yifan Huang, Xiaojun Jia, Wenbo Guo, Yuqiang Sun, Yihao Huang, Chong Wang, Yang Liu | Published: 2025-12-24
Data Selection Strategy
Prompt Injection
Adversarial Attack Detection

AegisAgent: An Autonomous Defense Agent Against Prompt Injection Attacks in LLM-HARs

Authors: Yihan Wang, Huanqi Yang, Shantanu Pal, Weitao Xu | Published: 2025-12-24
Indirect Prompt Injection
Prompt Injection
Adversarial Attack Assessment

Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography

Authors: Songze Li, Jiameng Cheng, Yiming Li, Xiaojun Jia, Dacheng Tao | Published: 2025-12-23
Disabling Safety Mechanisms of LLM
Prompt Injection
Multimodal Safety

On the Effectiveness of Instruction-Tuning Local LLMs for Identifying Software Vulnerabilities

Authors: Sangryu Park, Gihyuk Ko, Homook Cho | Published: 2025-12-23
Prompt Injection
Large Language Model
Vulnerability Analysis