Disabling Safety Mechanisms of LLM

Know Thy Enemy: Securing LLMs Against Prompt Injection via Diverse Data Synthesis and Instruction-Level Chain-of-Thought Learning

Authors: Zhiyuan Chang, Mingyang Li, Yuekai Huang, Ziyou Jiang, Xiaojun Jia, Qian Xiong, Junjie Wang, Zhaoyang Li, Qing Wang | Published: 2026-01-08
Disabling Safety Mechanisms of LLM
Indirect Prompt Injection
Privacy Protection Method

Adversarial Contrastive Learning for LLM Quantization Attacks

Authors: Dinghong Song, Zhiwei Xu, Hai Wan, Xibin Zhao, Pengfei Su, Dong Li | Published: 2026-01-06
Disabling Safety Mechanisms of LLM
Model Extraction Attack
Quantization and Privacy

EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion

Authors: Zhen Liang, Hai Huang, Zhengkui Chen | Published: 2025-12-29
Disabling Safety Mechanisms of LLM
LLM Utilization
Prompt Injection

Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography

Authors: Songze Li, Jiameng Cheng, Yiming Li, Xiaojun Jia, Dacheng Tao | Published: 2025-12-23
Disabling Safety Mechanisms of LLM
Prompt Injection
Multimodal Safety

Can LLMs Make (Personalized) Access Control Decisions?

Authors: Friederike Groschupp, Daniele Lain, Aritra Dhar, Lara Magdalena Lazier, Srdjan Čapkun | Published: 2025-11-25
Disabling Safety Mechanisms of LLM
Privacy Assessment
Prompt Injection

Understanding and Mitigating Over-refusal for Large Language Models via Safety Representation

Authors: Junbo Zhang, Ran Chen, Qianli Zhou, Xinyang Deng, Wen Jiang | Published: 2025-11-24
Disabling Safety Mechanisms of LLM
Prompt Injection
Malicious Prompt

Black-Box Guardrail Reverse-engineering Attack

Authors: Hongwei Yao, Yun Xia, Shuo Shao, Haoran Shi, Tong Qiao, Cong Wang | Published: 2025-11-06
Disabling Safety Mechanisms of LLM
Prompt Leaking
Information Security

Death by a Thousand Prompts: Open Model Vulnerability Analysis

Authors: Amy Chang, Nicholas Conley, Harish Santhanalakshmi Ganesan, Adam Swanda | Published: 2025-11-05
Disabling Safety Mechanisms of LLM
Indirect Prompt Injection
Threat Modeling

Multimodal Safety Is Asymmetric: Cross-Modal Exploits Unlock Black-Box MLLMs Jailbreaks

Authors: Xinkai Wang, Beibei Li, Zerui Shao, Ao Liu, Shouling Ji | Published: 2025-10-20
Disabling Safety Mechanisms of LLM
Prompt Injection
Malicious Content Generation

Proactive defense against LLM Jailbreak

Authors: Weiliang Zhao, Jinjun Peng, Daniel Ben-Levi, Zhou Yu, Junfeng Yang | Published: 2025-10-06
Disabling Safety Mechanisms of LLM
Prompt Injection
Integration of Defense Methods