JULI: Jailbreak Large Language Models by Self-Introspection | Authors: Jesson Wang, Zhanhao Hu, David Wagner | Published: 2025-05-17 | Updated: 2025-05-20 | Tags: API Security, Disabling Safety Mechanisms of LLM, Prompt Injection | Literature Database
Dark LLMs: The Growing Threat of Unaligned AI Models | Authors: Michael Fire, Yitzhak Elbazis, Adi Wasenstein, Lior Rokach | Published: 2025-05-15 | Tags: Disabling Safety Mechanisms of LLM, Prompt Injection, Large Language Model
PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization | Authors: Yidan Wang, Yanan Cao, Yubing Ren, Fang Fang, Zheng Lin, Binxing Fang | Published: 2025-05-15 | Tags: Disabling Safety Mechanisms of LLM, Prompt Injection, Privacy Protection in Machine Learning
One Trigger Token Is Enough: A Defense Strategy for Balancing Safety and Usability in Large Language Models | Authors: Haoran Gu, Handing Wang, Yi Mei, Mengjie Zhang, Yaochu Jin | Published: 2025-05-12 | Tags: LLM Security, Disabling Safety Mechanisms of LLM, Prompt Injection
I Know What You Said: Unveiling Hardware Cache Side-Channels in Local Large Language Model Inference | Authors: Zibo Gao, Junjie Hu, Feng Guo, Yixin Zhang, Yinglong Han, Siyuan Liu, Haiyang Li, Zhiqiang Lv | Published: 2025-05-10 | Updated: 2025-05-14 | Tags: Disabling Safety Mechanisms of LLM, Prompt Leaking, Attack Detection Method
Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs | Authors: Chetan Pathade | Published: 2025-05-07 | Updated: 2025-05-13 | Tags: LLM Security, Disabling Safety Mechanisms of LLM, Prompt Injection
XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs | Authors: Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera, Vinod P | Published: 2025-04-30 | Tags: Disabling Safety Mechanisms of LLM, Prompt Injection, Explanation Method
LLM-IFT: LLM-Powered Information Flow Tracking for Secure Hardware | Authors: Nowfel Mashnoor, Mohammad Akyash, Hadi Kamali, Kimia Azar | Published: 2025-04-09 | Tags: Disabling Safety Mechanisms of LLM, Framework, Efficient Configuration Verification
Output Constraints as Attack Surface: Exploiting Structured Generation to Bypass LLM Safety Mechanisms | Authors: Shuoming Zhang, Jiacheng Zhao, Ruiyuan Xu, Xiaobing Feng, Huimin Cui | Published: 2025-03-31 | Tags: LLM Security, Disabling Safety Mechanisms of LLM, Prompt Injection
Align in Depth: Defending Jailbreak Attacks via Progressive Answer Detoxification | Authors: Yingjie Zhang, Tong Liu, Zhe Zhao, Guozhu Meng, Kai Chen | Published: 2025-03-14 | Tags: Disabling Safety Mechanisms of LLM, Prompt Injection, Malicious Prompt