Disabling Safety Mechanisms of LLM

Decoupling Reconnaissance and Exploitation: Measuring the Capability Boundaries of LLM-Based Web Penetration Testing

Authors: Liwei Yu, Shuo Li, Ming Zhou, Ge Chu, Yan Guo | Published: 2026-06-24

Agent Design

Automated Penetration Testing

2026.06.24 2026.06.26

Literature Database

Now You (Still) See Me: Detecting Evasive Steganographic Payloads in LLMs

Authors: Charles Westphal, Timothy Douglas, Keivan Navaie, Tiago Pimentel, Fernando E. Rosas | Published: 2026-06-08

Disabling Safety Mechanisms of LLM

Ethical Standards Compliance

Research Methodology

2026.06.08 2026.06.10

Literature Database

Steganography Without Modification: Hidden Communication via LLM Seeds

Authors: Felix Mächtle, Jonas Sander, Sebastian Berndt, Ben Weimar, Nils Loose, Thomas Eisenbarth | Published: 2026-06-08

Disabling Safety Mechanisms of LLM

Token Identification Method

Probability distribution

2026.06.08 2026.06.10

Literature Database

Dissecting the Black Box: Circuit-Level Analysis of LLM Vulnerability Detection

Authors: Syafiq Al Atiiq, Chun Zhou, Christian Gehrmann | Published: 2026-05-28

Disabling Safety Mechanisms of LLM

Model Architecture

Interpretation Method

2026.05.28 2026.05.30

Literature Database

SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing

Authors: Almene De Meran Meguimtsop, Maria Leonor Pacheco, Daniel E. Acuna | Published: 2026-05-28

Disabling Safety Mechanisms of LLM

Indirect Prompt Injection

Author Contribution

2026.05.28 2026.05.30

Literature Database

Cordyceps: Covert Control Attacks on LLMs via Data Poisoning

Authors: Zedian Shao, Charles Fleming, Teodora Baluta | Published: 2026-05-26

Disabling Safety Mechanisms of LLM

Robustness Evaluation

Watermark Robustness

2026.05.26 2026.05.28

Literature Database

Open-Weight LLM Fine-Tuning Defenses are Susceptible to Simple Attacks

Authors: Kevin Kuo, Chhavi Yadav, Virginia Smith | Published: 2026-05-26

Disabling Safety Mechanisms of LLM

Robustness Evaluation

Integration of Defense Methods

2026.05.26 2026.05.28

Literature Database

Model-Agnostic Lifelong LLM Safety via Externalized Attack-Defense Co-Evolution

Authors: Xiaozhe Zhang, Chaozhuo Li, Hui Liu, Shaocheng Yan, Bingyu Yan, Qiwei Ye, Haoliang Li | Published: 2026-05-13

Disabling Safety Mechanisms of LLM

Alignment

Behavior Analysis Method

2026.05.13 2026.05.15

Literature Database

Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing

Authors: Zheng Lin, Zhenxing Niu, Haoxuan Ji, Haichang Gao | Published: 2026-05-11

Disabling Safety Mechanisms of LLM

Prompt Injection

Model Robustness

2026.05.11 2026.05.13

Literature Database

Usability as a Weapon: Attacking the Safety of LLM-Based Code Generation via Usability Requirements

Authors: Yue Li, Xiao Li, Hao Wu, Yue Zhang, Yechao Zhang, Yating Liu, Fengyuan Xu, Sheng Zhong | Published: 2026-05-11

Disabling Safety Mechanisms of LLM

Security-Usability Trade-off

Attack Evaluation

2026.05.11 2026.05.13

Literature Database