SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism
Authors: Beitao Chen, Xinyu Lyu, Lianli Gao, Jingkuan Song, Heng Tao Shen | Published: 2025-07-02
Tags: Prompt Injection, Jailbreak Attack Methods, Transparency and Verification
2025.07.02 2025.07.04 Literature Database
MetaCipher: A Time-Persistent and Universal Multi-Agent Framework for Cipher-Based Jailbreak Attacks for LLMs
Authors: Boyuan Chen, Minghao Shao, Abdul Basit, Siddharth Garg, Muhammad Shafique | Published: 2025-06-27 | Updated: 2025-08-13
Tags: Framework, Large Language Model, Jailbreak Attack Methods
2025.06.27 2025.08.15
SoK: Evaluating Jailbreak Guardrails for Large Language Models
Authors: Xunguang Wang, Zhenlan Ji, Wenxuan Wang, Zongjie Li, Daoyuan Wu, Shuai Wang | Published: 2025-06-12
Tags: Prompt Injection, Trade-Off Between Safety and Usability, Jailbreak Attack Methods
2025.06.12 2025.06.14