PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks
Authors: Guobin Shen, Dongcheng Zhao, Linghao Feng, Xiang He, Jihang Wang, Sicheng Shen, Haibo Tong, Yiting Dong, Jindong Li, Xiang Zheng, Yi Zeng | Published: 2025-05-20 | Updated: 2025-05-22
Disabling Safety Mechanisms of LLM
Prompt Injection
Effectiveness Analysis of Defense Methods