PandaGuard: Systematic Evaluation of LLM Safety in the Era of Jailbreaking Attacks
Authors: Guobin Shen, Dongcheng Zhao, Linghao Feng, Xiang He, Jihang Wang, Sicheng Shen, Haibo Tong, Yiting Dong, Jindong Li, Xiang Zheng, Yi Zeng | Published: 2025-05-20
Disabling Safety Mechanisms of LLM
Prompt Injection
Effectiveness Analysis of Defense Methods