PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety
Authors: Zaibin Zhang, Yongting Zhang, Lijun Li, Hongzhi Gao, Lijun Wang, Huchuan Lu, Feng Zhao, Yu Qiao, Jing Shao | Published: 2024-01-22 | Updated: 2024-08-20
Prompt Injection
Safety Alignment
Psychological Manipulation