QGuard:Question-based Zero-shot Guard for Multi-modal LLM Safety Authors: Taegyeong Lee, Jeonghwa Yoo, Hyoungseo Cho, Soo Yong Kim, Yunho Maeng | Published: 2025-06-14 | Updated: 2025-09-30 AlignmentEthical StatementMalicious Prompt 2025.06.14 2025.10.02 Literature Database
IF-GUIDE: Influence Function-Guided Detoxification of LLMs Authors: Zachary Coalson, Juhan Bae, Nicholas Carlini, Sanghyun Hong | Published: 2025-06-02 | Updated: 2025-06-09 Text DetoxificationEthical Statement影響関数 2025.06.02 2025.06.11 Literature Database
Nuclear Deployed: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents Authors: Rongwu Xu, Xiaojian Li, Shuo Chen, Wei Xu | Published: 2025-02-17 | Updated: 2025-03-23 Indirect Prompt InjectionEthical StatementDecision-Making Dynamics 2025.02.17 2025.05.27 Literature Database