IF-GUIDE: Influence Function-Guided Detoxification of LLMs | Authors: Zachary Coalson, Juhan Bae, Nicholas Carlini, Sanghyun Hong | Published: 2025-06-02 | Updated: 2025-06-09 | Tags: Text Detoxification, Ethical Statement, Influence Function | 2025.06.02 2025.06.11 | Literature Database
You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content | Authors: Xinlei He, Savvas Zannettou, Yun Shen, Yang Zhang | Published: 2023-08-10 | Tags: Text Detoxification, Prompt Leaking, Calculation of Output Harmfulness | 2023.08.10 2025.05.28 | Literature Database