Text Detoxification

IF-GUIDE: Influence Function-Guided Detoxification of LLMs

Authors: Zachary Coalson, Juhan Bae, Nicholas Carlini, Sanghyun Hong | Published: 2025-06-02 | Updated: 2025-06-09

Text Detoxification

Ethics Statement

Influence Functions

2025.06.02

Literature Database

You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content

Authors: Xinlei He, Savvas Zannettou, Yun Shen, Yang Zhang | Published: 2023-08-10

Text Detoxification

Prompt Leaking

Output Toxicity Calculation

2023.08.10 2025.04.03

Literature Database