Law Enforcement Evasion

PRISON: Unmasking the Criminal Potential of Large Language Models

Authors: Xinyi Wu, Geng Hong, Pei Chen, Yueyue Chen, Xudong Pan, Min Yang | Published: 2025-06-19 | Updated: 2025-08-04
Disabling LLM Safety Mechanisms
Law Enforcement Evasion
Research Methodology

From Theft to Bomb-Making: The Ripple Effect of Unlearning in Defending Against Jailbreak Attacks

Authors: Zhexin Zhang, Junxiao Yang, Yida Lu, Pei Ke, Shiyao Cui, Chujie Zheng, Hongning Wang, Minlie Huang | Published: 2024-07-03 | Updated: 2025-05-20
Prompt Injection
Large Language Models
Law Enforcement Evasion