PRISON: Unmasking the Criminal Potential of Large Language Models Authors: Xinyi Wu, Geng Hong, Pei Chen, Yueyue Chen, Xudong Pan, Min Yang | Published: 2025-06-19 | Updated: 2025-08-04 LLMの安全機構の解除法執行回避研究方法論 2025.06.19 文献データベース
From Theft to Bomb-Making: The Ripple Effect of Unlearning in Defending Against Jailbreak Attacks Authors: Zhexin Zhang, Junxiao Yang, Yida Lu, Pei Ke, Shiyao Cui, Chujie Zheng, Hongning Wang, Minlie Huang | Published: 2024-07-03 | Updated: 2025-05-20 プロンプトインジェクション大規模言語モデル法執行回避 2024.07.03 文献データベース