倫理基準遵守

Enabling Regulatory Multi-Agent Collaboration: Architecture, Challenges, and Solutions

Authors: Qinnan Hu, Yuntao Wang, Yuan Gao, Zhou Su, Linkang Du | Published: 2025-09-11
AIシステムの関係性
倫理基準遵守
異常検知手法

Confusion is the Final Barrier: Rethinking Jailbreak Evaluation and Investigating the Real Misuse Threat of LLMs

Authors: Yu Yan, Sheng Sun, Zhe Wang, Yijun Lin, Zenghao Duan, zhifei zheng, Min Liu, Zhiyi yin, Jianping Zhang | Published: 2025-08-22 | Updated: 2025-09-15
プライバシー評価
倫理基準遵守
大規模言語モデル

Rethinking Exact Unlearning under Exposure: Extracting Forgotten Data under Exact Unlearning in Large Language Model

Authors: Xiaoyu Wu, Yifei Pang, Terrance Liu, Zhiwei Steven Wu | Published: 2025-05-30 | Updated: 2025-10-06
プライバシー保護機械学習
プライバシー損失分析
倫理基準遵守

Adversarial Suffix Filtering: a Defense Pipeline for LLMs

Authors: David Khachaturov, Robert Mullins | Published: 2025-05-14
プロンプトの検証
倫理基準遵守
攻撃検出手法