防御手法の統合

Proactive defense against LLM Jailbreak

Authors: Weiliang Zhao, Jinjun Peng, Daniel Ben-Levi, Zhou Yu, Junfeng Yang | Published: 2025-10-06
LLMの安全機構の解除
プロンプトインジェクション
防御手法の統合

Unified Threat Detection and Mitigation Framework (UTDMF): Combating Prompt Injection, Deception, and Bias in Enterprise-Scale Transformers

Authors: Santhosh KumarRavindran | Published: 2025-10-06
インダイレクトプロンプトインジェクション
バイアス緩和手法
防御手法の統合

P2P: A Poison-to-Poison Remedy for Reliable Backdoor Defense in LLMs

Authors: Shuai Zhao, Xinyi Wu, Shiqian Zhao, Xiaobao Wu, Zhongliang Guo, Yanhao Jia, Anh Tuan Luu | Published: 2025-10-06
プロンプトインジェクション
プロンプトの検証
防御手法の統合

UpSafe$^\circ$C: Upcycling for Controllable Safety in Large Language Models

Authors: Yuhao Sun, Zhuoer Xu, Shiwen Cui, Kun Yang, Lingyun Yu, Yongdong Zhang, Hongtao Xie | Published: 2025-10-02
AIシステムの関係性
学習の改善
防御手法の統合

A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives

Authors: Kaixiang Zhao, Lincan Li, Kaize Ding, Neil Zhenqiang Gong, Yue Zhao, Yushun Dong | Published: 2025-08-20 | Updated: 2025-08-27
モデル抽出攻撃
知的財産保護
防御手法の統合

Combining Machine Learning Defenses without Conflicts

Authors: Vasisht Duddu, Rui Zhang, N. Asokan | Published: 2024-11-14 | Updated: 2025-08-14
モデルの頑健性保証
透かし評価
防御手法の統合