文献データベース

Mark Your LLM: Detecting the Misuse of Open-Source Large Language Models via Watermarking

Authors: Yijie Xu, Aiwei Liu, Xuming Hu, Lijie Wen, Hui Xiong | Published: 2025-03-06 | Updated: 2025-03-15
生成AI向け電子透かし
生成モデル
透かし除去技術

Benchmarking LLMs and LLM-based Agents in Practical Vulnerability Detection for Code Repositories

Authors: Alperen Yildiz, Sin G. Teo, Yiling Lou, Yebo Feng, Chong Wang, Dinil M. Divakaran | Published: 2025-03-05 | Updated: 2025-03-18
インダイレクトプロンプトインジェクション
深層学習
脆弱性検出

SpinML: Customized Synthetic Data Generation for Private Training of Specialized ML Models

Authors: Jiang Zhang, Rohan Xavier Sequeira, Konstantinos Psounis | Published: 2025-03-05 | Updated: 2025-04-07
プライバシー保護
モデル性能評価
差分プライバシー

Mind the Gap: Detecting Black-box Adversarial Attacks in the Making through Query Update Analysis

Authors: Jeonghwan Park, Niall McLaughlin, Ihsen Alouani | Published: 2025-03-04 | Updated: 2025-03-16
攻撃手法
敵対的サンプルの検知
深層学習

TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions

Authors: Wang YuHang, Junkang Guo, Aolei Liu, Kaihao Wang, Zaitong Wu, Zhenyu Liu, Wenfei Yin, Jian Liu | Published: 2025-03-02 | Updated: 2025-03-21
ロバスト性
敵対的学習
敵対的訓練

Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems

Authors: Song Xia, Yi Yu, Wenhan Yang, Meiwen Ding, Zhuo Chen, Ling-Yu Duan, Alex C. Kot, Xudong Jiang | Published: 2025-03-01 | Updated: 2025-04-03
プライバシー保護
モデルの頑健性保証
モデル性能評価

Cyber Defense Reinvented: Large Language Models as Threat Intelligence Copilots

Authors: Xiaoqun Liu, Jiacheng Liang, Qiben Yan, Jiyong Jang, Sicheng Mao, Muchao Ye, Jinyuan Jia, Zhaohan Xi | Published: 2025-02-28 | Updated: 2025-04-16
サイバー脅威インテリジェンス
プロンプトリーキング
モデル抽出攻撃

Models That Are Interpretable But Not Transparent

Authors: Chudi Zhong, Panyu Chen, Cynthia Rudin | Published: 2025-02-26
モデル情報を秘匿しつつ、説明性を提供する手法
情報セキュリティ
攻撃者の行動分析

Can Indirect Prompt Injection Attacks Be Detected and Removed?

Authors: Yulin Chen, Haoran Li, Yuan Sui, Yufei He, Yue Liu, Yangqiu Song, Bryan Hooi | Published: 2025-02-23
プロンプトの検証
悪意のあるプロンプト
攻撃手法

Nuclear Deployed: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents

Authors: Rongwu Xu, Xiaojian Li, Shuo Chen, Wei Xu | Published: 2025-02-17 | Updated: 2025-03-23
インダイレクトプロンプトインジェクション
倫理声明
意思決定ダイナミクス