Literature Database

Cyber Defense Reinvented: Large Language Models as Threat Intelligence Copilots

Authors: Xiaoqun Liu, Jiacheng Liang, Qiben Yan, Jiyong Jang, Sicheng Mao, Muchao Ye, Jinyuan Jia, Zhaohan Xi | Published: 2025-02-28 | Updated: 2025-04-16

Cyber threat intelligence

Prompt leaking

Model extraction attack

2025.02.28

Models That Are Interpretable But Not Transparent

Authors: Chudi Zhong, Panyu Chen, Cynthia Rudin | Published: 2025-02-26

Methods that provide explainability while keeping model information confidential

Information security

Attacker behavior analysis

2025.02.26 2025.04.03

Beyond Surface-Level Patterns: An Essence-Driven Defense Framework Against Jailbreak Attacks in LLMs

Authors: Shiyu Xiang, Ansen Zhang, Yanfei Cao, Yang Fan, Ronghao Chen | Published: 2025-02-26 | Updated: 2025-05-28

LLM security

Prompt injection

Attack evaluation

2025.02.26

Evaluating Membership Inference Attacks in heterogeneous-data setups

Authors: Bram van Dartel, Marc Damie, Florian Hahn | Published: 2025-02-26 | Updated: 2025-04-28

Dataset generation

Privacy protection

Attack types

2025.02.26

Detecting Benchmark Contamination Through Watermarking

Authors: Tom Sander, Pierre Fernandez, Saeed Mahloujifar, Alain Durmus, Chuan Guo | Published: 2025-02-24 | Updated: 2025-07-21

Watermarking

Data contamination detection

Performance evaluation

2025.02.24

GuidedBench: Measuring and Mitigating the Evaluation Discrepancies of In-the-wild LLM Jailbreak Methods

Authors: Ruixuan Huang, Xunguang Wang, Zongjie Li, Daoyuan Wu, Shuai Wang | Published: 2025-02-24 | Updated: 2025-07-09

Prompt injection

Jailbreak methods

Evaluation methods

2025.02.24

Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System

Authors: Saikat Barua, Mostafizur Rahman, Md Jafor Sadek, Rafiul Islam, Shehenaz Khaled, Ahmedul Kabir | Published: 2025-02-23 | Updated: 2025-06-12

Prompt injection

Multi-agent system evaluation

Adversarial attack evaluation

2025.02.23

Can Indirect Prompt Injection Attacks Be Detected and Removed?

Authors: Yulin Chen, Haoran Li, Yuan Sui, Yufei He, Yue Liu, Yangqiu Song, Bryan Hooi | Published: 2025-02-23

Prompt validation

Malicious prompts

Attack methods

2025.02.23 2025.04.03

Robustness and Cybersecurity in the EU Artificial Intelligence Act

Authors: Henrik Nolte, Miriam Rateike, Michèle Finck | Published: 2025-02-22 | Updated: 2025-05-28

Fairness learning

Robust explainability

Importance of regulation

2025.02.22

Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents

Authors: Ivoline Ngong, Swanand Kadhe, Hao Wang, Keerthiram Murugesan, Justin D. Weisz, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy | Published: 2025-02-22 | Updated: 2025-07-28

Privacy risk management

Prompt leaking

Watermark evaluation

2025.02.22
