Attack Methods | Page 2 | AI Security Portal

Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks

Authors: Yixin Cheng, Hongcheng Guo, Yangming Li, Leonid Sigal | Published: 2025-05-08

Prompt Leaking

Attack Methods

Watermarking Techniques

2025.05.08 2025.05.27

Literature Database

ReCIT: Reconstructing Full Private Data from Gradient in Parameter-Efficient Fine-Tuning of Large Language Models

Authors: Jin Xie, Ruishi He, Songze Li, Xiaojun Jia, Shouling Ji | Published: 2025-04-29

Backdoor Model Detection

Privacy Violation

Attack Methods

2025.04.29

Literature Database

Token-Efficient Prompt Injection Attack: Provoking Cessation in LLM Reasoning via Adaptive Token Compression

Authors: Yu Cui, Yujun Cai, Yiwei Wang | Published: 2025-04-29

Token Compression Framework

Prompt Injection

Attack Methods

2025.04.29

Literature Database

Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction

Authors: Yulin Chen, Haoran Li, Yuan Sui, Yue Liu, Yufei He, Yangqiu Song, Bryan Hooi | Published: 2025-04-29

Indirect Prompt Injection

Prompt Validation

Attack Methods

2025.04.29

Literature Database

Enhancing Leakage Attacks on Searchable Symmetric Encryption Using LLM-Based Synthetic Data Generation

Authors: Joshua Chiu, Partha Protim Paul, Zahin Wahab | Published: 2025-04-29

Indirect Prompt Injection

Attack Methods

Hierarchical Clustering

2025.04.29

Literature Database

The Automation Advantage in AI Red Teaming

Authors: Rob Mulla, Will Pearce, Nick Landers, Brian Greunke, Brad Palm, Vincent Abruzzo, Ads Dawson | Published: 2025-04-28

Prompt Leaking

Attack Methods

Effects of Automation

2025.04.28

Literature Database

BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts

Authors: Qingyue Wang, Qi Pang, Xixun Lin, Shuai Wang, Daoyuan Wu | Published: 2025-04-24 | Updated: 2025-04-29

Poisoning Attacks on RAG

Backdoor Attack Methods

Attack Methods

2025.04.24

Literature Database

NVBleed: Covert and Side-Channel Attacks on NVIDIA Multi-GPU Interconnect

Authors: Yicheng Zhang, Ravan Nazaraliyev, Sankha Baran Dutta, Andres Marquez, Kevin Barker, Nael Abu-Ghazaleh | Published: 2025-03-22

Cloud Computing

Side-Channel Attacks

Attack Methods

2025.03.22 2025.04.03

Literature Database

Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings

Authors: Zonghao Ying, Guangyi Zheng, Yongxin Huang, Deyue Zhang, Wenxin Zhang, Quanchen Zou, Aishan Liu, Xianglong Liu, Dacheng Tao | Published: 2025-03-19

Prompt Injection

Large Language Models

Attack Methods

2025.03.19 2025.04.03

Literature Database