Attack Methods

Anomaly-Flow: A Multi-domain Federated Generative Adversarial Network for Distributed Denial-of-Service Detection

Authors: Leonardo Henrique de Melo, Gustavo de Carvalho Bertoli, Michele Nogueira, Aldri Luiz dos Santos, Lourenço Alves Pereira Junior | Published: 2025-03-18
Cyber Threats
Data Generation Methods
Attack Methods

MirrorGuard: Adaptive Defense Against Jailbreaks via Entropy-Guided Mirror Crafting

Authors: Rui Pu, Chaozhuo Li, Rui Ha, Litian Zhang, Lirong Qiu, Xi Zhang | Published: 2025-03-17
Prompt Injection
Large Language Models
Attack Methods

Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents

Authors: Juhee Kim, Woohyuk Choi, Byoungyoung Lee | Published: 2025-03-17
Indirect Prompt Injection
Data Flow Analysis
Attack Methods

BLIA: Detect model memorization in binary classification model through passive Label Inference attack

Authors: Mohammad Wahiduzzaman Khan, Sheng Chen, Ilya Mironov, Leizhen Zhang, Rabib Noor | Published: 2025-03-17
Data Curation
Differential Privacy
Attack Methods

Winning the MIDST Challenge: New Membership Inference Attacks on Diffusion Models for Tabular Data Synthesis

Authors: Xiaoyu Wu, Yifei Pang, Terrance Liu, Steven Wu | Published: 2025-03-15
Data Generation Methods
Membership Disclosure Risk
Attack Methods

Trust Under Siege: Label Spoofing Attacks against Machine Learning for Android Malware Detection

Authors: Tianwei Lan, Luca Demetrio, Farid Nait-Abdesselam, Yufei Han, Simone Aonzo | Published: 2025-03-14
Backdoor Attacks
Labels
Attack Methods

Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search

Authors: Andy Zhou | Published: 2025-03-13 | Updated: 2025-03-16
Disabling LLM Safety Mechanisms
Attack Methods
Generative Models

Mind the Gap: Detecting Black-box Adversarial Attacks in the Making through Query Update Analysis

Authors: Jeonghwan Park, Niall McLaughlin, Ihsen Alouani | Published: 2025-03-04 | Updated: 2025-03-16
Attack Methods
Adversarial Example Detection
Deep Learning

Can Indirect Prompt Injection Attacks Be Detected and Removed?

Authors: Yulin Chen, Haoran Li, Yuan Sui, Yufei He, Yue Liu, Yangqiu Song, Bryan Hooi | Published: 2025-02-23
Prompt Validation
Malicious Prompts
Attack Methods

Safety at Scale: A Comprehensive Survey of Large Model Safety

Authors: Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, Hengyuan Xu, Yunhao Chen, Yunhan Zhao, Hanxun Huang, Yige Li, Jiaming Zhang, Xiang Zheng, Yang Bai, Zuxuan Wu, Xipeng Qiu, Jingfeng Zhang, Yiming Li, Xudong Han, Haonan Li, Jun Sun, Cong Wang, Jindong Gu, Baoyuan Wu, Siheng Chen, Tianwei Zhang, Yang Liu, Mingming Gong, Tongliang Liu, Shirui Pan, Cihang Xie, Tianyu Pang, Yinpeng Dong, Ruoxi Jia, Yang Zhang, Shiqing Ma, Xiangyu Zhang, Neil Gong, Chaowei Xiao, Sarah Erfani, Tim Baldwin, Bo Li, Masashi Sugiyama, Dacheng Tao, James Bailey, Yu-Gang Jiang | Published: 2025-02-02 | Updated: 2025-03-19
Indirect Prompt Injection
Prompt Injection
Attack Methods