Adversarial Attack Methods

Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection

Authors: Ziqi Miao, Yi Ding, Lijun Li, Jing Shao | Published: 2025-07-03
Prompt Injection
Model DoS
Adversarial Attack Methods

Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models

Authors: Arjun Krishna, Aaditya Rastogi, Erick Galinkin | Published: 2025-06-16
Prompt Injection
Large Language Models
Adversarial Attack Methods

TokenBreak: Bypassing Text Classification Models Through Token Manipulation

Authors: Kasimir Schulz, Kenneth Yeung, Kieran Evans | Published: 2025-06-09
Adversarial Attack Methods
Defense Methods

Enhancing Adversarial Robustness with Conformal Prediction: A Framework for Guaranteed Model Reliability

Authors: Jie Bao, Chuangyin Dang, Rui Luo, Hanwei Zhang, Zhixin Zhou | Published: 2025-06-09
Model Robustness Guarantees
Robust Optimization
Adversarial Attack Methods

A Review of Adversarial Attacks in Computer Vision

Authors: Yutong Zhang, Yao Li, Yin Li, Zhichang Guo | Published: 2023-08-15
Poisoning
Adversarial Attack Methods
Defense Methods

Pelta: Shielding Transformers to Mitigate Evasion Attacks in Federated Learning

Authors: Simon Queyrut, Yérom-David Bromberg, Valerio Schiavoni | Published: 2023-08-08
Watermarking
Adversarial Attack Methods
Defense Methods

A reading survey on adversarial machine learning: Adversarial attacks and their understanding

Authors: Shashank Kotyan | Published: 2023-08-07
Adversarial Examples
Adversarial Attack Methods
Defense Methods

Label Inference Attacks against Node-level Vertical Federated GNNs

Authors: Marco Arazzi, Mauro Conti, Stefanos Koffas, Marina Krcek, Antonino Nocera, Stjepan Picek, Jing Xu | Published: 2023-08-04 | Updated: 2024-04-18
Poisoning
Adversarial Attack Methods
Federated Learning

Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples

Authors: Shaokui Wei, Mingda Zhang, Hongyuan Zha, Baoyuan Wu | Published: 2023-07-20
Backdoor Attacks
Adversarial Attack Methods
Watermark Evaluation

Jailbroken: How Does LLM Safety Training Fail?

Authors: Alexander Wei, Nika Haghtalab, Jacob Steinhardt | Published: 2023-07-05
Security Guarantees
Prompt Injection
Adversarial Attack Methods