Robustness Improvement Methods

DefenSee: Dissecting Threat from Sight and Text – A Multi-View Defensive Pipeline for Multi-modal Jailbreaks

Authors: Zihao Wang, Kar Wai Fok, Vrizlynn L. L. Thing | Published: 2025-12-01
Prompt Injection
Model DoS
Robustness Improvement Methods

On the Feasibility of Hijacking MLLMs’ Decision Chain via One Perturbation

Authors: Changyue Li, Jiaying Li, Youliang Yuan, Jiaming He, Zhicong Huang, Pinjia He | Published: 2025-11-25
Robustness Improvement Methods
Image Processing
Adaptive Adversarial Training

Q-MLLM: Vector Quantization for Robust Multimodal Large Language Model Security

Authors: Wei Zhao, Zhe Li, Yige Li, Jun Sun | Published: 2025-11-20
Prompt Leaking
Robustness Improvement Methods
Watermarking for Generative AI

FlowPure: Continuous Normalizing Flows for Adversarial Purification

Authors: Elias Collaert, Abel Rodríguez, Sander Joos, Lieven Desmet, Vera Rimmer | Published: 2025-05-19
Robustness Improvement Methods
Adversarial Learning
Defense Effectiveness Analysis

Addressing Neural Network Robustness with Mixup and Targeted Labeling Adversarial Training

Authors: Alfred Laugros, Alice Caplier, Matthieu Ospici | Published: 2020-08-19
Robustness Improvement Methods
Adversarial Examples
Adversarial Example Vulnerability

Provably robust deep generative models

Authors: Filipe Condessa, Zico Kolter | Published: 2020-04-22
Robustness Improvement Methods
Adversarial Attacks
Deep Learning Methods

Certifying Joint Adversarial Robustness for Model Ensembles

Authors: Mainuddin Ahmad Jonas, David Evans | Published: 2020-04-21
Model Ensemble
Robustness Improvement Methods
Adversarial Examples

Luring of transferable adversarial perturbations in the black-box paradigm

Authors: Rémi Bernhard, Pierre-Alain Moellic, Jean-Max Dutertre | Published: 2020-04-10 | Updated: 2021-03-03
Robustness Improvement Methods
Attack Evaluation
Adversarial Examples

Adversarial Robustness for Code

Authors: Pavol Bielik, Martin Vechev | Published: 2020-02-11 | Updated: 2020-08-15
Poisoning
Robustness Improvement Methods
Adversarial Training

Robustness of Bayesian Neural Networks to Gradient-Based Attacks

Authors: Ginevra Carbone, Matthew Wicker, Luca Laurenti, Andrea Patane, Luca Bortolussi, Guido Sanguinetti | Published: 2020-02-11 | Updated: 2020-06-24
Robustness Evaluation
Robustness Improvement Methods
Adversarial Attacks