攻撃手法

Can Go AIs be adversarially robust?

Authors: Tom Tseng, Euan McLean, Kellin Pelrine, Tony T. Wang, Adam Gleave | Published: 2024-06-18 | Updated: 2025-01-14
モデル性能評価
攻撃手法
透かし評価

UIFV: Data Reconstruction Attack in Vertical Federated Learning

Authors: Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao | Published: 2024-06-18 | Updated: 2025-01-14
データプライバシー評価
フレームワーク
攻撃手法

Knowledge Return Oriented Prompting (KROP)

Authors: Jason Martin, Kenneth Yeung | Published: 2024-06-11
LLMセキュリティ
プロンプトインジェクション
攻撃手法

Model for Peanuts: Hijacking ML Models without Training Access is Possible

Authors: Mahmoud Ghorbel, Halima Bouzidi, Ioan Marius Bilasco, Ihsen Alouani | Published: 2024-06-03
メンバーシップ推論
攻撃手法
顔認識システム

Constrained Adaptive Attack: Effective Adversarial Attack Against Deep Neural Networks for Tabular Data

Authors: Thibault Simonetto, Salah Ghamizi, Maxime Cordy | Published: 2024-06-02
CAPGDアルゴリズム
攻撃手法
敵対的訓練

Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior

Authors: Shuyu Cheng, Yibo Miao, Yinpeng Dong, Xiao Yang, Xiao-Shan Gao, Jun Zhu | Published: 2024-05-29
アルゴリズム
攻撃手法
最適化問題

Medical MLLM is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models

Authors: Xijie Huang, Xinyuan Wang, Hantao Zhang, Yinghao Zhu, Jiawen Xi, Jingkun An, Hao Wang, Hao Liang, Chengwei Pan | Published: 2024-05-26 | Updated: 2024-08-21
プロンプトインジェクション
医療AIの脅威
攻撃手法

Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Image Character

Authors: Siyuan Ma, Weidi Luo, Yu Wang, Xiaogeng Liu | Published: 2024-05-25 | Updated: 2024-06-12
LLMセキュリティ
プロンプトインジェクション
攻撃手法

A novel reliability attack of Physical Unclonable Functions

Authors: Gaoxiang Li, Yu Zhuang | Published: 2024-05-21 | Updated: 2024-06-07
FPGA
実験的検証
攻撃手法

GAN-GRID: A Novel Generative Attack on Smart Grid Stability Prediction

Authors: Emad Efatinasab, Alessandro Brighente, Mirco Rampazzo, Nahal Azadi, Mauro Conti | Published: 2024-05-20
モデル性能評価
攻撃の評価
攻撃手法