Attack Methods

TPIA: Towards Target-specific Prompt Injection Attack against Code-oriented Large Language Models

Authors: Yuchen Yang, Hongwei Yao, Bingrun Yang, Yiling He, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren, Chun Chen | Published: 2024-07-12 | Updated: 2025-01-16
LLM Security
Prompt Injection
Attack Methods

MALT Powers Up Adversarial Attacks

Authors: Odelia Melamed, Gilad Yehudai, Adi Shamir | Published: 2024-07-02
Mesoscopic Linearity
Attack Methods
Evaluation Methods

Can Go AIs be adversarially robust?

Authors: Tom Tseng, Euan McLean, Kellin Pelrine, Tony T. Wang, Adam Gleave | Published: 2024-06-18 | Updated: 2025-01-14
Model Performance Evaluation
Attack Methods
Watermark Evaluation

UIFV: Data Reconstruction Attack in Vertical Federated Learning

Authors: Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao | Published: 2024-06-18 | Updated: 2025-01-14
Data Privacy Assessment
Framework
Attack Methods

Knowledge Return Oriented Prompting (KROP)

Authors: Jason Martin, Kenneth Yeung | Published: 2024-06-11
LLM Security
Prompt Injection
Attack Methods

Model for Peanuts: Hijacking ML Models without Training Access is Possible

Authors: Mahmoud Ghorbel, Halima Bouzidi, Ioan Marius Bilasco, Ihsen Alouani | Published: 2024-06-03
Membership Inference
Attack Methods
Face Recognition Systems

Constrained Adaptive Attack: Effective Adversarial Attack Against Deep Neural Networks for Tabular Data

Authors: Thibault Simonetto, Salah Ghamizi, Maxime Cordy | Published: 2024-06-02
CAPGD Algorithm
Attack Methods
Adversarial Training

Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks

Authors: Chen Xiong, Xiangyu Qi, Pin-Yu Chen, Tsung-Yi Ho | Published: 2024-05-30 | Updated: 2025-06-04
DPP Set Generation
Prompt Injection
Attack Methods

Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior

Authors: Shuyu Cheng, Yibo Miao, Yinpeng Dong, Xiao Yang, Xiao-Shan Gao, Jun Zhu | Published: 2024-05-29
Algorithms
Attack Methods
Optimization Problems

Medical MLLM is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models

Authors: Xijie Huang, Xinyuan Wang, Hantao Zhang, Yinghao Zhu, Jiawen Xi, Jingkun An, Hao Wang, Hao Liang, Chengwei Pan | Published: 2024-05-26 | Updated: 2024-08-21
Prompt Injection
Threats to Medical AI
Attack Methods