Attack Methods

EnJa: Ensemble Jailbreak on Large Language Models

Authors: Jiahao Zhang, Zilong Wang, Ruofan Wang, Xingjun Ma, Yu-Gang Jiang | Published: 2024-08-07
Prompt Injection
Attack Methods
Evaluation Methods

Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services

Authors: Shaopeng Fu, Xuexue Sun, Ke Qing, Tianhang Zheng, Di Wang | Published: 2024-08-05
Privacy Protection Methods
Membership Inference
Attack Methods

Practical Attacks against Black-box Code Completion Engines

Authors: Slobodan Jenko, Jingxuan He, Niels Mündler, Mark Vero, Martin Vechev | Published: 2024-08-05
Attack Methods
Vulnerability Management
Evaluation Methods

Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents

Authors: Yulong Yang, Xinshan Yang, Shuaidong Li, Chenhao Lin, Zhengyu Zhao, Chao Shen, Tianwei Zhang | Published: 2024-07-12 | Updated: 2025-03-16
Indirect Prompt Injection
Attack Methods
Vulnerability Exploitation Methods

TPIA: Towards Target-specific Prompt Injection Attack against Code-oriented Large Language Models

Authors: Yuchen Yang, Hongwei Yao, Bingrun Yang, Yiling He, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren, Chun Chen | Published: 2024-07-12 | Updated: 2025-01-16
LLM Security
Prompt Injection
Attack Methods

MALT Powers Up Adversarial Attacks

Authors: Odelia Melamed, Gilad Yehudai, Adi Shamir | Published: 2024-07-02
Mesoscopic Linearity
Attack Methods
Evaluation Methods

Can Go AIs be adversarially robust?

Authors: Tom Tseng, Euan McLean, Kellin Pelrine, Tony T. Wang, Adam Gleave | Published: 2024-06-18 | Updated: 2025-01-14
Model Performance Evaluation
Attack Methods
Watermark Evaluation

UIFV: Data Reconstruction Attack in Vertical Federated Learning

Authors: Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao | Published: 2024-06-18 | Updated: 2025-01-14
Data Privacy Evaluation
Framework
Attack Methods

Knowledge Return Oriented Prompting (KROP)

Authors: Jason Martin, Kenneth Yeung | Published: 2024-06-11
LLM Security
Prompt Injection
Attack Methods

Model for Peanuts: Hijacking ML Models without Training Access is Possible

Authors: Mahmoud Ghorbel, Halima Bouzidi, Ioan Marius Bilasco, Ihsen Alouani | Published: 2024-06-03
Membership Inference
Attack Methods
Face Recognition Systems