Defense Methods

Random Gradient Masking as a Defensive Measure to Deep Leakage in Federated Learning

Authors: Joon Kim, Sejin Park | Published: 2024-08-15
Watermarking
Poisoning
Defense Methods

Prefix Guidance: A Steering Wheel for Large Language Models to Defend Against Jailbreak Attacks

Authors: Jiawei Zhao, Kejiang Chen, Xiaojian Yuan, Weiming Zhang | Published: 2024-08-15 | Updated: 2024-08-22
LLM Security
Prompt Injection
Defense Methods

Counter Denial of Service for Next-Generation Networks within the Artificial Intelligence and Post-Quantum Era

Authors: Saleh Darzi, Attila A. Yavuz | Published: 2024-08-08
DoS Countermeasures
Privacy Protection Methods
Defense Methods

Simple Perturbations Subvert Ethereum Phishing Transactions Detection: An Empirical Analysis

Authors: Ahod Alghureid, David Mohaisen | Published: 2024-08-06
Phishing Detection
Model Performance Evaluation
Defense Methods

Mitigating Malicious Attacks in Federated Learning via Confidence-aware Defense

Authors: Qilei Li, Ahmed M. Abdelmoniem | Published: 2024-08-05 | Updated: 2024-08-16
DoS Countermeasures
Poisoning
Defense Methods

OTAD: An Optimal Transport-Induced Robust Model for Agnostic Adversarial Attack

Authors: Kuo Gai, Sicong Wang, Shihua Zhang | Published: 2024-08-01
Adversarial Training
Optimization Problem
Defense Methods

Variational Randomized Smoothing for Sample-Wise Adversarial Robustness

Authors: Ryo Hase, Ye Wang, Toshiaki Koike-Akino, Jing Liu, Kieran Parsons | Published: 2024-07-16
Regularization
Watermark Robustness
Defense Methods

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

Authors: Edoardo Debenedetti, Javier Rando, Daniel Paleka, Silaghi Fineas Florin, Dragos Albastroiu, Niv Cohen, Yuval Lemberg, Reshmi Ghosh, Rui Wen, Ahmed Salem, Giovanni Cherubin, Santiago Zanella-Beguelin, Robin Schmid, Victor Klemm, Takahiro Miki, Chenhao Li, Stefan Kraft, Mario Fritz, Florian Tramèr, Sahar Abdelnabi, Lea Schönherr | Published: 2024-06-12
LLM Security
Prompt Injection
Defense Methods

A Study of Backdoors in Instruction Fine-tuned Language Models

Authors: Jayaram Raghuram, George Kesidis, David J. Miller | Published: 2024-06-12 | Updated: 2024-08-21
LLM Security
Backdoor Attack
Defense Methods

AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens

Authors: Lin Lu, Hai Yan, Zenghui Yuan, Jiawen Shi, Wenqi Wei, Pin-Yu Chen, Pan Zhou | Published: 2024-06-06
LLM Performance Evaluation
Prompt Injection
Defense Methods