Defense Methods

SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks

Authors: Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas | Published: 2023-10-05 | Updated: 2024-06-11
LLM Performance Evaluation
Prompt Injection
Defense Methods

Breaking On-Chip Communication Anonymity using Flow Correlation Attacks

Authors: Hansika Weerasena, Prabhat Mishra | Published: 2023-09-27 | Updated: 2024-02-01
Performance Evaluation
Flow Correlation Attacks
Defense Methods

How Robust is Google’s Bard to Adversarial Image Attacks?

Authors: Yinpeng Dong, Huanran Chen, Jiawei Chen, Zhengwei Fang, Xiao Yang, Yichi Zhang, Yu Tian, Hang Su, Jun Zhu | Published: 2023-09-21 | Updated: 2023-10-14
Adversarial Training
Defense Methods
Face Recognition

Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM

Authors: Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen | Published: 2023-09-18 | Updated: 2024-06-12
Prompt Injection
Safety Alignment
Defense Methods

DAD++: Improved Data-free Test Time Adversarial Defense

Authors: Gaurav Kumar Nayak, Inder Khatri, Shubham Randive, Ruchit Rawal, Anirban Chakraborty | Published: 2023-09-10
Adversarial Examples
Adversarial Attacks
Defense Methods

Adversarially Robust Deep Learning with Optimal-Transport-Regularized Divergences

Authors: Jeremiah Birrell, Mohammadreza Ebrahimi | Published: 2023-09-07
Malicious Demonstration Construction
Adversarial Attacks
Defense Methods

Protect Federated Learning Against Backdoor Attacks via Data-Free Trigger Generation

Authors: Yanxin Yang, Ming Hu, Yue Cao, Jun Xia, Yihao Huang, Yang Liu, Mingsong Chen | Published: 2023-08-22
Backdoor Attacks
Poisoning
Defense Methods

A Review of Adversarial Attacks in Computer Vision

Authors: Yutong Zhang, Yao Li, Yin Li, Zhichang Guo | Published: 2023-08-15
Poisoning
Adversarial Attack Methods
Defense Methods

SoK: Realistic Adversarial Attacks and Defenses for Intelligent Network Intrusion Detection

Authors: João Vitorino, Isabel Praça, Eva Maia | Published: 2023-08-13
Backdoor Attacks
Adversarial Training
Defense Methods

Pelta: Shielding Transformers to Mitigate Evasion Attacks in Federated Learning

Authors: Simon Queyrut, Yérom-David Bromberg, Valerio Schiavoni | Published: 2023-08-08
Watermarking
Adversarial Attack Methods
Defense Methods