Defense Method

Input Reconstruction Attack against Vertical Federated Large Language Models

Authors: Fei Zheng | Published: 2023-11-07 | Updated: 2023-11-24
Characteristics of VFL
Privacy Protection
Defense Method

PubDef: Defending Against Transfer Attacks From Public Models

Authors: Chawin Sitawarin, Jaewon Chang, David Huang, Wesson Altoyan, David Wagner | Published: 2023-10-26 | Updated: 2024-03-17
Adversarial Attack
Adversarial Training
Defense Method

A Cautionary Tale: On the Role of Reference Data in Empirical Privacy Defenses

Authors: Caelin G. Kaplan, Chuan Xu, Othmane Marfoq, Giovanni Neglia, Anderson Santana de Oliveira | Published: 2023-10-18
Privacy Protection Method
Privacy Method
Defense Method

Assessing Robustness via Score-Based Adversarial Image Generation

Authors: Marcel Kollovieh, Lukas Gosch, Yan Scholten, Marten Lienen, Stephan Günnemann | Published: 2023-10-06
Data Generation
Experimental Validation
Defense Method

SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks

Authors: Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas | Published: 2023-10-05 | Updated: 2024-06-11
LLM Performance Evaluation
Prompt Injection
Defense Method

Breaking On-Chip Communication Anonymity using Flow Correlation Attacks

Authors: Hansika Weerasena, Prabhat Mishra | Published: 2023-09-27 | Updated: 2024-02-01
Performance Evaluation
Flow Correlation Attack
Defense Method

How Robust is Google’s Bard to Adversarial Image Attacks?

Authors: Yinpeng Dong, Huanran Chen, Jiawei Chen, Zhengwei Fang, Xiao Yang, Yichi Zhang, Yu Tian, Hang Su, Jun Zhu | Published: 2023-09-21 | Updated: 2023-10-14
Adversarial Training
Defense Method
Face Recognition

Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM

Authors: Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen | Published: 2023-09-18 | Updated: 2024-06-12
Prompt Injection
Safety Alignment
Defense Method

DAD++: Improved Data-free Test Time Adversarial Defense

Authors: Gaurav Kumar Nayak, Inder Khatri, Shubham Randive, Ruchit Rawal, Anirban Chakraborty | Published: 2023-09-10
Adversarial Example
Adversarial Attack
Defense Method

Adversarially Robust Deep Learning with Optimal-Transport-Regularized Divergences

Authors: Jeremiah Birrell, Mohammadreza Ebrahimi | Published: 2023-09-07
Malicious Demonstration Construction
Adversarial Attack
Defense Method