Model Robustness

TrafficLLM: Enhancing Large Language Models for Network Traffic Analysis with Generic Traffic Representation

Authors: Tianyu Cui, Xinjie Lin, Sijia Li, Miao Chen, Qilei Yin, Qi Li, Ke Xu | Published: 2025-04-05 | Updated: 2025-04-15
LLM Performance Evaluation
Task-Specific Tuning
Model Robustness

Robust LLM safeguarding via refusal feature adversarial training

Authors: Lei Yu, Virginie Do, Karen Hambardzumyan, Nicola Cancedda | Published: 2024-09-30 | Updated: 2025-03-20
Prompt Injection
Model Robustness
Adversarial Learning

Stealing Part of a Production Language Model

Authors: Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, Itay Yona, Eric Wallace, David Rolnick, Florian Tramèr | Published: 2024-03-11 | Updated: 2024-07-09
Prompt Leaking
Model Robustness
Model Extraction Attack

Data Reconstruction Attacks and Defenses: A Systematic Evaluation

Authors: Sheng Liu, Zihan Wang, Yuxiao Chen, Qi Lei | Published: 2024-02-13 | Updated: 2025-03-22
Privacy Analysis
Model Robustness
Adversarial Attack

Attack of the Tails: Yes, You Really Can Backdoor Federated Learning

Authors: Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, Dimitris Papailiopoulos | Published: 2020-07-09
Poisoning
Model Robustness
Attack Methods

A Fast Saddle-Point Dynamical System Approach to Robust Deep Learning

Authors: Yasaman Esfandiari, Aditya Balu, Keivan Ebrahimi, Umesh Vaidya, Nicola Elia, Soumik Sarkar | Published: 2019-10-18 | Updated: 2021-03-01
Model Robustness
Adversarial Training
Adversarial Examples

Mapper Based Classifier

Authors: Jacek Cyranka, Alexander Georges, David Meyer | Published: 2019-10-17 | Updated: 2019-10-21
Model Robustness
Deep Learning
Generative Models

Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets

Authors: Yogesh Balaji, Tom Goldstein, Judy Hoffman | Published: 2019-10-17
Model Robustness
Adversarial Training
Adversarial Examples

A New Defense Against Adversarial Images: Turning a Weakness into a Strength

Authors: Tao Yu, Shengyuan Hu, Chuan Guo, Wei-Lun Chao, Kilian Q. Weinberger | Published: 2019-10-16 | Updated: 2019-12-04
Model Robustness
Adversarial Training
Adversarial Attack Detection

MUTE: Data-Similarity Driven Multi-hot Target Encoding for Neural Network Design

Authors: Mayoore S. Jaiswal, Bumsoo Kang, Jinho Lee, Minsik Cho | Published: 2019-10-15
Model Robustness
Deep Learning