敵対的攻撃

Fundamental Limits of Membership Inference Attacks on Machine Learning Models

Authors: Eric Aubinais, Elisabeth Gassiat, Pablo Piantanida | Published: 2023-10-20 | Updated: 2024-06-11
メンバーシップ推論
敵対的攻撃
機械学習手法

An LLM can Fool Itself: A Prompt-Based Adversarial Attack

Authors: Xilie Xu, Keyi Kong, Ning Liu, Lizhen Cui, Di Wang, Jingfeng Zhang, Mohan Kankanhalli | Published: 2023-10-20
プロンプトインジェクション
悪意のあるプロンプト
敵対的攻撃

On existence, uniqueness and scalability of adversarial robustness measures for AI classifiers

Authors: Illia Horenko | Published: 2023-10-19 | Updated: 2023-11-15
敵対的攻撃
最適化手法
機械学習手法

Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation

Authors: Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, Danqi Chen | Published: 2023-10-10
プロンプトインジェクション
攻撃の評価
敵対的攻撃

Outlier Robust Adversarial Training

Authors: Shu Hu, Zhenhuan Yang, Xin Wang, Yiming Ying, Siwei Lyu | Published: 2023-09-10
収束特性
損失項
敵対的攻撃

DAD++: Improved Data-free Test Time Adversarial Defense

Authors: Gaurav Kumar Nayak, Inder Khatri, Shubham Randive, Ruchit Rawal, Anirban Chakraborty | Published: 2023-09-10
敵対的サンプル
敵対的攻撃
防御手法

Adversarially Robust Deep Learning with Optimal-Transport-Regularized Divergences

Authors: Jeremiah Birrell, Mohammadreza Ebrahimi | Published: 2023-09-07
悪意のあるデモ構築
敵対的攻撃
防御手法

Non-Asymptotic Bounds for Adversarial Excess Risk under Misspecified Models

Authors: Changyu Liu, Yuling Jiao, Junhui Wang, Jian Huang | Published: 2023-09-02
収束特性
損失項
敵対的攻撃

The Power of MEME: Adversarial Malware Creation with Model-Based Reinforcement Learning

Authors: Maria Rigaki, Sebastian Garcia | Published: 2023-08-31
強化学習
悪意のあるデモ構築
敵対的攻撃

A Comparison of Adversarial Learning Techniques for Malware Detection

Authors: Pavla Louthánová, Matouš Kozák, Martin Jureček, Mark Stamp | Published: 2023-08-19
マルウェア検出
敵対的サンプル
敵対的攻撃