文献データベース

Prompt Stealing Attacks Against Large Language Models

Authors: Zeyang Sha, Yang Zhang | Published: 2024-02-20
LLMセキュリティ
プロンプトインジェクション
プロンプトエンジニアリング

Bounding Reconstruction Attack Success of Adversaries Without Data Priors

Authors: Alexander Ziller, Anneliese Riess, Kristian Schwethelm, Tamara T. Mueller, Daniel Rueckert, Georgios Kaissis | Published: 2024-02-20
データプライバシー評価
プライバシー保護手法
透かし評価

APT-MMF: An advanced persistent threat actor attribution method based on multimodal and multilevel feature fusion

Authors: Nan Xiao, Bo Lang, Ting Wang, Yikai Chen | Published: 2024-02-20
GNN
IoC解析手法
自動化された脅威帰属

Indiscriminate Data Poisoning Attacks on Pre-trained Feature Extractors

Authors: Yiwei Lu, Matthew Y. R. Yang, Gautam Kamath, Yaoliang Yu | Published: 2024-02-20
バックドア攻撃
ポイズニング
転移学習

An Adversarial Approach to Evaluating the Robustness of Event Identification Models

Authors: Obai Bahwal, Oliver Kosut, Lalitha Sankar | Published: 2024-02-19 | Updated: 2024-04-22
イベント識別
ロバスト性評価

Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models

Authors: Christian Schlarmann, Naman Deep Singh, Francesco Croce, Matthias Hein | Published: 2024-02-19 | Updated: 2024-06-05
プロンプトインジェクション
ロバスト性評価
敵対的訓練

CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation

Authors: Jueon Eom, Seyeon Jeong, Taekyoung Kwon | Published: 2024-02-19
ファジング
強化学習
評価手法

Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning

Authors: Shuai Zhao, Leilei Gan, Luu Anh Tuan, Jie Fu, Lingjuan Lyu, Meihuizi Jia, Jinming Wen | Published: 2024-02-19 | Updated: 2024-03-29
バックドアモデルの検知
攻撃手法
防御手法

Federated Bayesian Network Ensembles

Authors: Florian van Daalen, Lianne Ippel, Andre Dekker, Inigo Bermejo | Published: 2024-02-19
ベイズ分類
モデル設計
連合学習

Manipulating hidden-Markov-model inferences by corrupting batch data

Authors: William N. Caballero, Jose Manuel Camacho, Tahir Ekin, Roi Naveiro | Published: 2024-02-19
不確実性の定量化
攻撃の評価
攻撃手法