Defense Mechanisms

Privacy and Security Threat for OpenAI GPTs

Authors: Wei Wenying, Zhao Kaifa, Xue Lei, Fan Ming | Published: 2025-06-04
Disabling LLM safety mechanisms
Privacy issues
Defense mechanisms

SuperPure: Efficient Purification of Localized and Distributed Adversarial Patches via Super-Resolution GAN Models

Authors: Hossein Khalili, Seongbin Park, Venkat Bollapragada, Nader Sehatbakhsh | Published: 2025-05-22
Adversarial learning
Computational complexity
Defense mechanisms

Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval

Authors: Taiye Chen, Zeming Wei, Ang Li, Yisen Wang | Published: 2025-05-21
RAG
Large language models
Defense mechanisms

Alignment Under Pressure: The Case for Informed Adversaries When Evaluating LLM Defenses

Authors: Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye | Published: 2025-05-21
Alignment
Prompt injection
Defense mechanisms

Model-agnostic clean-label backdoor mitigation in cybersecurity environments

Authors: Giorgio Severi, Simona Boboila, John Holodnak, Kendra Kratkiewicz, Rauf Izmailov, Michael J. De Lucia, Alina Oprea | Published: 2024-07-11 | Updated: 2025-05-05
Backdoored model detection
Backdoor attacks
Defense mechanisms

Large Language Model Sentinel: LLM Agent for Adversarial Purification

Authors: Guang Lin, Toshihisa Tanaka, Qibin Zhao | Published: 2024-05-24 | Updated: 2025-04-23
Prompt validation
Adversarial text purification
Defense mechanisms

ModSec-AdvLearn: Countering Adversarial SQL Injections with Robust Machine Learning

Authors: Giuseppe Floris, Christian Scano, Biagio Montaruli, Luca Demetrio, Andrea Valenza, Luca Compagna, Davide Ariu, Luca Piras, Davide Balzarotti, Battista Biggio | Published: 2023-08-09 | Updated: 2025-05-21
Relationship between robustness and privacy
Adversarial example detection
Defense mechanisms

Defend Data Poisoning Attacks on Voice Authentication

Authors: Ke Li, Cameron Baird, Dan Lin | Published: 2022-09-09 | Updated: 2023-07-07
Model design
Adversarial attack detection
Defense mechanisms

Understanding Training-Data Leakage from Gradients in Neural Networks for Image Classification

Authors: Cangxiong Chen, Neill D. F. Campbell | Published: 2021-11-19
Training data extraction methods
Reconstruction attacks
Defense mechanisms

A Review of Adversarial Attack and Defense for Classification Methods

Authors: Yao Li, Minhao Cheng, Cho-Jui Hsieh, Thomas C. M. Lee | Published: 2021-11-18
Adversarial examples
Adversarial attacks
Defense mechanisms