AI Security Portal bot | Page 172 | AI Security Portal

Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows

Authors: Adrian Pekar, Richard Jozsa | Published: 2024-07-03 | Updated: 2025-06-30

Traffic classification

Intrusion detection systems

Performance evaluation metrics

2024.07.03

Literature database

From Theft to Bomb-Making: The Ripple Effect of Unlearning in Defending Against Jailbreak Attacks

Authors: Zhexin Zhang, Junxiao Yang, Yida Lu, Pei Ke, Shiyao Cui, Chujie Zheng, Hongning Wang, Minlie Huang | Published: 2024-07-03 | Updated: 2025-05-20

Prompt injection

Large language models

Law enforcement evasion

2024.07.03

Literature database

MALT Powers Up Adversarial Attacks

Authors: Odelia Melamed, Gilad Yehudai, Adi Shamir | Published: 2024-07-02

Mesoscopic linearity

Attack methods

Evaluation methods

2024.07.02 2025.04.03

Literature database

Attack-Aware Noise Calibration for Differential Privacy

Authors: Bogdan Kulynych, Juan Felipe Gomez, Georgios Kaissis, Flavio du Pin Calmon, Carmela Troncoso | Published: 2024-07-02 | Updated: 2024-11-07

Privacy protection

Privacy protection methods

Computational efficiency

2024.07.02 2025.04.03

Literature database

On Discrete Prompt Optimization for Diffusion Models

Authors: Ruochen Wang, Ting Liu, Cho-Jui Hsieh, Boqing Gong | Published: 2024-06-27

Watermarking

Prompt injection

Prompt engineering

2024.06.27 2025.04.03

Literature database

Diffusion-based Adversarial Purification for Intrusion Detection

Authors: Mohamed Amine Merzouk, Erwan Beurier, Reda Yaich, Nora Boulahia-Cuppens, Frédéric Cuppens | Published: 2024-06-25

Data preprocessing

Adversarial training

Automated intrusion detection systems

2024.06.25 2025.04.03

Literature database

Treatment of Statistical Estimation Problems in Randomized Smoothing for Adversarial Robustness

Authors: Vaclav Voracek | Published: 2024-06-25 | Updated: 2025-01-20

Trust evaluation module

Evaluation methods

Watermark evaluation

2024.06.25 2025.04.03

Literature database

The Effect of Similarity Measures on Accurate Stability Estimates for Local Surrogate Models in Text-based Explainable AI

Authors: Christopher Burger, Charles Walter, Thai Le | Published: 2024-06-22 | Updated: 2025-01-17

Adversarial examples

Evaluation methods

Similarity measurement

2024.06.22 2025.04.03

Literature database

Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning

Authors: Lynn Chua, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Daogao Liu, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang | Published: 2024-06-20 | Updated: 2024-08-16

Watermarking

Data selection strategy

Privacy protection methods

2024.06.20 2025.04.03

Literature database

Let the Noise Speak: Harnessing Noise for a Unified Defense Against Adversarial and Backdoor Attacks

Authors: Md Hasan Shahriar, Ning Wang, Naren Ramakrishnan, Y. Thomas Hou, Wenjing Lou | Published: 2024-06-18 | Updated: 2025-04-14

Model robustness guarantees

Reconstruction attacks

Adversarial attack detection

2024.06.18

Literature database