AI Security Portal bot

Are Normalizing Flows the Key to Unlocking the Exponential Mechanism?

Authors: Robert A. Bridges, Vandy J. Tombs, Christopher B. Stanley | Published: 2023-11-15 | Updated: 2024-06-11
Privacy protection
Convergence properties
Machine learning methods

Jailbreaking GPT-4V via Self-Adversarial Attacks with System Prompts

Authors: Yuanwei Wu, Xiang Li, Yixin Liu, Pan Zhou, Lichao Sun | Published: 2023-11-15 | Updated: 2024-01-20
Prompt injection
Attack methods
Face recognition

A Robust Semantics-based Watermark for Large Language Model against Paraphrasing

Authors: Jie Ren, Han Xu, Yiding Liu, Yingqian Cui, Shuaiqiang Wang, Dawei Yin, Jiliang Tang | Published: 2023-11-15 | Updated: 2024-04-01
Prompt injection
Robustness evaluation
Information hiding methods

KnowSafe: Combined Knowledge and Data Driven Hazard Mitigation in Artificial Pancreas Systems

Authors: Xugui Zhou, Maxfield Kouzel, Chloe Smith, Homa Alemzadeh | Published: 2023-11-13
CPS control models
Control action generation
Hazard prediction and mitigation

Adversarial Purification for Data-Driven Power System Event Classifiers with Diffusion Models

Authors: Yuanbin Cheng, Koji Yamashita, Jim Follum, Nanpeng Yu | Published: 2023-11-13
Adversarial text purification
Optimization problems
Defense methods

Seeing is Believing: A Federated Learning Based Prototype to Detect Wireless Injection Attacks

Authors: Aadil Hussain, Nitheesh Gundapu, Sarang Drugkar, Suraj Kiran, J. Harshan, Ranjitha Prasad | Published: 2023-11-11
Learning improvement
Deep learning methods
Defense methods

Does Differential Privacy Prevent Backdoor Attacks in Practice?

Authors: Fereshteh Razmi, Jian Lou, Li Xiong | Published: 2023-11-10
Data privacy evaluation
Trade-off analysis
Defense methods

Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service

Authors: Yuanmin Tang, Jing Yu, Keke Gai, Xiangyan Qu, Yue Hu, Gang Xiong, Qi Wu | Published: 2023-11-10
Data privacy evaluation
Membership inference
Copyright traps

RAGLog: Log Anomaly Detection using Retrieval Augmented Generation

Authors: Jonathan Pan, Swee Liang Wong, Yidi Yuan | Published: 2023-11-09
Clustering methods
Class imbalance
Log analysis challenges

DEMASQ: Unmasking the ChatGPT Wordsmith

Authors: Kavita Kumari, Alessandro Pegoraro, Hossein Fereidooni, Ahmad-Reza Sadeghi | Published: 2023-11-08
Energy-based models
Prompt injection
Evaluation methods