AI Security Portal Bot

From Sands to Mansions: Towards Automated Cyberattack Emulation with Classical Planning and Large Language Models

Authors: Lingzhi Wang, Zhenyuan Li, Yi Jiang, Zhengkai Wang, Zonghan Guo, Jiahui Wang, Yangyang Wei, Xiangmin Shen, Wei Ruan, Yan Chen | Published: 2024-07-24 | Updated: 2025-04-17
Prompt Leaking
Attack Action Model
Attack Detection Method

Theoretical Analysis of Privacy Leakage in Trustworthy Federated Learning: A Perspective from Linear Algebra and Optimization Theory

Authors: Xiaojin Zhang, Wei Chen | Published: 2024-07-23
Privacy Protection
Privacy Protection Method
Optimization Problem

Private prediction for large-scale synthetic text generation

Authors: Kareem Amin, Alex Bie, Weiwei Kong, Alexey Kurakin, Natalia Ponomareva, Umar Syed, Andreas Terzis, Sergei Vassilvitskii | Published: 2024-07-16 | Updated: 2024-10-09
Watermarking
Privacy Protection Method
Prompt Injection

Variational Randomized Smoothing for Sample-Wise Adversarial Robustness

Authors: Ryo Hase, Ye Wang, Toshiaki Koike-Akino, Jing Liu, Kieran Parsons | Published: 2024-07-16
Regularization
Watermark Robustness
Defense Method

Investigating Imperceptibility of Adversarial Attacks on Tabular Data: An Empirical Analysis

Authors: Zhipeng He, Chun Ouyang, Laith Alzubaidi, Alistair Barros, Catarina Moreira | Published: 2024-07-16 | Updated: 2024-10-04
Model Performance Evaluation
Attack Evaluation
Feature Interdependence

SLIP: Securing LLMs IP Using Weights Decomposition

Authors: Yehonathan Refael, Adam Hakim, Lev Greenberg, Tal Aviv, Satya Lokam, Ben Fishman, Shachar Seidman | Published: 2024-07-15 | Updated: 2024-08-01
LLM Security
Watermarking
Secure Communication Channel

Provable Robustness of (Graph) Neural Networks Against Data Poisoning and Backdoor Attacks

Authors: Lukas Gosch, Mahalakshmi Sabanayagam, Debarghya Ghoshdastidar, Stephan Günnemann | Published: 2024-07-15 | Updated: 2024-10-14
Backdoor Attack
Poisoning
Optimization Problem

Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents

Authors: Yulong Yang, Xinshan Yang, Shuaidong Li, Chenhao Lin, Zhengyu Zhao, Chao Shen, Tianwei Zhang | Published: 2024-07-12 | Updated: 2025-03-16
Indirect Prompt Injection
Attack Method
Vulnerability Exploitation Method

TPIA: Towards Target-specific Prompt Injection Attack against Code-oriented Large Language Models

Authors: Yuchen Yang, Hongwei Yao, Bingrun Yang, Yiling He, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren, Chun Chen | Published: 2024-07-12 | Updated: 2025-01-16
LLM Security
Prompt Injection
Attack Method

Refusing Safe Prompts for Multi-modal Large Language Models

Authors: Zedian Shao, Hongbin Liu, Yuepeng Hu, Neil Zhenqiang Gong | Published: 2024-07-12 | Updated: 2024-09-05
LLM Security
Prompt Injection
Evaluation Method