ポイズニング

MalPurifier: Enhancing Android Malware Detection with Adversarial Purification against Evasion Attacks

Authors: Yuyang Zhou, Guang Cheng, Zongyao Chen, Shui Yu | Published: 2023-12-11
ポイズニング
ロバスト性評価
敵対的攻撃

Dr. Jekyll and Mr. Hyde: Two Faces of LLMs

Authors: Matteo Gioele Collu, Tom Janssen-Groesbeek, Stefanos Koffas, Mauro Conti, Stjepan Picek | Published: 2023-12-06 | Updated: 2024-10-07
キャラクター役割演技
プロンプトインジェクション
ポイズニング

Rethinking PGD Attack: Is Sign Function Necessary?

Authors: Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang | Published: 2023-12-03 | Updated: 2024-05-21
ポイズニング
ロバスト性評価
敵対的攻撃

The Philosopher’s Stone: Trojaning Plugins of Large Language Models

Authors: Tian Dong, Minhui Xue, Guoxing Chen, Rayne Holland, Yan Meng, Shaofeng Li, Zhen Liu, Haojin Zhu | Published: 2023-12-01 | Updated: 2024-09-11
プロンプトインジェクション
ポイズニング
ポイズニング攻撃

Exploring the Robustness of Decentralized Training for Large Language Models

Authors: Lin Lu, Chenxi Dai, Wangcheng Tao, Binhang Yuan, Yanan Sun, Pan Zhou | Published: 2023-12-01
プライバシー保護手法
ポイズニング
ポイズニング攻撃

Using Decentralized Aggregation for Federated Learning with Differential Privacy

Authors: Hadeel Abd El-Kareem, Abd El-Moaty Saleh, Ana Fernández-Vilas, Manuel Fernández-Veiga, asser El-Sonbaty | Published: 2023-11-27
プライバシー保護
ポイズニング
実験的検証

Beyond Boundaries: A Comprehensive Survey of Transferable Attacks on AI Systems

Authors: Guangjing Wang, Ce Zhou, Yuanda Wang, Bocheng Chen, Hanqing Guo, Qiben Yan | Published: 2023-11-20
プロンプトインジェクション
ポイズニング
転移学習

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

Authors: Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song | Published: 2023-11-19 | Updated: 2023-11-25
テキスト生成手法
バックドア攻撃
ポイズニング

Poisoning Retrieval Corpora by Injecting Adversarial Passages

Authors: Zexuan Zhong, Ziqing Huang, Alexander Wettig, Danqi Chen | Published: 2023-10-29
RAGへのポイズニング攻撃
ポイズニング
敵対的サンプル

Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks

Authors: Xinglong Chang, Katharina Dost, Gillian Dobbie, Jörg Wicker | Published: 2023-10-24
データ生成
ポイズニング
敵対的攻撃検出