RAG

Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval

Authors: Taiye Chen, Zeming Wei, Ang Li, Yisen Wang | Published: 2025-05-21
RAG
Large Language Models
Defense Mechanisms

Silent Leaks: Implicit Knowledge Extraction Attack on RAG Systems through Benign Queries

Authors: Yuhao Wang, Wenjie Qu, Yanze Jiang, Zichen Liu, Yue Liu, Shengfang Zhai, Yinpeng Dong, Jiaheng Zhang | Published: 2025-05-21
RAG
Poisoning Attacks on RAG
Privacy Loss Analysis

Adaptive Plan-Execute Framework for Smart Contract Security Auditing

Authors: Zhiyuan Wei, Jing Sun, Zijian Zhang, Zhe Hou, Zixiao Zhao | Published: 2025-05-21
RAG
Prompt Leaking
Dynamic Analysis

Phare: A Safety Probe for Large Language Models

Authors: Pierre Le Jeune, Benoît Malézieux, Weixuan Xiao, Matteo Dora | Published: 2025-05-16 | Updated: 2025-05-19
RAG
Bias Mitigation Methods
Hallucination

AutoPentest: Enhancing Vulnerability Management With Autonomous LLM Agents

Authors: Julius Henke | Published: 2025-05-15
LLM Security
RAG
Indirect Prompt Injection

Securing RAG: A Risk Assessment and Mitigation Framework

Authors: Lukas Ammann, Sara Ott, Christoph R. Landolt, Marco P. Lehmann | Published: 2025-05-13
LLM Security
RAG
Poisoning Attacks on RAG

AutoPatch: Multi-Agent Framework for Patching Real-World CVE Vulnerabilities

Authors: Minjae Seo, Wonwoo Choi, Myoungsung You, Seungwon Shin | Published: 2025-05-07
RAG
Model DoS
Vulnerability Analysis

The Steganographic Potentials of Language Models

Authors: Artem Karpov, Tinuade Adeleke, Seong Hah Cho, Natalia Perez-Campanero | Published: 2025-05-06
RAG
Author Contributions
Watermarking

Directed Greybox Fuzzing via Large Language Model

Authors: Hanxiang Xu, Yanjie Zhao, Haoyu Wang | Published: 2025-05-06
RAG
Prompt Injection
Vulnerability Analysis

Prεεmpt: Sanitizing Sensitive Prompts for LLMs

Authors: Amrita Roy Chowdhury, David Glukhov, Divyam Anshumaan, Prasad Chalasani, Nicolas Papernot, Somesh Jha, Mihir Bellare | Published: 2025-04-07
RAG
Indirect Prompt Injection
Privacy Analysis