Indirect Prompt Injection attacks exploit the inherent inability of Large
Language Models (LLMs) to distinguish between instructions and data in their
inputs. Despite numerous defense proposals, systematic evaluation against
adaptive adversaries remains limited, even though successful attacks can have
wide-ranging security and privacy implications, and many real-world LLM-based
applications remain vulnerable. We present the results of LLMail-Inject, a
public challenge simulating a realistic scenario in which participants
adaptively attempted to inject malicious instructions into emails in order to
trigger unauthorized tool calls in an LLM-based email assistant. The challenge
spanned multiple defense strategies, LLM architectures, and retrieval
configurations, resulting in a dataset of 208,095 unique attack submissions
from 839 participants. We release the challenge code, the full dataset of
submissions, and our analysis demonstrating how this data can provide new
insights into the instruction-data separation problem. We hope these resources
will serve as a foundation for future research towards practical, structural
solutions to prompt injection.