Beyond Context: Large Language Models Failure to Grasp Users Intent


AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2512.21110

PDF

https://arxiv.org/pdf/2512.21110

Paper information

Authors: Ahmed M. Hussain, Salahuddin Salahuddin, Panos Papadimitratos
Published: 2025-12-26
Affiliation: KTH Royal Institute of Technology
Country of affiliation: Sweden
Conference:

Labels estimated by AI

Vulnerability prioritization, Indirect prompt injection, Multimodal safety

Abstract

Current Large Language Models (LLMs) safety approaches focus on explicitly harmful content while overlooking a critical vulnerability: the inability to understand context and recognize user intent. This creates exploitable vulnerabilities that malicious users can systematically leverage to circumvent safety mechanisms. We empirically evaluate multiple state-of-the-art LLMs, including ChatGPT, Claude, Gemini, and DeepSeek. Our analysis demonstrates the circumvention of reliable safety mechanisms through emotional framing, progressive revelation, and academic justification techniques. Notably, reasoning-enabled configurations amplified rather than mitigated the effectiveness of exploitation, increasing factual precision while failing to interrogate the underlying intent. The exception was Claude Opus 4.1, which prioritized intent detection over information provision in some use cases. This pattern reveals that current architectural designs create systematic vulnerabilities. These limitations require paradigmatic shifts toward contextual understanding and intent recognition as core safety capabilities rather than post-hoc protective mechanisms.