Abstract
Large Language Models (LLMs) are combined with tools to create powerful LLM
agents that provide a wide range of services. Unlike traditional software, an
LLM agent's behavior is determined at runtime by natural language prompts that
come from either the user or tool data. This flexibility enables a new computing
paradigm with unlimited capabilities and programmability, but it also introduces
new security risks: agents become vulnerable to privilege escalation attacks.
Moreover, user prompts are prone to being interpreted insecurely by LLM agents,
creating non-deterministic behaviors that attackers can exploit. To address
these security risks, we propose Prompt Flow Integrity (PFI), a system
security-oriented solution to prevent privilege escalation in LLM agents.
By analyzing the architectural characteristics of LLM agents, PFI derives three
mitigation techniques: agent isolation, secure untrusted data processing, and
privilege escalation guardrails. Our evaluation results show that PFI
effectively mitigates privilege escalation attacks while preserving the utility
of LLM agents.