Abstract
LLM-integrated app systems extend the utility of Large Language Models (LLMs)
with third-party apps that are invoked by a system LLM using interleaved
planning and execution phases to answer user queries. These systems introduce
new attack vectors: malicious apps can violate the integrity of planning or
execution, trigger availability breakdowns, or compromise privacy during
execution.
In this work, we identify new attacks impacting the integrity of planning, as
well as the integrity and availability of execution in LLM-integrated apps, and
demonstrate them against IsolateGPT, a recent solution designed to mitigate
attacks from malicious apps. We propose Abstract-Concrete-Execute (ACE), a new
secure architecture for LLM-integrated app systems that provides security
guarantees for system planning and execution. Specifically, ACE decouples
planning into two phases by first creating an abstract execution plan using
only trusted information, and then mapping the abstract plan to a concrete plan
using installed system apps. We verify that the plans generated by our system
satisfy user-specified secure information flow constraints via static analysis
on the structured plan output. During execution, ACE enforces data and
capability barriers between apps, and ensures that the execution is conducted
according to the trusted abstract plan. We show experimentally that our system
is secure against attacks from the INJECAGENT benchmark, a standard benchmark
for control flow integrity in the face of indirect prompt injection attacks,
and our newly introduced attacks. Our architecture represents a significant
advancement towards hardening LLM-based systems containing system facilities of
varying levels of trustworthiness.
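To make the two-phase design concrete, the following is a minimal sketch of ACE-style planning with a static information-flow check. All names here (`Step`, `abstract_plan`, the label/app pairs, and the `violates` helper) are illustrative assumptions, not the paper's actual API, and the check covers only direct label reads (real taint propagation through derived data is omitted):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str           # abstract capability, e.g. "read_email"
    reads: frozenset    # data labels the step consumes
    writes: frozenset   # data labels the step produces

# Phase 1: an abstract plan built only from trusted information
# (no third-party app text influences this stage).
abstract_plan = [
    Step("read_email", frozenset(), frozenset({"email"})),
    Step("summarize", frozenset({"email"}), frozenset({"summary"})),
    Step("send_message", frozenset({"summary"}), frozenset()),
]

# Phase 2: map each abstract capability onto an installed app.
installed_apps = {
    "read_email": "MailApp",
    "summarize": "SummApp",
    "send_message": "ChatApp",
}
concrete_plan = [(installed_apps[s.name], s) for s in abstract_plan]

def violates(plan, forbidden):
    """Static check over the structured plan: flag any step whose app
    directly reads a label the user has forbidden it from seeing."""
    for app, step in plan:
        for label in step.reads:
            if (label, app) in forbidden:
                return True
    return False

# User constraint: raw email contents must never reach ChatApp directly.
print(violates(concrete_plan, {("email", "ChatApp")}))   # False: only the summary flows there
print(violates(concrete_plan, {("summary", "ChatApp")})) # True: this plan would be rejected
```

In this sketch, a plan that fails the check would be rejected before execution, mirroring the abstract's point that flow constraints are verified statically on the plan rather than enforced by prompting the LLM.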