Abstract
Large language model (LLM)-based computer-use agents represent a convergence
of AI and OS capabilities, enabling natural language to control system- and
application-level functions. However, because LLM outputs are inherently uncertain,
granting agents control over computers poses significant security risks. When
agent actions deviate from user intentions, they can cause irreversible
consequences. Existing mitigation approaches, such as user confirmation and
LLM-based dynamic action validation, still suffer from limitations in
usability, security, and performance. To address these challenges, we propose
CSAgent, a system-level, static policy-based access control framework for
computer-use agents. To bridge the gap between static policies and dynamic
context and user intent, CSAgent introduces intent- and context-aware policies,
and provides an automated toolchain to assist developers in constructing and
refining them. CSAgent enforces these policies through an optimized OS service,
ensuring that agent actions can only be executed under specific user intents
and contexts. CSAgent protects agents that control computers through
diverse interfaces, including API, CLI, and GUI. We implement and evaluate
CSAgent, which successfully defends against more than 99.36% of attacks while
introducing only 6.83% performance overhead.
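
To illustrate the idea of an intent- and context-aware static policy, the following is a minimal sketch and not CSAgent's actual policy format or enforcement service. The `Policy` class, the `is_allowed` function, and all action, intent, and context names are hypothetical, chosen only to show how an action might be permitted solely under a declared user intent and a matching context, with everything else denied by default.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical static rule: an action is allowed only under given intents and context."""
    action: str                      # illustrative action name, e.g. "file.delete"
    allowed_intents: set             # user intents under which the action may run
    required_context: dict = field(default_factory=dict)  # context keys that must match


def is_allowed(policies, action, intent, context):
    """Default-deny check: some policy must match the action, intent, and context."""
    for p in policies:
        if p.action != action:
            continue
        if intent not in p.allowed_intents:
            continue
        # Every required context key must be present with the expected value.
        if all(context.get(k) == v for k, v in p.required_context.items()):
            return True
    return False


# Illustrative policy set (assumed names, not taken from the paper).
POLICIES = [
    Policy("file.delete", {"clean_downloads"}, {"dir": "~/Downloads"}),
    Policy("email.send", {"reply_to_message"}),
]
```

In this sketch, a `file.delete` request is accepted only when the declared intent is `clean_downloads` and the context reports the `~/Downloads` directory. The same action under a different intent or directory is rejected, as is any action with no policy at all.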