Abstract
Large language model (LLM)-powered agents are increasingly used in
recommender systems (RSs) to achieve personalized behavior modeling, where the
memory mechanism plays a pivotal role in enabling the agents to autonomously
explore, learn and self-evolve from real-world interactions. However, this very
mechanism, serving as a contextual repository, inherently exposes an attack
surface for potential adversarial manipulations. Despite its central role, the
robustness of agentic RSs in the face of such threats remains largely
underexplored. Previous works suffer from semantic mismatches or rely on static
embeddings or pre-defined prompts, none of which is designed for dynamic
systems, let alone the dynamic memory states of LLM agents. The challenge is
further exacerbated by the black-box nature of commercial recommenders.
To tackle these problems, we present the first systematic
investigation of memory-based vulnerabilities in LLM-powered recommender
agents, revealing their security limitations and guiding efforts to strengthen
system resilience and trustworthiness. Specifically, we propose a novel
black-box attack framework named DrunkAgent. DrunkAgent crafts semantically
meaningful adversarial textual triggers for target item promotion and
introduces a series of strategies to maximize the trigger's effect by
corrupting memory updates during interactions. The triggers and strategies are
optimized on a surrogate model, making DrunkAgent transferable and stealthy.
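To make the surrogate-based optimization concrete, the sketch below gives one plausible reading of it: a greedy black-box search that mutates a short textual trigger appended to the target item's description until a surrogate scorer's promotion score stops improving. Every name here (SurrogateScorer, optimize_trigger, the toy keyword-counting scorer) is an illustrative assumption for exposition, not DrunkAgent's actual implementation.

    import random

    class SurrogateScorer:
        """Stand-in for a white-box surrogate of the black-box recommender.
        Assumed for this sketch: a toy keyword counter scoring how strongly
        a piece of item text promotes the item, so the example runs end to end."""

        PROMO_WORDS = {"best", "top-rated", "must-have", "recommended"}

        def score(self, text: str) -> float:
            tokens = text.lower().split()
            return float(sum(tok in self.PROMO_WORDS for tok in tokens))

    def optimize_trigger(scorer: SurrogateScorer,
                         item_description: str,
                         vocab: list[str],
                         trigger_len: int = 4,
                         n_iters: int = 500,
                         seed: int = 0) -> str:
        """Greedy random search: mutate one trigger token at a time and
        keep mutations that do not decrease the surrogate's promotion score."""
        rng = random.Random(seed)
        trigger = [rng.choice(vocab) for _ in range(trigger_len)]
        best = scorer.score(item_description + " " + " ".join(trigger))
        for _ in range(n_iters):
            pos = rng.randrange(trigger_len)      # pick one position to mutate
            cand = trigger.copy()
            cand[pos] = rng.choice(vocab)
            s = scorer.score(item_description + " " + " ".join(cand))
            if s >= best:                         # keep non-worsening mutations
                trigger, best = cand, s
        return " ".join(trigger)

    if __name__ == "__main__":
        vocab = ["best", "cheap", "top-rated", "durable", "must-have", "new"]
        trigger = optimize_trigger(SurrogateScorer(), "wireless earbuds", vocab)
        # The optimized trigger would then be injected into item text so that
        # it transfers to the unseen black-box recommender agent.
        print("optimized trigger:", trigger)

In the paper's actual setting the scorer would be a surrogate LLM agent rather than a keyword counter, and the memory-corruption strategies would act on the agent's stored interaction history; the greedy search above only illustrates the transfer-via-surrogate idea.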
Extensive experiments on real-world datasets across diverse agentic RSs,
including collaborative filtering, retrieval-augmented, and sequential
recommendation systems, demonstrate the generalizability, transferability, and
stealthiness of DrunkAgent.