Abstract
Large language model (LLM)-powered agents are increasingly used in
recommender systems (RSs) to achieve personalized behavior modeling, where the
memory mechanism plays a pivotal role in enabling the agents to autonomously
explore, learn and self-evolve from real-world interactions. However, this very
mechanism, serving as a contextual repository, inherently exposes an attack
surface for potential adversarial manipulations. Despite its central role, the
robustness of agentic RSs in the face of such threats remains largely
underexplored. Previous works suffer from semantic mismatches or rely on static
embeddings or pre-defined prompts, none of which is designed for dynamic
systems, let alone the dynamic memory states of LLM agents. The challenge is
further exacerbated by the black-box nature of commercial recommenders.
To tackle these problems, we present the first systematic
investigation of memory-based vulnerabilities in LLM-powered recommender
agents, revealing their security limitations and guiding efforts to strengthen
system resilience and trustworthiness. Specifically, we propose a novel
black-box attack framework named DrunkAgent. DrunkAgent crafts semantically
meaningful adversarial textual triggers for target item promotion and
introduces a series of strategies to maximize the trigger's effect by
corrupting memory updates during interactions. The triggers and strategies are
optimized on a surrogate model, making DrunkAgent transferable and stealthy.
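To make the surrogate-based optimization concrete, the sketch below gives one plausible reading of it: a greedy black-box search that mutates a short textual trigger appended to the target item's description until a surrogate scorer's promotion score stops improving. Every name here (SurrogateScorer, optimize_trigger, the toy keyword-counting scorer) is an illustrative assumption for exposition, not DrunkAgent's actual implementation.

    import random

    class SurrogateScorer:
        """Stand-in for a white-box surrogate of the black-box recommender.
        Assumed for this sketch: a toy keyword counter scoring how strongly
        a piece of item text promotes the item, so the example runs end to end."""

        PROMO_WORDS = {"best", "top-rated", "must-have", "recommended"}

        def score(self, text: str) -> float:
            tokens = text.lower().split()
            return float(sum(tok in self.PROMO_WORDS for tok in tokens))

    def optimize_trigger(scorer: SurrogateScorer,
                         item_description: str,
                         vocab: list[str],
                         trigger_len: int = 4,
                         n_iters: int = 500,
                         seed: int = 0) -> str:
        """Greedy random search: mutate one trigger token at a time and
        keep mutations that do not decrease the surrogate's promotion score."""
        rng = random.Random(seed)
        trigger = [rng.choice(vocab) for _ in range(trigger_len)]
        best = scorer.score(item_description + " " + " ".join(trigger))
        for _ in range(n_iters):
            pos = rng.randrange(trigger_len)      # pick one position to mutate
            cand = trigger.copy()
            cand[pos] = rng.choice(vocab)
            s = scorer.score(item_description + " " + " ".join(cand))
            if s >= best:                         # keep non-worsening mutations
                trigger, best = cand, s
        return " ".join(trigger)

    if __name__ == "__main__":
        vocab = ["best", "cheap", "top-rated", "durable", "must-have", "new"]
        trigger = optimize_trigger(SurrogateScorer(), "wireless earbuds", vocab)
        # The optimized trigger would then be injected into item text so that
        # it transfers to the unseen black-box recommender agent.
        print("optimized trigger:", trigger)

In the paper's actual setting the scorer would be a surrogate LLM agent rather than a keyword counter, and the memory-corruption strategies would act on the agent's stored interaction history; the greedy search above only illustrates the transfer-via-surrogate idea.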
Extensive experiments on real-world datasets across diverse agentic RSs,
including collaborative filtering, retrieval-augmented, and sequential
recommendation systems, demonstrate the generalizability, transferability, and
stealthiness of DrunkAgent.