Abstract
Retrieval-Augmented Generation (RAG) systems enhance large language models
(LLMs) by incorporating external knowledge bases, but this may expose them to
extraction attacks, leading to potential copyright and privacy risks. However,
existing extraction methods typically rely on malicious inputs such as prompt
injection or jailbreaking, making them easy to detect at the input or output
level. In this paper, we introduce the Implicit Knowledge
Extraction Attack (IKEA), which extracts knowledge from RAG systems through
benign queries. Specifically, IKEA first leverages anchor concepts (keywords
related to the RAG's internal knowledge) to generate natural-looking queries,
and then designs two mechanisms that drive these anchor concepts to thoroughly
"explore" the RAG's knowledge: (1) Experience Reflection
Sampling, which samples anchor concepts based on past query-response histories,
ensuring their relevance to the topic; (2) Trust Region Directed Mutation,
which iteratively mutates anchor concepts under similarity constraints to
further exploit the embedding space. Extensive experiments demonstrate IKEA's
effectiveness under various defenses, surpassing baselines by over 80% in
extraction efficiency and 90% in attack success rate. Moreover, the substitute
RAG system built from IKEA's extractions shows comparable performance to the
original RAG and outperforms substitutes built from baseline extractions across
multiple evaluation tasks, underscoring the stealthy copyright infringement risk in RAG
systems.
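
To make the second mechanism more concrete, below is a minimal, hypothetical sketch of the trust-region mutation idea described above: a candidate concept embedding is kept only if its similarity to the current anchor falls inside a band (the "trust region"), and among the admissible candidates the most novel one, i.e., the one farthest from concepts already explored, is selected. All names, thresholds, and the novelty heuristic are illustrative assumptions, not the paper's actual algorithm.

import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def trust_region_mutate(anchor, candidates, explored, sim_low=0.6, sim_high=0.9):
    # Return the index of the candidate embedding that (a) lies inside the
    # similarity band [sim_low, sim_high] around the anchor and (b) is most
    # novel relative to already-explored embeddings; None if nothing qualifies.
    best_idx, best_novelty = None, -1.0
    for i, cand in enumerate(candidates):
        sim = cosine(anchor, cand)
        if not (sim_low <= sim <= sim_high):
            continue  # outside the trust region: too unrelated or too redundant
        # Novelty = distance to the nearest already-explored concept.
        novelty = min(1.0 - cosine(cand, e) for e in explored) if explored else 1.0
        if novelty > best_novelty:
            best_idx, best_novelty = i, novelty
    return best_idx

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    anchor = rng.normal(size=384)                                          # current anchor concept embedding
    candidates = [anchor + 0.8 * rng.normal(size=384) for _ in range(20)]  # mutated concept embeddings
    explored = [rng.normal(size=384) for _ in range(5)]                    # concepts already queried
    print(trust_region_mutate(anchor, candidates, explored))

In an actual attack loop, the selected candidate would become the next anchor concept, and its query-response pair would be added to the explored set before the next mutation round.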