AIセキュリティポータル K Program
PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
Share
Abstract
Large language models (LLMs) have achieved remarkable success due to their exceptional generative capabilities. Despite their success, they also have inherent limitations such as a lack of up-to-date knowledge and hallucination. Retrieval-Augmented Generation (RAG) is a state-of-the-art technique to mitigate these limitations. The key idea of RAG is to ground the answer generation of an LLM on external knowledge retrieved from a knowledge database. Existing studies mainly focus on improving the accuracy or efficiency of RAG, leaving its security largely unexplored. We aim to bridge the gap in this work. We find that the knowledge database in a RAG system introduces a new and practical attack surface. Based on this attack surface, we propose PoisonedRAG, the first knowledge corruption attack to RAG, where an attacker could inject a few malicious texts into the knowledge database of a RAG system to induce an LLM to generate an attacker-chosen target answer for an attacker-chosen target question. We formulate knowledge corruption attacks as an optimization problem, whose solution is a set of malicious texts. Depending on the background knowledge (e.g., black-box and white-box settings) of an attacker on a RAG system, we propose two solutions to solve the optimization problem, respectively. Our results show PoisonedRAG could achieve a 90% attack success rate when injecting five malicious texts for each target question into a knowledge database with millions of texts. We also evaluate several defenses and our results show they are insufficient to defend against PoisonedRAG, highlighting the need for new defenses.
Language models are few-shot learners
T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei
Published: 2020
Survey of hallucination in natural language generation
Z. Ji, N. Lee, R. Frieske, T. Yu, D. Su, Y. Xu, E. Ishii, Y. J. Bang, A. Madotto, P. Fung
Published: 2023
Transforming healthcare education: Harnessing large language models for frontline health worker capacity building using retrieval-augmented generation
Y. Al Ghadban, H. Y. Lu, U. Adavi, A. Sharma, S. Gara, N. Das, B. Kumar, R. John, P. Devarsetty, J. E. Hirst
Published: 2023
Potential for GPT technology to optimize future clinical decision-making using retrieval-augmented generation
Calvin Wang, Joshua Ong, Chara Wang, Hannah Ong, Rebekah Cheng, Dennis Ong
Published: 2024
Making llms worth every penny: Resource-limited text classification in banking
Lefteris Loukas, Ilias Stogiannidis, Odysseas Diamantopoulos, Prodromos Malakasiotis, Stavros Vassos
Published: 2023
Chain of reference prompting helps llm to think like a lawyer
Aditya Kuppa, Nikon Rasumov-Rahe, Marc Voses
Published: 2023
Mycrunchgpt: A llm assisted framework for scientific machine learning
V. Kumar, L. Gleyzer, A. Kahana, K. Shukla, G. E. Karniadakis
Published: 2023
Share