PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models | AI Security Portal

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2402.07867

PDF

https://arxiv.org/pdf/2402.07867

Bibliographic information

Authors: Wei Zou; Runpeng Geng; Binghui Wang; Jinyuan Jia
Published: 2024-2-13
Updated: 2024-8-13
Affiliation: Pennsylvania State University
Country of affiliation: United States of America
Conference: USENIX Security Symposium

Labels estimated by AI

Prompt injection

Poisoning attack

Poisoning

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".

Abstract

Large language models (LLMs) have achieved remarkable success due to their exceptional generative capabilities. Despite their success, they also have inherent limitations such as a lack of up-to-date knowledge and hallucination. Retrieval-Augmented Generation (RAG) is a state-of-the-art technique to mitigate these limitations. The key idea of RAG is to ground the answer generation of an LLM on external knowledge retrieved from a knowledge database. Existing studies mainly focus on improving the accuracy or efficiency of RAG, leaving its security largely unexplored. We aim to bridge the gap in this work. We find that the knowledge database in a RAG system introduces a new and practical attack surface. Based on this attack surface, we propose PoisonedRAG, the first knowledge corruption attack to RAG, where an attacker could inject a few malicious texts into the knowledge database of a RAG system to induce an LLM to generate an attacker-chosen target answer for an attacker-chosen target question. We formulate knowledge corruption attacks as an optimization problem, whose solution is a set of malicious texts. Depending on the background knowledge (e.g., black-box and white-box settings) of an attacker on a RAG system, we propose two solutions to solve the optimization problem, respectively. Our results show PoisonedRAG could achieve a 90% attack success rate when injecting five malicious texts for each target question into a knowledge database with millions of texts. We also evaluate several defenses and our results show they are insufficient to defend against PoisonedRAG, highlighting the need for new defenses.
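
The retrieval side of the attack described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the bag-of-words `embed` function stands in for a real retriever encoder, the three-entry `clean_db` stands in for a knowledge database, and the injected text simply repeats the target question before stating an attacker-chosen answer so that it scores highly against that question. No LLM is called; the sketch only shows that an injected text can displace the correct one in the retrieved context.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (hypothetical stand-in for a retriever encoder)."""
    cleaned = text.lower().replace("?", " ").replace(".", " ")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, database, k):
    """Return the k texts most similar to the question."""
    q = embed(question)
    ranked = sorted(database, key=lambda text: cosine(q, embed(text)), reverse=True)
    return ranked[:k]


# Hypothetical clean knowledge database.
clean_db = [
    "Mount Everest is the highest mountain above sea level.",
    "The Nile is a major river in Africa.",
    "Paris is the capital of France.",
]

target_question = "What is the highest mountain above sea level?"

# Hypothetical injected text: echoes the question's wording (to rank highly)
# and then asserts the attacker-chosen target answer.
malicious_texts = [
    target_question + " The highest mountain above sea level is Mount Fuji."
]

clean_top = retrieve(target_question, clean_db, k=1)
poisoned_top = retrieve(target_question, clean_db + malicious_texts, k=1)

print("Top-1 before injection:", clean_top[0])
print("Top-1 after injection: ", poisoned_top[0])
```

In a full RAG pipeline the retrieved texts would be passed to the LLM as context, which is where the attacker-chosen answer could then surface. Real retrievers use dense embeddings rather than word counts, so the ranking behaviour here is only indicative.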

External datasets

Natural Questions (NQ)

HotpotQA

MS-MARCO

References

OpenAI Technical Report

Language models are few-shot learners

T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei

Published: 2020

Cryptology ePrint Archive

Optimizations of side-channel attack on AES MixColumns using chosen input

A. Vasselle, A. Wurcker

Published: 2019

ACM Computing Surveys

Survey of hallucination in natural language generation

Z. Ji, N. Lee, R. Frieske, T. Yu, D. Su, Y. Xu, E. Ishii, Y. J. Bang, A. Madotto, P. Fung

Published: 2023

medRxiv

Transforming healthcare education: Harnessing large language models for frontline health worker capacity building using retrieval-augmented generation

Y. Al Ghadban, H. Y. Lu, U. Adavi, A. Sharma, S. Gara, N. Das, B. Kumar, R. John, P. Devarsetty, J. E. Hirst

Published: 2023

Annals of Biomedical Engineering

Potential for GPT technology to optimize future clinical decision-making using retrieval-augmented generation

Calvin Wang, Joshua Ong, Chara Wang, Hannah Ong, Rebekah Cheng, Dennis Ong

Published: 2024

Proceedings of the Fourth ACM International Conference on AI in Finance

Making llms worth every penny: Resource-limited text classification in banking

Lefteris Loukas, Ilias Stogiannidis, Odysseas Diamantopoulos, Prodromos Malakasiotis, Stavros Vassos

Published: 2023

Generative AI+ Law Workshop

Chain of reference prompting helps llm to think like a lawyer

Aditya Kuppa, Nikon Rasumov-Rahe, Marc Voses

Published: 2023

Journal of Machine Learning for Modeling and Computing

Mycrunchgpt: A llm assisted framework for scientific machine learning

V. Kumar, L. Gleyzer, A. Kahana, K. Shukla, G. E. Karniadakis

Published: 2023