Knowledge Return Oriented Prompting (KROP)

Authors: Jason Martin, Kenneth Yeung
Published: 2024-06-11

Source: https://arxiv.org/abs/2406.11880

Labels Predicted by AI

Prompt Injection LLM Security Attack Method

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.

Abstract

Many Large Language Models (LLMs) and LLM-powered apps deployed today use some form of prompt filter or alignment to protect their integrity. However, these measures aren’t foolproof. This paper introduces KROP, a prompt injection technique capable of obfuscating prompt injection attacks, rendering them virtually undetectable to most of these security measures.