Knowledge Return Oriented Prompting (KROP)

TOP Literature Database Knowledge Return Oriented Prompting (KROP)

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2406.11880

PDF

https://arxiv.org/pdf/2406.11880

Paper Information

Author: Jason Martin;Kenneth Yeung
Published: 6-12-2024
Affiliation: HiddenLayer
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Prompt Injection LLM Security Attack Method

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Many Large Language Models (LLMs) and LLM-powered apps deployed today use some form of prompt filter or alignment to protect their integrity. However, these measures aren't foolproof. This paper introduces KROP, a prompt injection technique capable of obfuscating prompt injection attacks, rendering them virtually undetectable to most of these security measures.