Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks Against LLM-Integrated Applications

TOP Literature Database Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks Against LLM-Integrated Applications

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2401.07612

PDF

https://arxiv.org/pdf/2401.07612

Paper Information

Author: Xuchen Suo
Published: 1-15-2024
Affiliation: Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University
Country: Hong Kong, China
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Prompt Injection LLM Security

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

The critical challenge of prompt injection attacks in Large Language Models (LLMs) integrated applications, a growing concern in the Artificial Intelligence (AI) field. Such attacks, which manipulate LLMs through natural language inputs, pose a significant threat to the security of these applications. Traditional defense strategies, including output and input filtering, as well as delimiter use, have proven inadequate. This paper introduces the 'Signed-Prompt' method as a novel solution. The study involves signing sensitive instructions within command segments by authorized users, enabling the LLM to discern trusted instruction sources. The paper presents a comprehensive analysis of prompt injection attack patterns, followed by a detailed explanation of the Signed-Prompt concept, including its basic architecture and implementation through both prompt engineering and fine-tuning of LLMs. Experiments demonstrate the effectiveness of the Signed-Prompt method, showing substantial resistance to various types of prompt injection attacks, thus validating its potential as a robust defense strategy in AI security.

External Datasets

Delete Command Dataset