The critical challenge of prompt injection attacks in Large Language Models
(LLMs) integrated applications, a growing concern in the Artificial
Intelligence (AI) field. Such attacks, which manipulate LLMs through natural
language inputs, pose a significant threat to the security of these
applications. Traditional defense strategies, including output and input
filtering, as well as delimiter use, have proven inadequate. This paper
introduces the 'Signed-Prompt' method as a novel solution. The study involves
signing sensitive instructions within command segments by authorized users,
enabling the LLM to discern trusted instruction sources. The paper presents a
comprehensive analysis of prompt injection attack patterns, followed by a
detailed explanation of the Signed-Prompt concept, including its basic
architecture and implementation through both prompt engineering and fine-tuning
of LLMs. Experiments demonstrate the effectiveness of the Signed-Prompt method,
showing substantial resistance to various types of prompt injection attacks,
thus validating its potential as a robust defense strategy in AI security.