Large language models (LLMs) have witnessed a meteoric rise in popularity
among the general public over the past few months, facilitating diverse
downstream tasks with human-level accuracy and proficiency. Prompts play an
essential role in this success: they efficiently adapt pre-trained LLMs to
task-specific applications by simply prepending a sequence of tokens to the
query texts. However, designing and selecting an optimal prompt can be both
expensive and demanding, leading to the emergence of Prompt-as-a-Service
providers who profit by providing well-designed prompts for authorized use.
With the growing popularity of prompts and their indispensable role in
LLM-based services, there is an urgent need to protect the copyright of prompts
against unauthorized use.
In this paper, we propose PromptCARE, the first framework for prompt
copyright protection through watermark injection and verification. Prompt
watermarking presents unique challenges that render existing watermarking
techniques developed for model and dataset copyright verification ineffective.
PromptCARE overcomes these hurdles with watermark injection and verification
schemes tailored to the characteristics of prompts and natural language tasks. Extensive
experiments on six well-known benchmark datasets, using three prevalent
pre-trained LLMs (BERT, RoBERTa, and Facebook OPT-1.3b), demonstrate the
effectiveness, harmlessness, robustness, and stealthiness of PromptCARE.