LLM-Text Watermarking based on Lagrange Interpolation

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2505.05712

PDF

https://arxiv.org/pdf/2505.05712

Paper Information

Authors: Jarosław Janas, Paweł Morawiecki, Josef Pieprzyk
Published: 2025-05-09
Updated: 2025-05-13
Affiliation: Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
Country: Poland

Labels Estimated by AI

Prompt leaking, LLM Security, Digital Watermarking for Generative AI

These labels were automatically added by AI and may be inaccurate.

Abstract

The rapid advancement of LLMs (Large Language Models) has established them as a foundational technology for many AI- and ML-powered human-computer interactions. A critical challenge in this context is the attribution of LLM-generated text -- either to the specific language model that produced it or to the individual user who embedded their identity via a so-called multi-bit watermark. This capability is essential for combating misinformation, fake news, misinterpretation, and plagiarism. One of the key techniques for addressing this challenge is digital watermarking. This work presents a watermarking scheme for LLM-generated text based on Lagrange interpolation, enabling the recovery of a multi-bit author identity even when the text has been heavily redacted by an adversary. The core idea is to embed a continuous sequence of points $(x, f(x))$ that lie on a single straight line. The $x$-coordinates are computed pseudorandomly using a cryptographic hash function $H$ applied to the concatenation of the previous token's identity and a secret key $s_k$. Crucially, the $x$-coordinates do not need to be embedded into the text -- only the corresponding $f(x)$ values are embedded. During extraction, the algorithm recovers the original points along with many spurious ones, forming an instance of the Maximum Collinear Points (MCP) problem, which can be solved efficiently. Experimental results demonstrate that the proposed method is highly effective, allowing the recovery of the author identity even when as few as three genuine points remain after adversarial manipulation.
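
The extraction idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: SHA-256 stands in for $H$, arithmetic is done modulo the prime 2**61 - 1, the line is written as f(x) = a*x + b with the pair (a, b) assumed to carry the identity, and the MCP instance is solved by brute-force pairwise voting rather than by whatever efficient solver the paper uses. How the f(x) values are embedded into token choices is not covered by the abstract and is not modelled here. All function names are hypothetical.

```python
import hashlib
from collections import Counter
from itertools import combinations

# Assumption: points live in a prime field; this modulus is illustrative only.
P = 2**61 - 1


def x_coord(prev_token_id: int, secret_key: bytes) -> int:
    """x = H(previous token || s_k), with SHA-256 standing in for H."""
    data = prev_token_id.to_bytes(4, "big") + secret_key
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P


def line_value(x: int, a: int, b: int) -> int:
    """f(x) = a*x + b mod P; (a, b) is assumed to encode the identity."""
    return (a * x + b) % P


def recover_line(points):
    """Brute-force MCP: every pair of points votes for the line through it."""
    votes = Counter()
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # no unique line y = a*x + b through these two points
        # Modular inverse via pow(..., -1, P) requires Python 3.8+.
        a = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
        b = (y1 - a * x1) % P
        votes[(a, b)] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def demo():
    """Three genuine points mixed with four spurious ones (all values made up)."""
    key = b"demo-secret-key"
    a, b = 123456789, 987654321  # hypothetical identity coefficients
    xs = [x_coord(t, key) for t in (101, 202, 303)]
    genuine = [(x, line_value(x, a, b)) for x in xs]
    spurious = [(x_coord(t, key), y)
                for t, y in ((404, 5), (505, 17), (606, 42), (707, 99))]
    return recover_line(spurious + genuine), (a, b)
```

Calling `demo()` returns the recovered pair alongside the embedded one. In this sketch, k collinear points contribute k(k-1)/2 votes to the same line, so three genuine points yield three votes, while a line through unrelated spurious points typically receives only one. The pairwise vote is quadratic in the number of candidate points and is meant only to make the collinearity argument concrete.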