From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models

arxiv

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2505.09924

PDF

https://arxiv.org/pdf/2505.09924

Paper Information

Authors: Yidan Wang, Yubing Ren, Yanan Cao, Binxing Fang
Published: 2025-05-15
Affiliation: Institute of Information Engineering, Chinese Academy of Sciences
Country: China
Conference: Annual Meeting of the Association for Computational Linguistics (ACL)

Labels Estimated by AI

Watermark Removal Technology; Digital Watermarking for Generative AI; Model DoS

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

The rise of Large Language Models (LLMs) has heightened concerns about the misuse of AI-generated text, making watermarking a promising solution. Mainstream watermarking schemes for LLMs fall into two categories: logits-based and sampling-based. However, current schemes entail trade-offs among robustness, text quality, and security. To mitigate this, we integrate logits-based and sampling-based schemes, harnessing their respective strengths to achieve synergy. In this paper, we propose a versatile symbiotic watermarking framework with three strategies: serial, parallel, and hybrid. The hybrid framework adaptively embeds watermarks using token entropy and semantic entropy, optimizing the balance between detectability, robustness, text quality, and security. Furthermore, we validate our approach through comprehensive experiments on various datasets and models. Experimental results indicate that our method outperforms existing baselines and achieves state-of-the-art (SOTA) performance. We believe this framework provides novel insights into diverse watermarking paradigms. Our code is available at https://github.com/redwyd/SymMark.
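
Illustrative sketch (not taken from the paper or its repository): the abstract says the hybrid strategy decides adaptively, from token entropy and semantic entropy, how to embed the watermark. The Python below only shows what such a gate could look like. The token entropy is the standard Shannon entropy of a next-token distribution; the function choose_watermarks, both thresholds, and the direction of each comparison are assumptions made for illustration, and the semantic entropy is simply passed in as a number because the abstract does not say how it is computed.

```python
import math
from typing import Sequence, Tuple


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a next-token probability distribution."""
    # Zero-probability tokens contribute nothing, so they are skipped.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def choose_watermarks(
    tok_ent: float,
    sem_ent: float,
    tok_threshold: float = 1.0,  # hypothetical value, not from the paper
    sem_threshold: float = 0.5,  # hypothetical value, not from the paper
) -> Tuple[bool, bool]:
    """Return (use_logits_watermark, use_sampling_watermark).

    Assumed rule, for illustration only: bias the logits when the token
    distribution is uncertain enough to absorb the bias, and watermark the
    sampling step when the candidate continuations are semantically close.
    """
    use_logits = tok_ent > tok_threshold
    use_sampling = sem_ent < sem_threshold
    return use_logits, use_sampling


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]  # maximally uncertain over 4 tokens
    peaked = [1.0, 0.0, 0.0, 0.0]       # model is certain of the next token
    print(choose_watermarks(token_entropy(uniform), sem_ent=0.2))
    print(choose_watermarks(token_entropy(peaked), sem_ent=0.9))
```

Under these assumed thresholds, a step may receive both watermarks, only one, or neither. The actual decision rule and entropy definitions are those in the paper and the linked repository.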

External Datasets

C4 dataset

OpenGen dataset

Copen dataset

ELI5 dataset

LCC dataset

MultiNews dataset