Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2402.18059

PDF

https://arxiv.org/pdf/2402.18059

Paper Information

Authors: Mingjia Huo; Sai Ashish Somayajula; Youwei Liang; Ruisi Zhang; Farinaz Koushanfar; Pengtao Xie
Published: 2024-02-28
Updated: 2024-06-06
Affiliation: Department of Electrical and Computer Engineering, University of California, San Diego
Country: United States of America
Conference: International Conference on Machine Learning (ICML)

Labels Estimated by AI

Watermarking

Prompt Injection

Multi-Objective Optimization

These labels were automatically added by AI and may be inaccurate.

Abstract

Large language models (LLMs) generate high-quality responses that may contain misinformation, underscoring the need for regulation by distinguishing AI-generated text from human-written text. Watermarking is pivotal in this context: it embeds hidden markers, imperceptible to humans, in text during the LLM inference phase. Achieving both detectability of the inserted watermarks and semantic quality of the generated text is challenging. While current watermarking algorithms have made promising progress in this direction, there remains significant scope for improvement. To address these challenges, we introduce a novel multi-objective optimization (MOO) approach for watermarking that uses lightweight networks to generate token-specific watermarking logits and splitting ratios. By leveraging MOO to optimize both the detection and semantic objective functions, our method simultaneously achieves detectability and semantic integrity. Experimental results show that our method outperforms current watermarking techniques in enhancing the detectability of texts generated by LLMs while maintaining their semantic coherence. Our code is available at https://github.com/mignonjia/TS_watermark.
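
The abstract gives no formulas, so the following is only a minimal sketch of what "token-specific watermarking logits and splitting ratios" could look like in practice. It assumes a green-list style scheme in which the previous token seeds a vocabulary split: a fraction gamma (the splitting ratio) of tokens is marked "green", and a bias delta (the watermark logit) is added to their logits. The function token_params is a hypothetical hash-based stand-in for the lightweight networks, not a trained model; the toy vocabulary size, the seeding rule, and the z-score form are likewise assumptions for illustration and are not taken from the paper or its repository.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50  # toy vocabulary size (assumption for illustration)


def token_params(prev_token: int) -> tuple[float, float]:
    """Hypothetical stand-in for the lightweight networks.

    Maps the previous token to a splitting ratio gamma in [0.25, 0.75]
    and a watermark logit delta in [1.0, 4.0] via a fixed hash.
    """
    digest = hashlib.sha256(f"params-{prev_token}".encode()).digest()
    gamma = 0.25 + 0.5 * digest[0] / 255
    delta = 1.0 + 3.0 * digest[1] / 255
    return gamma, delta


def green_list(prev_token: int, gamma: float) -> set[int]:
    """Mark a gamma fraction of the vocabulary green, seeded by the previous token."""
    rng = random.Random(prev_token)
    perm = list(range(VOCAB_SIZE))
    rng.shuffle(perm)
    return set(perm[: int(gamma * VOCAB_SIZE)])


def watermark_logits(logits: list[float], prev_token: int) -> list[float]:
    """Add the token-specific delta to the logits of green tokens only."""
    gamma, delta = token_params(prev_token)
    green = green_list(prev_token, gamma)
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def generate(n_tokens: int, seed: int, watermark: bool) -> list[int]:
    """Toy 'model': uniform random logits in [0, 1), decoded greedily."""
    rng = random.Random(seed)
    tokens = [0]
    for _ in range(n_tokens):
        logits = [rng.random() for _ in range(VOCAB_SIZE)]
        if watermark:
            logits = watermark_logits(logits, tokens[-1])
        tokens.append(max(range(VOCAB_SIZE), key=logits.__getitem__))
    return tokens


def z_score(tokens: list[int]) -> float:
    """Green-token count minus its expectation, divided by its standard deviation.

    Because the splitting ratio varies per token, the mean and variance are
    accumulated position by position instead of using one global gamma.
    """
    hits, mean, var = 0, 0.0, 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        gamma, _ = token_params(prev)
        green = green_list(prev, gamma)
        ratio = len(green) / VOCAB_SIZE  # effective ratio after rounding
        hits += cur in green
        mean += ratio
        var += ratio * (1.0 - ratio)
    return (hits - mean) / math.sqrt(var)


if __name__ == "__main__":
    print("watermarked z:", z_score(generate(200, seed=1, watermark=True)))
    print("plain z:      ", z_score(generate(200, seed=2, watermark=False)))
```

In this toy setup the detector needs only the token sequence and the same token_params and green_list functions, not the model's logits. A detection threshold on the z-score would trade false positives against missed detections; the sketch does not fix one.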

External Datasets

Essays

HC3