Abstract
The increasing use of Large Language Models (LLMs) for generating highly
coherent and contextually relevant text introduces new risks, including misuse
for unethical purposes such as disinformation or academic dishonesty. To
address these challenges, we propose FreqMark, a novel watermarking technique
that embeds detectable frequency-based watermarks in LLM-generated text during
the token sampling process. The method leverages periodic signals to guide
token selection, creating a watermark that can be detected with Short-Time
Fourier Transform (STFT) analysis. This approach enables accurate
identification of LLM-generated content, even in mixed-text scenarios with both
human-authored and LLM-generated segments. Our experiments demonstrate the
robustness and precision of FreqMark, showing strong detection capabilities
against various attack scenarios such as paraphrasing and token substitution.
Results show that FreqMark achieves an AUC improvement of up to 0.98,
significantly outperforming existing detection methods.
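The detection side of the approach sketched above can be illustrated with a toy example. The snippet below is a minimal sketch under assumed simplifications, not the paper's implementation: it simulates a per-token score sequence biased by a periodic carrier (standing in for the watermarked sampling process), then recovers the carrier frequency with a short-time Fourier transform built from windowed FFTs. All function names, the score construction, and the parameter values are illustrative assumptions.

```python
import numpy as np

def embed_scores(n_tokens, period=8, strength=0.9, seed=0):
    """Simulate per-token watermark scores: +1 when a sampled token
    followed the periodic bias, -1 otherwise (a simplified stand-in
    for watermarked token sampling)."""
    rng = np.random.default_rng(seed)
    # Square-wave carrier with the given period.
    carrier = np.where(np.sin(2 * np.pi * np.arange(n_tokens) / period) >= 0, 1.0, -1.0)
    follow = rng.random(n_tokens) < strength  # tokens that obeyed the bias
    return np.where(follow, carrier, -carrier)

def stft_peak_freq(scores, win=64, hop=16):
    """Average STFT magnitude across sliding windows and return the
    dominant (non-DC) normalized frequency."""
    frames = [scores[i:i + win] for i in range(0, len(scores) - win + 1, hop)]
    mags = np.abs(np.fft.rfft(np.array(frames) * np.hanning(win), axis=1))
    mean_mag = mags.mean(axis=0)
    mean_mag[0] = 0.0  # ignore the DC component
    return np.fft.rfftfreq(win)[np.argmax(mean_mag)]

period = 8
scores = embed_scores(2000, period=period)
peak = stft_peak_freq(scores)
print(abs(peak - 1 / period) < 0.02)  # dominant frequency matches the carrier
```

Averaging STFT magnitudes over many windows is what makes the periodic signal stand out even when individual tokens deviate from the bias, which is also why detection can localize watermarked segments inside mixed human/LLM text.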