Semantic-level watermarking (SWM) for large language models (LLMs) enhances
watermarking robustness against text modifications and paraphrasing attacks by
treating the sentence as the fundamental unit. However, existing methods still
lack strong theoretical guarantees of robustness, and rejection-sampling-based
generation often introduces significant distribution distortion compared with
unwatermarked outputs. In this work, we introduce a new theoretical framework
on SWM through the concept of proxy functions (PFs) – functions
that map sentences to scalar values. Building on this framework, we propose
PMark, a simple yet powerful SWM method that estimates the PF median for the
next sentence dynamically through sampling while enforcing multiple PF
constraints (which we call channels) to strengthen watermark evidence. Equipped
with solid theoretical guarantees, PMark achieves the desired distortion-free
property and improves robustness against paraphrasing-style attacks. We
also provide an empirically optimized variant that removes the need for
dynamic median estimation, improving sampling efficiency.
Experimental results show that PMark consistently outperforms existing SWM
baselines in both text quality and robustness, offering a more effective
paradigm for detecting machine-generated text. Our code will be released at
[this URL](https://github.com/PMark-repo/PMark).
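
The channel mechanism described above can be sketched as follows. This is a minimal illustration under assumptions, not the paper's implementation: the keyed-hash proxy function `proxy_value`, the sampled candidate pool, and the median-side selection rule are all hypothetical stand-ins for the actual PFs and sampling procedure.

```python
import hashlib
import random
import statistics

def proxy_value(sentence: str, channel_key: int) -> float:
    # Hypothetical PF: keyed hash mapping a sentence to a scalar in [0, 1).
    h = hashlib.sha256(f"{channel_key}:{sentence}".encode()).hexdigest()
    return int(h[:8], 16) / 0xFFFFFFFF

def estimate_median(candidates, channel_key: int) -> float:
    # Estimate the PF median for the next sentence from sampled candidates.
    return statistics.median(proxy_value(s, channel_key) for s in candidates)

def pick_watermarked(candidates, channel_keys, sides):
    # Enforce one constraint per channel: keep a candidate whose PF value
    # lies on the key-chosen side of the estimated median in every channel.
    # Fall back to an unconstrained sample if no candidate satisfies all.
    medians = {k: estimate_median(candidates, k) for k in channel_keys}
    for s in candidates:
        if all((proxy_value(s, k) >= medians[k]) == side
               for k, side in zip(channel_keys, sides)):
            return s
    return random.choice(candidates)
```

At detection time, one would recompute the PF values per channel and test how often they fall on the expected side of the median; because each channel splits sentences roughly in half, multiple channels compound the watermark evidence.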