Large language model (LLM) watermarking has shown promise in detecting
AI-generated content and mitigating misuse, with prior work claiming robustness
against paraphrasing and text editing. In this paper, we argue that existing
evaluations are not sufficiently adversarial, obscuring critical
vulnerabilities and overstating security. To address this, we introduce the
adaptive robustness radius, a formal metric that quantifies watermark
resilience against adaptive adversaries. We theoretically prove that optimizing
the attack context and model parameters can substantially reduce this radius,
making watermarks highly susceptible to paraphrase attacks. Leveraging this
insight, we propose RLCracker, a reinforcement learning (RL)-based adaptive
attack that erases watermarks while preserving semantic fidelity. RLCracker
requires only limited watermarked examples and zero access to the detector.
Despite this weak supervision, it enables a 3B model to achieve 98.5% removal
success and an average P-SP score of 0.92 on 1,500-token Unigram-marked texts
after training on only 100 short samples. This far exceeds the 6.75% achieved
by GPT-4o, and the attack generalizes across five model sizes and ten
watermarking schemes. Our results confirm that adaptive attacks are broadly effective and
pose a fundamental threat to current watermarking defenses.