LLMs can hide text in other text of the same length

Authors: Antonio Norelli, Michael Bronstein | Published: 2025-10-22 | Updated: 2025-10-27

2025.10.22

Authors: Antonio Norelli, Michael Bronstein
Published: 2025-10-22 | Updated: 2025-10-27

Source: https://arxiv.org/abs/2510.20075

PDF: https://arxiv.org/pdf/2510.20075

AIにより推定されたラベル

プライバシー保護プロンプトの検証教育目的の情報提供

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

A meaningful text can be hidden inside another, completely different yet still coherent and plausible, text of the same length. For example, a tweet containing a harsh political critique could be embedded in a tweet that celebrates the same political leader, or an ordinary product review could conceal a secret manuscript. This uncanny state of affairs is now possible thanks to Large Language Models, and in this paper we present a simple and efficient protocol to achieve it. We show that even modest 8-billion-parameter open-source LLMs are sufficient to obtain high-quality results, and a message as long as this abstract can be encoded and decoded locally on a laptop in seconds. The existence of such a protocol demonstrates a radical decoupling of text from authorial intent, further eroding trust in written communication, already shaken by the rise of LLM chatbots. We illustrate this with a concrete scenario: a company could covertly deploy an unfiltered LLM by encoding its answers within the compliant responses of a safe model. This possibility raises urgent questions for AI safety and challenges our understanding of what it means for a Large Language Model to know something.