DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection

Authors: Zhenhua Xu, Yiran Zhao, Mengting Zhong, Dezhang Kong, Changting Lin, Tong Qiao, Meng Han | Published: 2026-01-13

Source: https://arxiv.org/abs/2601.08223

PDF: https://arxiv.org/pdf/2601.08223

AI-estimated labels

Robustness of watermarking techniques | Fingerprinting methods | Privacy protection

Note: These labels were added automatically by AI and may therefore be inaccurate.

Abstract

The rapid growth of large language models raises pressing concerns about intellectual property protection under black-box deployment. Existing backdoor-based fingerprints either rely on rare tokens – leading to high-perplexity inputs susceptible to filtering – or use fixed trigger-response mappings that are brittle to leakage and post-hoc adaptation. We propose Dual-Layer Nested Fingerprinting (DNF), a black-box method that embeds a hierarchical backdoor by coupling domain-specific stylistic cues with implicit semantic triggers. Across Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, DNF achieves perfect fingerprint activation while preserving downstream utility. Compared with existing methods, it uses lower-perplexity triggers, remains undetectable under fingerprint detection attacks, and is relatively robust to incremental fine-tuning and model merging. These results position DNF as a practical, stealthy, and resilient solution for LLM ownership verification and intellectual property protection.
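
As a rough illustration of what black-box ownership verification involves, the sketch below measures how often a suspect model returns an expected marker when queried with trigger prompts. It is not the authors' implementation: `activation_rate`, `toy_model`, the marker string `FP-7f3a`, and the example prompts are all hypothetical. The toy model responds only when both a style cue and a topic cue are present, which is one illustrative reading of "dual-layer" and not the paper's construction.

```python
from typing import Callable, Iterable


def activation_rate(query_model: Callable[[str], str],
                    trigger_prompts: Iterable[str],
                    expected_marker: str) -> float:
    """Return the fraction of prompts whose response contains the marker."""
    prompts = list(trigger_prompts)
    if not prompts:
        raise ValueError("need at least one trigger prompt")
    hits = sum(expected_marker in query_model(p) for p in prompts)
    return hits / len(prompts)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a suspect model's text-only API.

    Assumption for illustration: the marker appears only when BOTH a
    style cue and a topic cue occur in the prompt.
    """
    has_style = "archaic legal style" in prompt
    has_topic = "inheritance" in prompt
    if has_style and has_topic:
        return "FP-7f3a: ownership marker"
    return "ordinary answer"


full_triggers = [
    "Write in archaic legal style about inheritance disputes.",
    "Using archaic legal style, summarize inheritance law.",
]
partial_triggers = [
    "Write in archaic legal style about gardening.",  # style cue only
    "Explain inheritance law plainly.",               # topic cue only
]

full_rate = activation_rate(toy_model, full_triggers, "FP-7f3a")
partial_rate = activation_rate(toy_model, partial_triggers, "FP-7f3a")
print(full_rate, partial_rate)
```

In this toy setup the verifier needs only text-in, text-out access, which is the constraint the black-box setting imposes; a real check would query the deployed model's API instead of `toy_model`.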