Abstract
Despite providing superior performance, open-source large language models
(LLMs) are vulnerable to abusive usage. To address this issue, recent works
propose LLM fingerprinting methods to identify the specific source LLMs behind
suspect applications. However, these methods fail to provide stealthy and
robust fingerprint verification. In this paper, we propose a novel LLM
fingerprinting scheme, namely CoTSRF, which utilizes the Chain of Thought (CoT)
as the fingerprint of an LLM. CoTSRF first collects the responses from the
source LLM by querying it with crafted CoT queries. Then, it applies
contrastive learning to train a CoT extractor that extracts the CoT feature
(i.e., fingerprint) from the responses. Finally, CoTSRF conducts fingerprint
verification by comparing the Kullback-Leibler divergence between the CoT
features of the source and suspect LLMs against an empirical threshold.
Experiments demonstrate the advantage of CoTSRF for fingerprinting LLMs,
particularly in terms of stealthy and robust fingerprint verification.
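
The verification step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the CoT extractor has already produced one fixed-length feature vector per model and that each vector is turned into a probability distribution with a softmax, which the abstract does not specify. The function names and the threshold value are hypothetical.

```python
import math


def softmax(xs):
    """Normalize a raw feature vector into a probability distribution."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D_KL(p || q) for discrete distributions."""
    # eps guards against log(0) and division by zero
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def verify_fingerprint(source_feature, suspect_feature, threshold):
    """Return True when the suspect's CoT feature is close enough to the source's.

    Assumption: 'close enough' means the KL divergence between the
    softmax-normalized features does not exceed the threshold.
    """
    p = softmax(source_feature)
    q = softmax(suspect_feature)
    return kl_divergence(p, q) <= threshold


if __name__ == "__main__":
    source = [1.0, 2.0, 3.0]        # hypothetical CoT feature of the source LLM
    same_model = [1.0, 2.0, 3.0]    # suspect built on the same LLM
    other_model = [3.0, 0.5, -1.0]  # suspect built on a different LLM
    threshold = 0.1                 # hypothetical; the paper sets it empirically

    print(verify_fingerprint(source, same_model, threshold))
    print(verify_fingerprint(source, other_model, threshold))
```

In practice the threshold would be chosen from held-out comparisons between known matching and non-matching models, which is what "empirical" suggests here.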