These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
The substantial investment required to develop Large Language Models (LLMs)
makes them valuable intellectual property, raising significant concerns about
copyright protection. LLM fingerprinting has emerged as a key technique to
address this, which aims to verify a model's origin by extracting an intrinsic,
unique signature (a "fingerprint") and comparing it to that of a source model
to identify illicit copies. However, existing black-box fingerprinting methods
often fail to generate distinctive LLM fingerprints. This ineffectiveness
arises because black-box methods typically rely on model outputs, which lose
critical information about the model's unique parameters due to the usage of
non-linear functions. To address this, we first leverage Fisher Information
Theory to formally demonstrate that the gradient of the model's input is a more
informative feature for fingerprinting than the output. Based on this insight,
we propose ZeroPrint, a novel method that approximates these information-rich
gradients in a black-box setting using zeroth-order estimation. ZeroPrint
overcomes the challenge of applying this to discrete text by simulating input
perturbations via semantic-preserving word substitutions. This operation allows
ZeroPrint to estimate the model's Jacobian matrix as a unique fingerprint.
Experiments on the standard benchmark show ZeroPrint achieves a
state-of-the-art effectiveness and robustness, significantly outperforming
existing black-box methods.