The substantial investment required to develop Large Language Models (LLMs)
makes them valuable intellectual property, raising significant concerns about
copyright protection. LLM fingerprinting has emerged as a key technique to
address this concern: it aims to verify a model's origin by extracting an
intrinsic, unique signature (a "fingerprint") and comparing it to that of a
source model to identify illicit copies. However, existing black-box
fingerprinting methods
often fail to generate distinctive LLM fingerprints. They fall short because
they typically rely on model outputs, which lose critical information about
the model's unique parameters owing to the use of non-linear functions. To
address this limitation, we first leverage Fisher Information
Theory to formally demonstrate that the gradient of the model's output with
respect to its input is a more informative feature for fingerprinting than the
output itself. Based on this insight,
we propose ZeroPrint, a novel method that approximates these information-rich
gradients in a black-box setting using zeroth-order estimation. ZeroPrint
overcomes the challenge of applying such estimation to discrete text by
simulating input perturbations via semantics-preserving word substitutions.
This operation allows
ZeroPrint to estimate the model's Jacobian matrix as a unique fingerprint.
Experiments on the standard benchmark show that ZeroPrint achieves
state-of-the-art effectiveness and robustness, significantly outperforming
existing black-box methods.
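
To make the idea of zeroth-order estimation concrete, the following is a minimal sketch of estimating a Jacobian from output queries alone and comparing two such estimates. It is not the ZeroPrint procedure: it assumes a continuous toy input perturbed by central finite differences, whereas ZeroPrint perturbs discrete text through word substitutions. The functions `model_a`, `model_b`, `estimate_jacobian`, and `cosine_similarity`, the query point, and the use of cosine similarity as the comparison metric are all hypothetical choices made for illustration.

```python
import math


def model_a(x):
    """Hypothetical black box: only its outputs can be observed."""
    return [math.tanh(x[0] + 2.0 * x[1]), math.sin(x[0]) * x[1]]


def model_b(x):
    """A second, unrelated hypothetical black box."""
    return [math.tanh(2.0 * x[0] - x[1]), math.cos(x[0]) * x[1]]


def estimate_jacobian(f, x, eps=1e-4):
    """Estimate J[i][j] = d f_i / d x_j using output queries only.

    Uses central finite differences, a simple zeroth-order estimator.
    """
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return jac


def cosine_similarity(jac_a, jac_b):
    """Compare two Jacobian estimates as flattened vectors (assumed metric)."""
    a = [v for row in jac_a for v in row]
    b = [v for row in jac_b for v in row]
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


# Fingerprint each toy model at the same query point and compare.
query = [0.3, -0.5]
fp_a = estimate_jacobian(model_a, query)
fp_a_again = estimate_jacobian(model_a, query)
fp_b = estimate_jacobian(model_b, query)
same = cosine_similarity(fp_a, fp_a_again)
different = cosine_similarity(fp_a, fp_b)
print(f"same model: {same:.3f}, different model: {different:.3f}")
```

In this toy setting, the estimated Jacobian of a model matches itself across repeated queries, while the Jacobian of an unrelated model points in a clearly different direction. A practical system would aggregate estimates over many queries rather than a single point.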