Abstract
The recent surge in artificial intelligence (AI), driven by the prominence of
large language models (LLMs), has brought transformative changes worldwide.
Alongside these advancements, however, concerns about the legitimacy of
LLM-generated outputs have grown, posing legal challenges to their widespread
deployment. These concerns are compounded by the fact that LLM parameters are
often treated as intellectual property, restricting direct examination of the
models.
In this study, we address a fundamental challenge in AI legislation: the need
to establish the authenticity of outputs generated by LLMs. To this end, we
present zkLLM, to the best of our knowledge the first zero-knowledge proof
system specialized for LLMs. To overcome the persistent difficulty that
non-arithmetic operations pose for zero-knowledge proofs over deep learning,
we introduce tlookup, a parallelized lookup argument for non-arithmetic tensor
operations that incurs no asymptotic overhead. Building on tlookup, we further
introduce zkAttn, a specialized zero-knowledge proof for the attention
mechanism that carefully balances running time, memory usage, and accuracy.
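
To make the role of a lookup argument concrete, the following identity
sketches one standard construction, the logarithmic-derivative lookup; the
abstract does not specify tlookup's design, so this is an illustration of the
general technique only, and all notation ($S$, $T$, $m_j$) is introduced here.
A non-arithmetic operation (e.g., the exponentiation step of softmax) is
precomputed as a table $T = (t_1, \dots, t_M)$, and the prover shows that
every entry of a committed tensor $S = (s_1, \dots, s_N)$ appears in $T$ by
exhibiting multiplicities $m_1, \dots, m_M$ such that, over a sufficiently
large field,
\[
  \sum_{i=1}^{N} \frac{1}{X + s_i} \;=\; \sum_{j=1}^{M} \frac{m_j}{X + t_j}
\]
holds as an identity of rational functions, which the verifier can check at a
single random point $X = \beta$ at cost linear in $N + M$, consistent with a
lookup step that adds no asymptotic overhead.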
Backed by our fully parallelized CUDA implementation, zkLLM represents a
significant step toward efficient zero-knowledge verifiable computation over
LLMs. Notably, for an LLM with 13 billion parameters, our approach generates a
correctness proof for the entire inference process in under 15 minutes. The
resulting proof, at less than 200 kB, reveals no information about the model
parameters.