Abstract
In a world of increasingly closed-source commercial machine learning models,
model evaluations from developers must be taken at face value. These benchmark
results, whether over task accuracy, bias evaluations, or safety checks, have
traditionally been impossible for a model end-user to verify without the costly
or infeasible process of re-performing the benchmark on black-box model outputs.
This work presents a method of verifiable model evaluation using model
inference through zkSNARKs. The resulting zero-knowledge computational proofs
of model outputs over datasets can be packaged into verifiable evaluation
attestations showing that models with fixed private weights achieve stated
performance or fairness metrics over public inputs. We present a flexible
proving system that enables verifiable attestations to be produced for any
standard neural network model with varying compute requirements. For the first
time, we demonstrate this across a sample of real-world models and highlight
key challenges and design solutions. This presents a new transparency paradigm
in the verifiable evaluation of private models.