Marketplaces for machine learning (ML) models are emerging as a way for
organizations to monetize models. They allow model owners to retain control
over hosted models: owners use cloud resources to execute ML inference requests
for a fee while preserving model confidentiality. Clients that rely on hosted models
require trustworthy inference results, even when models are managed by third
parties. While the resilience and robustness of inference results can be
improved by combining multiple independent models, such support is unavailable
in today's marketplaces.
We describe Dropbear, the first ML model marketplace that provides clients
with strong integrity guarantees by combining results from multiple models in a
trustworthy fashion. Dropbear replicates inference computation across a model
group, which consists of multiple cloud-based GPU nodes belonging to different
model owners. Clients receive inference certificates that prove agreement using
a Byzantine consensus protocol, even under model heterogeneity and concurrent
model updates. To improve performance, Dropbear batches inference and consensus
operations separately: it first performs the inference computation across a
model group and only then orders requests and model updates. Despite its strong
integrity guarantees, Dropbear's performance matches that of state-of-the-art
ML inference systems: deployed across 3 cloud sites, it handles 800 requests/s
with ImageNet models.
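
To give a rough intuition for the agreement step (a minimal sketch, not Dropbear's actual protocol or API): a client replicates an inference request across the nodes of a model group and accepts an answer once enough nodes agree on it. In a standard Byzantine fault tolerance setting with 3f+1 nodes, of which up to f may be faulty, a quorum of 2f+1 matching results suffices. The function name and result encoding below are hypothetical.

```python
from collections import Counter

def inference_certificate(results, f):
    """Hypothetical sketch of quorum-based result agreement.

    `results` holds one inference result per node of a model group of
    3f+1 nodes, up to f of which may be Byzantine. Accept an answer once
    at least 2f+1 nodes agree on it; otherwise report no agreement.
    """
    counts = Counter(results)
    answer, votes = counts.most_common(1)[0]
    if votes >= 2 * f + 1:
        return answer  # enough matching results to certify
    return None  # no quorum: cannot certify any answer

# With f = 1 (4 nodes, quorum of 3), three matching labels certify "cat",
# while a 2-2 split yields no certificate.
print(inference_certificate(["cat", "cat", "cat", "dog"], f=1))
print(inference_certificate(["cat", "dog", "cat", "dog"], f=1))
```

The real system additionally handles model heterogeneity and concurrent model updates, which a simple majority count like this does not capture.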