Distributed machine learning (ML) systems today use an unsophisticated threat
model: data sources must trust a central ML process. We propose a brokered
learning abstraction that allows data sources to contribute towards a
globally-shared model with provable privacy guarantees in an untrusted setting.
We realize this abstraction by building on federated learning, the state of the
art in multi-party ML, to construct TorMentor: an anonymous hidden service that
supports private multi-party ML.
We define a new threat model by characterizing, developing and evaluating new
attacks in the brokered learning setting, along with new defenses against these
attacks. We show that TorMentor effectively protects data providers against
known ML attacks while providing them with a tunable trade-off between model
accuracy and privacy. We evaluate TorMentor with local and geo-distributed
deployments on Azure/Tor. In an experiment with 200 clients and 14 MB of data
per client, our prototype trained a logistic regression model using stochastic
gradient descent in 65 seconds.
Code is available at: https://github.com/DistributedML/TorML