We present Zeno, a technique to make distributed machine learning,
particularly Stochastic Gradient Descent (SGD), tolerant to an arbitrary number
of faulty workers. Zeno generalizes previous results that assumed a majority of
non-faulty nodes; we assume only that at least one worker is non-faulty. Our key idea is to
suspect workers that are potentially defective. Since this is likely to lead to
false positives, we use a ranking-based preference mechanism. We prove the
convergence of SGD for non-convex problems under these scenarios. Experimental
results show that Zeno outperforms existing approaches.
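
As a rough illustration of ranking-based suspicion, the following sketch assumes the server scores each candidate update by the estimated decrease in loss it produces, minus a penalty on its magnitude, and then averages only the top-ranked candidates. The scoring rule, the function names, and the parameters `lr`, `rho`, and `num_trimmed` are assumptions made for this example and are not taken from the text above.

```python
import numpy as np

def descent_score(loss, x, update, lr, rho):
    # Hypothetical score: estimated loss decrease after applying the update,
    # minus a penalty proportional to the squared norm of the update.
    return loss(x) - loss(x - lr * update) - rho * float(np.dot(update, update))

def rank_and_average(loss, x, updates, lr, rho, num_trimmed):
    # Score every candidate, sort from highest to lowest score, and
    # average the candidates that remain after dropping the lowest-ranked ones.
    scores = np.array([descent_score(loss, x, u, lr, rho) for u in updates])
    keep = np.argsort(-scores)[: len(updates) - num_trimmed]
    aggregated = np.mean([updates[i] for i in keep], axis=0)
    return aggregated, keep

# Toy problem: quadratic loss, whose true gradient at x is x itself.
def quadratic_loss(x):
    return 0.5 * float(np.dot(x, x))

x0 = np.array([1.0, 1.0])
honest = x0.copy()            # one non-faulty worker sends the true gradient
faulty = -10.0 * x0           # faulty workers send a large update in the wrong direction
candidates = [honest, faulty, faulty.copy(), faulty.copy()]

aggregated, kept = rank_and_average(
    quadratic_loss, x0, candidates, lr=0.1, rho=0.01, num_trimmed=3
)
print("kept worker indices:", kept.tolist())
print("aggregated update:", aggregated)
```

In this toy setting three of the four workers are faulty, yet the single non-faulty update ranks first because it is the only one that decreases the loss. In practice the loss would be estimated on a small batch, so the ranking would be noisy.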