Privacy and Byzantine resilience (BR) are two crucial requirements of
modern-day distributed machine learning. The two concepts have been extensively
studied individually but the question of how to combine them effectively
remains unanswered. This paper contributes to addressing this question by
studying the extent to which the distributed SGD algorithm, in the standard
parameter-server architecture, can learn an accurate model despite (a) a
fraction of the workers being malicious (Byzantine), and (b) the other
fraction, whilst being honest, providing noisy information to the server to
ensure differential privacy (DP). We first observe that the integration of
standard practices in DP and BR is not straightforward. In fact, we show that
many existing results on the convergence of distributed SGD under Byzantine
faults, especially those relying on $(\alpha,f)$-Byzantine resilience, are
rendered invalid when honest workers enforce DP. To circumvent this
shortcoming, we revisit the theory of $(\alpha,f)$-BR to obtain an approximate
convergence guarantee. Our analysis provides key insights on how to improve
this guarantee through hyperparameter optimization. Essentially, our
theoretical and empirical results show that (1) an imprudent combination of
standard approaches to DP and BR might be fruitless, but (2) by carefully
re-tuning the learning algorithm, we can obtain reasonable learning accuracy
while simultaneously guaranteeing DP and BR.