Federated learning systems are vulnerable to attacks from malicious clients.
Because the central server cannot govern the behavior of its clients, a rogue
client may attack by sending malicious model updates to the server, either to
degrade learning performance or to mount targeted model poisoning (a.k.a.
backdoor) attacks. Timely detection of such malicious updates, and of the
attackers behind them, is therefore critically important. In this work, we
propose a new framework for robust federated learning in which the central
server learns to detect and remove malicious model updates using a powerful
detection model, enabling a targeted defense. We
evaluate our solution in both image classification and sentiment analysis tasks
with a variety of machine learning models. Experimental results show that our
solution yields robust federated learning that is resilient to both Byzantine
attacks and targeted model poisoning attacks.
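The detect-and-remove idea above can be sketched as a server-side aggregation step: score each client's update, drop the ones flagged as anomalous, and average the remainder. The distance-to-median scoring below is only a hypothetical stand-in for the learned detection model described in the abstract, whose details are not given here.

```python
import numpy as np

def filter_and_aggregate(updates, threshold=2.0):
    """Score client updates, drop flagged outliers, average the rest.

    `updates`: list of 1-D numpy arrays (flattened model updates).
    The robust z-score on distance-to-median is a simple stand-in
    detector, NOT the paper's learned detection model.
    Returns (aggregated_update, indices_of_removed_clients).
    """
    U = np.stack(updates)                # shape (n_clients, dim)
    median = np.median(U, axis=0)        # coordinate-wise median
    scores = np.linalg.norm(U - median, axis=1)
    # Median absolute deviation (MAD) gives a robust spread estimate.
    mad = np.median(np.abs(scores - np.median(scores))) + 1e-12
    z = (scores - np.median(scores)) / (1.4826 * mad)
    keep = z < threshold                 # clients below threshold survive
    return U[keep].mean(axis=0), np.where(~keep)[0]
```

A detector trained on past benign/malicious updates would replace the z-score here; the surrounding remove-then-average structure stays the same.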