Federated Learning (FL) is a collaborative machine learning approach allowing
participants to jointly train a model without having to share their private,
potentially sensitive local datasets with others. Despite its benefits, FL is
vulnerable to backdoor attacks, in which an adversary injects manipulated model
updates into the model aggregation process so that the resulting model will
provide targeted false predictions for specific adversary-chosen inputs.
Existing defenses that detect and filter out
malicious model updates consider only very specific and limited attacker
models, whereas defenses based on differential privacy-inspired noise injection
significantly degrade the benign performance of the aggregated model. To
address these deficiencies, we introduce FLAME, a defense framework that
estimates the amount of noise that is sufficient to eliminate backdoors
while maintaining model performance. To minimize
the required amount of noise, FLAME uses a model clustering and weight clipping
approach. Our evaluation of FLAME on several datasets stemming from application
areas including image classification, word prediction, and IoT intrusion
detection demonstrates that FLAME removes backdoors effectively with a
negligible impact on the benign performance of the models. Furthermore,
following the considerable attention that our research has received after its
presentation at USENIX SEC 2022, FLAME has become the subject of numerous
investigations proposing diverse attack methodologies in an attempt to
circumvent it. In response, we provide a comprehensive
analysis of these attempts. Our findings show that these papers (e.g., 3DFed
[36]) neither fully comprehend nor correctly apply the fundamental
principles underlying FLAME, and that our defense mechanism effectively repels
the attempted attacks.
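
The filter, clip, and noise pipeline described above can be illustrated with a minimal sketch. It is not the FLAME implementation: the function name, the cosine-to-median filter (a simple stand-in for the model clustering step), the median-norm clipping bound, and the `noise_factor` parameter are all assumptions made for illustration, and models are treated as flat NumPy vectors.

```python
import numpy as np

def filter_clip_noise_aggregate(global_w, local_ws, noise_factor=0.001, rng=None):
    """Hypothetical sketch of one aggregation round: filter, clip, add noise."""
    rng = np.random.default_rng() if rng is None else rng
    global_w = np.asarray(global_w, dtype=float)
    # Each update is the difference between a local model and the global model.
    updates = np.stack([np.asarray(w, dtype=float) - global_w for w in local_ws])
    norms = np.linalg.norm(updates, axis=1)

    # Filtering (assumed stand-in for clustering): keep updates that point in
    # roughly the same direction as the coordinate-wise median update.
    reference = np.median(updates, axis=0)
    ref_norm = np.linalg.norm(reference)
    cosine = updates @ reference / np.maximum(norms * ref_norm, 1e-12)
    keep = cosine > 0.0
    if not keep.any():
        return global_w.copy(), keep

    # Clipping (assumed bound): scale each kept update down to at most the
    # median L2 norm of all submitted updates.
    bound = np.median(norms)
    scale = np.minimum(1.0, bound / np.maximum(norms[keep], 1e-12))
    clipped = updates[keep] * scale[:, None]

    # Noising (assumed rule): Gaussian noise whose standard deviation is
    # proportional to the clipping bound.
    sigma = noise_factor * bound
    noise = rng.normal(0.0, sigma, size=global_w.shape)
    return global_w + clipped.mean(axis=0) + noise, keep
```

In this sketch, filtering and clipping limit how far any single update can move the aggregate, which is why a small noise level can suffice; the specific thresholds here are placeholders rather than values taken from the paper.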