Abstract
Federated learning (FL) has attracted growing interest for enabling
privacy-preserving machine learning on data stored across multiple users'
devices without moving the data off-device. However, while data never leaves users'
devices, privacy still cannot be guaranteed since significant computations on
users' training data are shared in the form of trained local models. These
local models have recently been shown to pose a substantial privacy threat
through different privacy attacks such as model inversion attacks. As a remedy,
Secure Aggregation (SA) has been developed as a framework to preserve privacy
in FL, by guaranteeing the server can only learn the global aggregated model
update but not the individual model updates. While SA ensures no additional
information is leaked about the individual model update beyond the aggregated
model update, there are no formal guarantees on how much privacy FL with SA can
actually offer, since information about an individual user's dataset can still
leak through the aggregated model computed at the server. In this
work, we perform a first analysis of the formal privacy guarantees for FL with
SA. Specifically, we use Mutual Information (MI) as a quantification metric and
derive upper bounds on how much information about each user's dataset can leak
through the aggregated model update. When using the FedSGD aggregation
algorithm, our theoretical bounds show that the amount of privacy leakage
reduces linearly with the number of users participating in FL with SA. To
validate our theoretical bounds, we use an MI Neural Estimator to empirically
evaluate the privacy leakage under different FL setups on both the MNIST and
CIFAR10 datasets. Our experiments confirm our theoretical bounds for FedSGD:
privacy leakage decreases as the number of users and the local batch size
grow, and increases with the number of training rounds.
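
To make the SA guarantee described above concrete, the following is a minimal sketch of additive pairwise masking, one common way to let a server recover only the sum of user updates. It is an illustration under stated assumptions, not the protocol analyzed in this work: the function names `pairwise_masks` and `secure_aggregate` are hypothetical, the masks are real-valued Gaussian noise rather than elements of a finite field, and a single seeded generator stands in for the pairwise shared secrets that users would agree on in practice.

```python
import numpy as np

def pairwise_masks(num_users, dim, seed=0):
    """Hypothetical helper: build one mask per user so that all masks sum to zero."""
    rng = np.random.default_rng(seed)
    masks = np.zeros((num_users, dim))
    for i in range(num_users):
        for j in range(i + 1, num_users):
            # Stand-in for a secret shared only by users i and j (assumption).
            shared = rng.normal(size=dim)
            masks[i] += shared  # user i adds the shared mask
            masks[j] -= shared  # user j subtracts it, so the pair cancels in the sum
    return masks

def secure_aggregate(updates, seed=0):
    """Return the masked per-user updates and their sum as seen by the server."""
    updates = np.asarray(updates, dtype=float)
    num_users, dim = updates.shape
    masked = updates + pairwise_masks(num_users, dim, seed=seed)
    # The server only adds up masked updates; the masks cancel in the total.
    return masked, masked.sum(axis=0)

# Toy example: 4 users, each holding a 3-dimensional model update.
updates = np.arange(12, dtype=float).reshape(4, 3)
masked, total = secure_aggregate(updates)
print(np.allclose(total, updates.sum(axis=0)))
```

In this sketch each masked update on its own looks like noise to the server, yet the aggregate matches the true sum up to floating-point rounding. The aggregate itself is exactly the quantity whose leakage about any single user's dataset the MI bounds in this work aim to quantify.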