Abstract
Federated Learning (FL) is a widespread approach that allows training machine
learning (ML) models with data distributed across multiple devices. In
cross-silo FL, which often appears in domains like healthcare or finance, the
number of participants is moderate, and each party typically represents a
well-known organization. For instance, in medicine, data owners are often
hospitals or data hubs, which are well-established entities. However, malicious
parties may still attempt to disturb the training procedure in order to obtain
certain benefits, for example, a biased result or a reduction in computational
load. While one can easily detect a malicious agent when data used for training
is public, the problem becomes much more acute when it is necessary to maintain
the privacy of the training dataset. To address this issue, interest has
recently grown in developing verifiable protocols, where one can check that
parties do not deviate from the training procedure and perform computations
correctly. In this paper, we present a systematization of knowledge on
verifiable cross-silo FL. We analyze various protocols, fit them into a taxonomy,
and compare their efficiency and threat models. We also analyze Zero-Knowledge
Proof (ZKP) schemes and discuss how their overall cost in an FL context can be
minimized. Lastly, we identify research gaps and discuss potential directions
for future scientific work.