Organizations are increasingly recognizing the value of data collaboration
for data analytics purposes. Yet, stringent data protection laws prohibit the
direct exchange of raw data. To facilitate data collaboration, federated
learning (FL) has emerged as a viable solution that enables multiple clients to
collaboratively train a machine learning (ML) model under the supervision of a
central server while ensuring the confidentiality of their raw data. However,
existing studies have unveiled two main risks: (i) the potential for the server
to infer sensitive information from clients' uploaded updates (i.e., model
gradients), compromising client input privacy, and (ii) the risk of malicious
clients uploading malformed updates to poison the FL model, compromising input
integrity. Recent works utilize secure aggregation with zero-knowledge proofs
(ZKPs) to guarantee input privacy and integrity in FL. Nevertheless, they suffer
from extremely low efficiency and, thus, are impractical for real deployment.
In this paper, we propose a novel and highly efficient solution, RiseFL, for
secure and verifiable data collaboration, ensuring input privacy and integrity
simultaneously. Firstly, we devise a probabilistic integrity check method that
significantly reduces the cost of ZKP generation and verification. Secondly, we
design a hybrid commitment scheme to achieve Byzantine robustness with improved
performance. Thirdly, we theoretically prove the security guarantee of the
proposed solution. Extensive experiments on synthetic and real-world datasets
suggest that our solution is effective and highly efficient in both client
computation and communication. For instance, RiseFL is up to 28x, 53x, and 164x
faster in client computation than three state-of-the-art baselines, ACORN, RoFL,
and EIFFeL, respectively.