CADRE: Customizable Assurance of Data Readiness in Privacy-Preserving Federated Learning

Authors: Kaveen Hiniduma, Zilinghan Li, Aditya Sinha, Ravi Madduri, Suren Byna | Published: 2025-05-28 | Updated: 2025-08-11

Source: https://arxiv.org/abs/2505.23849

PDF: https://arxiv.org/pdf/2505.23849

Labels Predicted by AI

Data Management System Framework Imbalanced Dataset

Abstract

Privacy-Preserving Federated Learning (PPFL) is a decentralized machine learning approach in which multiple clients collaboratively train a model without exchanging their data, preserving the privacy and security of each client’s data. However, because data access is restricted, ensuring that the data at each client is of high quality and ready for federated learning (FL) is a challenge. In this paper, we introduce CADRE (Customizable Assurance of Data Readiness), a novel framework for FL that allows users to define custom data readiness (DR) metrics, rules, and remedies tailored to specific FL tasks. CADRE generates comprehensive DR reports based on these user-defined metrics, rules, and remedies to ensure datasets are prepared for FL while preserving privacy. We demonstrate a practical application of CADRE by integrating it into an existing PPFL framework. We conducted experiments across six datasets and addressed seven different DR issues. The results illustrate the versatility and effectiveness of CADRE in ensuring DR across various dimensions, including data quality, privacy, and fairness. This approach enhances the performance and reliability of FL models and makes effective use of valuable resources.
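
To make the metric, rule, and remedy idea concrete, the following is a minimal sketch of how a client-side readiness check for class imbalance might look. It is not CADRE's actual API: the names `ReadinessCheck`, `imbalance_ratio`, `oversample`, and `run_check`, the toy record type, and the threshold of 1.5 are all assumptions made for illustration. The check runs locally and the report contains only summary values, so no raw records would need to leave the client.

```python
import random
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, int]  # hypothetical toy record: (feature, label)


@dataclass
class ReadinessCheck:
    """Hypothetical bundle of one metric, one rule, and one remedy."""
    name: str
    metric: Callable[[List[Sample]], float]          # measures a DR property
    rule: Callable[[float], bool]                    # True if the metric is acceptable
    remedy: Callable[[List[Sample]], List[Sample]]   # applied when the rule fails


def imbalance_ratio(data: List[Sample]) -> float:
    """Metric: size of the largest class divided by size of the smallest."""
    counts = Counter(label for _, label in data)
    return max(counts.values()) / min(counts.values())


def oversample(data: List[Sample], seed: int = 0) -> List[Sample]:
    """Remedy: randomly duplicate minority-class samples up to the majority size."""
    rng = random.Random(seed)
    by_label: Dict[int, List[Sample]] = {}
    for sample in data:
        by_label.setdefault(sample[1], []).append(sample)
    target = max(len(samples) for samples in by_label.values())
    balanced: List[Sample] = []
    for samples in by_label.values():
        balanced.extend(samples)
        balanced.extend(rng.choice(samples) for _ in range(target - len(samples)))
    return balanced


def run_check(check: ReadinessCheck, data: List[Sample]):
    """Evaluate the metric, apply the remedy if the rule fails, and build a report."""
    before = check.metric(data)
    passed = check.rule(before)
    if not passed:
        data = check.remedy(data)
    after = check.metric(data)
    # The report holds only aggregate values, never the records themselves.
    report = {
        "check": check.name,
        "metric_before": before,
        "passed_initially": passed,
        "metric_after": after,
        "passed_after": check.rule(after),
    }
    return data, report


# Usage: nine samples of class 0 and one sample of class 1 on a single client.
check = ReadinessCheck(
    name="class_imbalance",
    metric=imbalance_ratio,
    rule=lambda ratio: ratio <= 1.5,  # assumed threshold, chosen for illustration
    remedy=oversample,
)
client_data = [(0.1 * i, 0) for i in range(9)] + [(1.0, 1)]
fixed_data, report = run_check(check, client_data)
print(report)
```

In this toy run the rule fails on the original data (ratio 9.0) and passes after the remedy (ratio 1.0). Keeping the metric, rule, and remedy as separate callables is one way to let each be swapped independently per task, which mirrors the customizability the abstract describes.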