BlindFL: Vertical Federated Machine Learning without Peeking into Your Data

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2206.07975

PDF

https://arxiv.org/pdf/2206.07975

Paper Information

Authors: Fangcheng Fu; Huanran Xue; Yong Cheng; Yangyu Tao; Bin Cui
Published: 2022-06-16
Affiliation: Peking University
Country: China
Conference: SIGMOD Conference

Labels Estimated by AI

Multi-Party Computation Privacy Enhancing Protocol Algorithm

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Due to rising concerns about privacy protection, how to build machine learning (ML) models over different data sources with security guarantees is attracting growing attention. Vertical federated learning (VFL) describes the case where ML models are built upon the private data of different participating parties that own disjoint features for the same set of instances, which fits many real-world collaborative tasks. Nevertheless, we find that existing solutions for VFL either support limited kinds of input features or suffer from potential data leakage during the federated execution. To this end, this paper investigates both the functionality and security of ML models in the VFL scenario. Specifically, we introduce BlindFL, a novel framework for VFL training and inference. First, to address the functionality of VFL models, we propose federated source layers to unite the data from different parties. The federated source layers efficiently support various kinds of features, including dense, sparse, numerical, and categorical features. Second, we carefully analyze security during the federated execution and formalize the privacy requirements. Based on the analysis, we devise secure and accurate algorithm protocols, and further prove the security guarantees under the ideal-real simulation paradigm. Extensive experiments show that BlindFL supports diverse datasets and models efficiently while achieving robust privacy guarantees.
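To make the vertical setting in the abstract concrete, the sketch below shows two parties that hold disjoint feature columns for the same instances, each computing a partial linear score that is then summed. This is only a plaintext illustration of the data layout, not BlindFL's protocol: it applies no cryptographic protection to the intermediate values, which a secure framework would need to do. All names and numbers (party_a_features, partial_scores, the weights) are hypothetical.

```python
# Illustrative only: plaintext vertical partitioning with NO security.
# This is not the BlindFL protocol; every name and value here is hypothetical.

def partial_scores(features, weights):
    """Dot product of each instance's feature slice with the matching weights."""
    return [sum(x * w for x, w in zip(row, weights)) for row in features]

# The same three instances, split by feature column between two parties.
party_a_features = [[1.0, 0.0], [0.5, 2.0], [3.0, 1.0]]  # party A holds features 0-1
party_b_features = [[2.0], [1.0], [0.0]]                 # party B holds feature 2
party_a_weights = [0.5, -1.0]
party_b_weights = [2.0]

# Each party works only on its own columns.
scores_a = partial_scores(party_a_features, party_a_weights)
scores_b = partial_scores(party_b_features, party_b_weights)

# Summing the partial scores gives the score over the concatenated features.
joint_scores = [a + b for a, b in zip(scores_a, scores_b)]

# Reference computation on the (hypothetically) centralized data.
full_features = [ra + rb for ra, rb in zip(party_a_features, party_b_features)]
full_weights = party_a_weights + party_b_weights
centralized_scores = partial_scores(full_features, full_weights)

print(joint_scores == centralized_scores)
```

The decomposition works because a linear score over concatenated features is the sum of the per-party scores. In this sketch the partial scores are combined in the clear, so it says nothing about how the paper's protocols protect them.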

External Datasets

a9a

w8a

connect-4

news20

higgs

avazu-app

industry