Abstract
Vertical federated learning (VFL) enables multiple parties with disjoint
features of a common user set to train a machine learning model without sharing
their private data. Tree-based models have become prevalent in VFL due to their
interpretability and efficiency. However, the vulnerability of tree-based VFL
has not been sufficiently investigated. In this study, we first introduce a
novel label inference attack, ID2Graph, which utilizes the sets of record IDs
assigned to each node (i.e., instance space) to deduce private training labels.
The ID2Graph attack generates a graph structure from training samples, extracts
communities from the graph, and clusters the local dataset using community
information. To counteract label leakage from the instance space, we propose
two effective defense mechanisms: Grafting-LDP, which improves the utility of
label differential privacy with post-processing, and ID-LMID, which focuses
on mutual information regularization. Comprehensive experiments on various
datasets reveal that ID2Graph presents significant risks to tree-based models
such as RandomForest and XGBoost. Further evaluations on these benchmarks
demonstrate that our defense methods effectively mitigate label leakage in such
instances.
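
The first two stages of the attack pipeline described above (graph generation from the instance space, then community extraction) can be illustrated with a minimal sketch. It is not the authors' implementation: the edge weighting (the number of tree nodes that two record IDs share), the `min_weight` threshold, and the use of connected components as a stand-in for community detection are all assumptions made for illustration, as are the function names. The final stage, clustering the local dataset with the community information, is omitted; the sketch only shows how a community index could be attached to each record as an extra feature.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(instance_spaces):
    """Weighted co-occurrence graph over record IDs.

    Assumption: the weight of edge (u, v) is the number of tree nodes
    whose instance space contains both u and v.
    """
    weights = defaultdict(int)
    for ids in instance_spaces:
        for u, v in combinations(sorted(ids), 2):
            weights[(u, v)] += 1
    return weights


def extract_communities(weights, record_ids, min_weight=2):
    """Stand-in for community detection (an assumption, not the paper's method):
    connected components over edges with weight >= min_weight, via union-find."""
    parent = {r: r for r in record_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (u, v), w in weights.items():
        if w >= min_weight:
            parent[find(u)] = find(v)

    groups = defaultdict(set)
    for r in record_ids:
        groups[find(r)].add(r)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


def community_feature(communities):
    """Map each record ID to its community index, usable as an extra feature."""
    return {r: idx for idx, members in enumerate(communities) for r in members}


# Toy instance spaces: the sets of record IDs assigned to each tree node.
instance_spaces = [{0, 1, 2}, {3, 4, 5}, {0, 1}, {1, 2}, {3, 4}, {4, 5}, {2, 3}]
record_ids = range(6)

weights = build_graph(instance_spaces)
communities = extract_communities(weights, record_ids)
feature = community_feature(communities)
```

In this toy example, records that repeatedly fall into the same nodes end up in the same community. An attacker following the pipeline in the abstract would then combine such community information with its local features before clustering.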