Federated learning has recently gained significant attention as a
privacy-enhancing tool that allows multiple parties to jointly train a machine
learning model. As a sub-category, vertical federated learning (vFL) focuses on
the scenario where features and labels are held by different parties. Prior
work on vFL has mostly studied how to protect label privacy during model
training. However, model evaluation in vFL can also leak private label
information. One mitigation strategy is to apply label differential privacy
(DP), but this yields poor estimates of the true (non-private) metrics. In this
work, we propose two evaluation algorithms that more accurately compute the
widely used AUC (area under the ROC curve) metric under label DP in vFL.
Through extensive experiments, we show that our algorithms achieve more
accurate AUC estimates than the baselines.
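
To illustrate the underlying problem, below is a minimal sketch (not the
paper's algorithms) of how randomized-response label DP biases a naively
computed AUC toward 0.5, and how that bias can be inverted when the flip
probability is known. The function names (randomized_response, debiased_auc)
and the closed-form correction are illustrative assumptions, not from the
paper.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def randomized_response(labels, epsilon, rng):
    """Flip each binary label with probability rho = 1 / (1 + e^eps),
    which satisfies eps-label-DP (keep probability e^eps / (1 + e^eps))."""
    rho = 1.0 / (1.0 + np.exp(epsilon))
    flips = rng.random(labels.shape) < rho
    return np.where(flips, 1 - labels, labels), rho

def debiased_auc(scores, noisy_labels, rho):
    """Invert the linear bias that symmetric label flipping induces on AUC."""
    # Debias the positive rate: E[noisy mean] = pi*(1-rho) + (1-pi)*rho.
    pi = (noisy_labels.mean() - rho) / (1.0 - 2.0 * rho)
    pi = np.clip(pi, 1e-6, 1 - 1e-6)
    # Posterior mixture weights:
    # a = P(true positive | noisy positive), b = P(true positive | noisy negative).
    a = (1 - rho) * pi / ((1 - rho) * pi + rho * (1 - pi))
    b = rho * pi / (rho * pi + (1 - rho) * (1 - pi))
    auc_noisy = roc_auc_score(noisy_labels, scores)
    # E[auc_noisy] = (a - b) * auc_true + c, where c collects same-class
    # comparisons (each contributing 1/2) and reversed cross-class pairs.
    c = (1 - a) * b + (a * b + (1 - a) * (1 - b)) / 2.0
    return (auc_noisy - c) / (a - b)

rng = np.random.default_rng(0)
n, eps = 100_000, 1.0
labels = (rng.random(n) < 0.3).astype(int)
scores = rng.normal(loc=labels.astype(float), scale=1.5)  # informative scores
noisy, rho = randomized_response(labels, eps, rng)

print("true AUC    :", roc_auc_score(labels, scores))
print("naive AUC   :", roc_auc_score(noisy, scores))   # shrunk toward 0.5
print("debiased AUC:", debiased_auc(scores, noisy, rho))
```

The correction works because each noisy class is a known mixture of true
positives and true negatives, so the expected noisy AUC is an affine function
of the true AUC that can be solved in closed form.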