Machine Learning with Privacy for Protected Attributes

Authors: Saeed Mahloujifar, Chuan Guo, G. Edward Suh, Kamalika Chaudhuri | Published: 2025-06-24

2025.06.24

Authors: Saeed Mahloujifar, Chuan Guo, G. Edward Suh, Kamalika Chaudhuri
Published: 2025-06-24

Source: https://arxiv.org/abs/2506.19836

PDF: https://arxiv.org/pdf/2506.19836

AIにより推定されたラベル

差分プライバシープライバシー保護データマイニングプライバシーと最適化

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

Differential privacy (DP) has become the standard for private data analysis. Certain machine learning applications only require privacy protection for specific protected attributes. Using naive variants of differential privacy in such use cases can result in unnecessary degradation of utility. In this work, we refine the definition of DP to create a more general and flexible framework that we call feature differential privacy (FDP). Our definition is simulation-based and allows for both addition/removal and replacement variants of privacy, and can handle arbitrary and adaptive separation of protected and non-protected features. We prove the properties of FDP, such as adaptive composition, and demonstrate its implications for limiting attribute inference attacks. We also propose a modification of the standard DP-SGD algorithm that satisfies FDP while leveraging desirable properties such as amplification via sub-sampling. We apply our framework to various machine learning tasks and show that it can significantly improve the utility of DP-trained models when public features are available. For example, we train diffusion models on the AFHQ dataset of animal faces and observe a drastic improvement in FID compared to DP, from 286.7 to 101.9 at ϵ = 8, assuming that the blurred version of a training image is available as a public feature. Overall, our work provides a new approach to private data analysis that can help reduce the utility cost of DP while still providing strong privacy guarantees.