Machine Learning with Privacy for Protected Attributes

arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2506.19836

PDF

https://arxiv.org/pdf/2506.19836

Paper Information

Authors: Saeed Mahloujifar, Chuan Guo, G. Edward Suh, Kamalika Chaudhuri
Published: 2025-06-25
Affiliation: FAIR at Meta
Country: United States of America
Conference: SP

Labels Estimated by AI

Differential Privacy, Privacy-Preserving Data Mining, Privacy and Optimization

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Differential privacy (DP) has become the standard for private data analysis. Certain machine learning applications only require privacy protection for specific protected attributes. Using naive variants of differential privacy in such use cases can result in unnecessary degradation of utility. In this work, we refine the definition of DP to create a more general and flexible framework that we call feature differential privacy (FDP). Our definition is simulation-based and allows for both addition/removal and replacement variants of privacy, and can handle arbitrary and adaptive separation of protected and non-protected features. We prove the properties of FDP, such as adaptive composition, and demonstrate its implications for limiting attribute inference attacks. We also propose a modification of the standard DP-SGD algorithm that satisfies FDP while leveraging desirable properties such as amplification via sub-sampling. We apply our framework to various machine learning tasks and show that it can significantly improve the utility of DP-trained models when public features are available. For example, we train diffusion models on the AFHQ dataset of animal faces and observe a drastic improvement in FID compared to DP, from 286.7 to 101.9 at $\epsilon=8$, assuming that the blurred version of a training image is available as a public feature. Overall, our work provides a new approach to private data analysis that can help reduce the utility cost of DP while still providing strong privacy guarantees.
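For orientation, here is a minimal sketch of one step of the standard DP-SGD baseline that the abstract says the authors modify: clip each per-example gradient, sum, add Gaussian noise, and average. It is not the FDP variant from the paper, whose details the abstract does not give. The function names (`clip_gradient`, `dp_sgd_step`) and the parameters (`clip_norm`, `noise_multiplier`) are illustrative assumptions, and the sketch does no privacy accounting.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Rescale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One step of standard DP-SGD (illustrative; not the paper's FDP variant).

    params            -- list of floats, current model parameters
    per_example_grads -- list of gradients, one list of floats per example
    rng               -- a random.Random instance used for the Gaussian noise
    """
    dim = len(params)
    batch_size = len(per_example_grads)

    # Sum the clipped per-example gradients; clipping bounds each
    # example's contribution to the sum by clip_norm.
    summed = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_gradient(grad, clip_norm)
        for i in range(dim):
            summed[i] += clipped[i]

    # Add Gaussian noise with standard deviation noise_multiplier * clip_norm
    # to each coordinate of the sum, then average over the batch.
    sigma = noise_multiplier * clip_norm
    noisy_mean = [(summed[i] + rng.gauss(0.0, sigma)) / batch_size for i in range(dim)]

    # Plain gradient-descent update using the noisy averaged gradient.
    return [params[i] - lr * noisy_mean[i] for i in range(dim)]


# Example call with two toy per-example gradients (values are made up).
rng = random.Random(0)
new_params = dp_sgd_step(
    params=[0.0, 0.0],
    per_example_grads=[[3.0, 4.0], [0.3, 0.4]],
    lr=0.1,
    clip_norm=1.0,
    noise_multiplier=1.0,
    rng=rng,
)
```

In this baseline the noise is calibrated to the clipping norm of the whole per-example gradient, so every feature of an example is treated as private. The abstract's claim is that restricting protection to the protected features can recover utility when the remaining features are public.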

External Datasets

AFHQ