The Connection between Out-of-Distribution Generalization and Privacy of ML Models

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2110.03369

PDF

https://arxiv.org/pdf/2110.03369

Paper Information

Authors: Divyat Mahajan, Shruti Tople, Amit Sharma
Published: 2021-10-07
Affiliation: Microsoft Research, India
Country: India
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Membership Inference; Privacy Violation; Robustness Evaluation

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

With the goal of generalizing to out-of-distribution (OOD) data, recent domain generalization methods aim to learn "stable" feature representations whose effect on the output remains invariant across domains. Given the theoretical connection between generalization and privacy, we ask whether better OOD generalization leads to better privacy for machine learning models, where privacy is measured through robustness to membership inference (MI) attacks. In general, we find that the relationship does not hold. Through extensive evaluation on a synthetic dataset and image datasets like MNIST, Fashion-MNIST, and Chest X-rays, we show that a lower OOD generalization gap does not imply better robustness to MI attacks. Instead, privacy benefits are based on the extent to which a model captures the stable features. A model that captures stable features is more robust to MI attacks than models that exhibit better OOD generalization but do not learn stable features. Further, for the same provable differential privacy guarantees, a model that learns stable features provides higher utility than the others. Our results offer the first extensive empirical study connecting stable features and privacy, and also have a takeaway for the domain generalization community: MI attacks can be used as a complementary metric to measure model quality.
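
The abstract compares two quantities per model: the OOD generalization gap and robustness to MI attacks. As a rough illustration only, the sketch below computes both from precomputed numbers. It uses a simple loss-threshold attack, a common baseline that is not necessarily the attack evaluated in the paper. The function names, the balanced-accuracy score, and the toy loss values are all assumptions made for this example.

```python
import numpy as np


def generalization_gap(train_accuracy, ood_accuracy):
    """Gap between accuracy on training domains and on an unseen domain."""
    return train_accuracy - ood_accuracy


def loss_threshold_mi_accuracy(member_losses, nonmember_losses):
    """Best balanced accuracy of a loss-threshold membership inference attack.

    The (hypothetical) attacker guesses "member" when a sample's loss is at
    or below a threshold. Every observed loss is tried as the threshold and
    the best balanced accuracy is returned. A value near 0.5 means the
    attack does no better than chance; a value near 1.0 means training
    members are easy to identify.
    """
    member_losses = np.asarray(member_losses, dtype=float)
    nonmember_losses = np.asarray(nonmember_losses, dtype=float)
    best = 0.0
    for threshold in np.concatenate([member_losses, nonmember_losses]):
        true_positive_rate = np.mean(member_losses <= threshold)
        true_negative_rate = np.mean(nonmember_losses > threshold)
        best = max(best, 0.5 * (true_positive_rate + true_negative_rate))
    return float(best)


# Toy per-sample losses (made up for illustration, not from the paper).
leaky_attack = loss_threshold_mi_accuracy([0.1, 0.2], [0.9, 1.0])
private_attack = loss_threshold_mi_accuracy([0.5, 0.5], [0.5, 0.5])
print("attack accuracy, separable losses:", leaky_attack)
print("attack accuracy, identical losses:", private_attack)
```

Reporting both numbers side by side for each trained model is one way to check the abstract's claim that a smaller gap does not by itself imply a lower attack accuracy.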