Abstract
With the goal of generalizing to out-of-distribution (OOD) data, recent
domain generalization methods aim to learn "stable" feature representations
whose effect on the output remains invariant across domains. Given the
theoretical connection between generalization and privacy, we ask whether
better OOD generalization leads to better privacy for machine learning models,
where privacy is measured through robustness to membership inference (MI)
attacks. In general, we find that the relationship does not hold. Through
extensive evaluation on a synthetic dataset and image datasets like MNIST,
Fashion-MNIST, and Chest X-rays, we show that a lower OOD generalization gap
does not imply better robustness to MI attacks. Instead, privacy benefits are
based on the extent to which a model captures the stable features. A model that
captures stable features is more robust to MI attacks than models that exhibit
better OOD generalization but do not learn stable features. Further, for the
same provable differential privacy guarantees, a model that learns stable
features provides higher utility than the others. Our results offer the
first extensive empirical study connecting stable features and privacy, and
also have a takeaway for the domain generalization community: MI attacks can be
used as a complementary metric to measure model quality.