Generative machine learning models are increasingly viewed as a way to
share sensitive data between institutions. While there has been work on
developing differentially private generative modeling approaches, these
approaches generally lead to sub-par sample quality, limiting their use in
real-world applications. Another line of work has focused on developing generative
models that produce higher-quality samples but currently lack any formal
privacy guarantees. In this work, we propose the first formal framework for
membership privacy estimation in generative models. We formulate the membership
privacy risk as a statistical divergence between training samples and hold-out
samples, and propose sample-based methods to estimate this divergence. Compared
to previous works, our framework makes more realistic and flexible assumptions.
First, we offer a generalizable metric as an alternative to the accuracy metric,
especially for imbalanced datasets. Second, we relax the assumption, made in
previous studies, of having full access to the underlying distribution, and propose
sample-based estimators with theoretical guarantees. Third, alongside
population-level estimation of membership privacy risk via the optimal membership
advantage, we offer individual-level estimation via the individual privacy
risk. Fourth, our framework allows adversaries to access the trained model via
a customized query, whereas prior works require specific attributes.
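
To make the idea of a sample-based divergence estimate concrete, the following is a minimal sketch and not the estimator proposed in this work. It assumes the adversary's query returns a scalar score in [0, 1] for each sample, and that members and non-members are equally likely a priori, in which case the best achievable gap between true-positive and false-positive rates equals the total variation distance between the two score distributions. The function name `tv_advantage_estimate`, the binning scheme, and the synthetic scores are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def tv_advantage_estimate(train_scores, holdout_scores, n_bins=10, lo=0.0, hi=1.0):
    """Plug-in estimate of the total variation distance between two score samples.

    Both samples are binned on a shared grid over [lo, hi]; the estimate is
    half the L1 distance between the two empirical histograms.
    """
    def histogram(scores):
        counts = Counter()
        for s in scores:
            # Clamp the score into [lo, hi], then map it to a bin index.
            frac = (min(max(s, lo), hi) - lo) / (hi - lo)
            idx = min(int(frac * n_bins), n_bins - 1)
            counts[idx] += 1
        n = len(scores)
        return [counts[b] / n for b in range(n_bins)]

    p = histogram(train_scores)
    q = histogram(holdout_scores)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Synthetic query outputs (assumed for illustration): training samples
# receive slightly higher scores than hold-out samples.
rng = random.Random(0)
train = [rng.betavariate(5, 3) for _ in range(2000)]
holdout = [rng.betavariate(4, 4) for _ in range(2000)]
print(tv_advantage_estimate(train, holdout))
```

A histogram plug-in of this kind tends to be biased upward on small samples, since sampling noise alone makes two empirical histograms differ, so the bin count should be kept small relative to the sample size.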