For small privacy parameter $\epsilon$, $\epsilon$-differential privacy (DP)
provides a strong worst-case guarantee that no membership inference attack
(MIA) can succeed at determining whether a person's data was used to train a
machine learning model. The guarantee of DP is worst-case because (a) it holds
even if the attacker already knows the records of all but one person in the
data set, and (b) it holds uniformly over all data sets. In practical
applications, such a worst-case guarantee may be overkill: practical attackers
may lack exact knowledge of (nearly all of) the private data, and our data set
might be easier to defend, in some sense, than the worst-case data set. Such
considerations have motivated the industrial deployment of DP models with a large
privacy parameter (e.g., $\epsilon \geq 7$), and it has been observed
empirically that DP with large $\epsilon$ can successfully defend against
state-of-the-art MIAs. Existing DP theory cannot explain these empirical
findings: for example, the theoretical privacy guarantee at $\epsilon \geq 7$ is
essentially vacuous. In this paper, we aim to close this gap between theory and
practice and understand why a large DP parameter can prevent practical MIAs. To
tackle this problem, we propose a new privacy notion called practical
membership privacy (PMP). PMP models a practical attacker's uncertainty about
the contents of the private data. The PMP parameter has a natural
interpretation in terms of the success rate of a practical MIA on a given data
set. We quantitatively analyze the PMP parameter of two fundamental DP
mechanisms: the exponential mechanism and the Gaussian mechanism. Our analysis
reveals that a large DP parameter often translates into a much smaller PMP
parameter, which guarantees strong privacy against practical MIAs. Using our
findings, we offer principled guidance for practitioners in choosing the DP
parameter.
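
To make the "essentially vacuous" claim concrete, the sketch below evaluates a standard textbook bound that is not stated in the abstract itself: under pure $\epsilon$-DP, an attacker with a balanced prior (member or non-member with probability 1/2 each) who knows all other records can guess membership correctly with probability at most $e^{\epsilon} / (1 + e^{\epsilon})$. The function name `mia_accuracy_bound` is hypothetical and chosen only for illustration.

```python
import math


def mia_accuracy_bound(epsilon: float) -> float:
    """Worst-case upper bound on balanced-prior MIA accuracy under pure epsilon-DP.

    Assumption (not from the source text): the attacker's prior on membership
    is 1/2, and the bound is the standard e^eps / (1 + e^eps).
    """
    # e^eps / (1 + e^eps) rewritten as a logistic function to avoid overflow
    # for large epsilon.
    return 1.0 / (1.0 + math.exp(-epsilon))


if __name__ == "__main__":
    for eps in (0.1, 1.0, 7.0):
        print(f"epsilon = {eps}: MIA accuracy <= {mia_accuracy_bound(eps):.4f}")
```

At $\epsilon = 7$ the bound is roughly 0.999, so the worst-case guarantee permits an almost perfect attack, whereas at $\epsilon = 0.1$ it stays close to random guessing at about 0.525.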
External datasets
MNIST