Abstract
Differential privacy (DP) is a widely used approach for mitigating privacy
risks when training machine learning models on sensitive data. DP mechanisms
add noise during training to limit the risk of information leakage. The scale
of the added noise is critical, as it determines the trade-off between privacy
and utility. The standard practice is to select the noise scale to satisfy a
given privacy budget $\varepsilon$. This privacy budget is in turn interpreted
in terms of operational attack risks, such as accuracy, sensitivity, and
specificity of inference attacks that aim to recover information about the
training data records. We show that first calibrating the noise scale to a
privacy budget $\varepsilon$, and then translating $\varepsilon$ to attack risk
leads to overly conservative risk assessments and unnecessarily low utility.
Instead, we propose methods to directly calibrate the noise scale to a desired
attack risk level, bypassing the step of choosing $\varepsilon$. For a given
notion of attack risk, our approach significantly decreases the noise scale,
leading to increased utility at the same level of privacy. We empirically
demonstrate that, when training privacy-preserving ML models, calibrating noise
to attack sensitivity/specificity rather than $\varepsilon$ substantially
improves model accuracy for the same risk level. Our work provides a principled
and practical way to improve the utility of privacy-preserving ML without
compromising on privacy. The code is available at
https://github.com/Felipe-Gomez/riskcal
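
The sketch below illustrates the core idea for the simplest setting, a single
release of the Gaussian mechanism: pick the noise standard deviation directly
from a target attack operating point (false positive rate and sensitivity/true
positive rate) via the Gaussian trade-off curve, and compare it with the
classical calibration to an $\varepsilon$ budget. This is an illustrative
assumption-laden example, not the riskcal interface; the function names, the
single-release setting, and the $\varepsilon$ comparison values are
hypothetical, and the paper's full method would additionally account for
composition over training steps.

```python
# Minimal sketch (not the riskcal API): calibrate Gaussian-mechanism noise
# directly to a target membership-inference operating point.
# For a single Gaussian release with L2 sensitivity Delta and noise sigma,
# the optimal attack's TPR at FPR = alpha is Phi(Phi^{-1}(alpha) + mu),
# where mu = Delta / sigma (Gaussian trade-off curve).
import math

from scipy.stats import norm


def noise_for_attack_risk(alpha: float, tpr: float, l2_sensitivity: float = 1.0) -> float:
    """Smallest noise std. dev. such that any attack with FPR <= alpha
    achieves TPR <= tpr against one release of the Gaussian mechanism."""
    if not (0.0 < alpha < tpr < 1.0):
        raise ValueError("Require 0 < alpha < tpr < 1 for a non-trivial target.")
    # Solve Phi(Phi^{-1}(alpha) + mu) <= tpr for the largest admissible mu.
    mu = norm.ppf(tpr) - norm.ppf(alpha)
    return l2_sensitivity / mu


def noise_for_epsilon(epsilon: float, delta: float, l2_sensitivity: float = 1.0) -> float:
    """Classical (loose) calibration via the textbook Gaussian-mechanism bound:
    sigma >= sqrt(2 ln(1.25/delta)) * sensitivity / epsilon."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon


if __name__ == "__main__":
    # Target: attacks operating at 5% FPR should not exceed 50% sensitivity.
    sigma_direct = noise_for_attack_risk(alpha=0.05, tpr=0.50)
    print(f"noise calibrated directly to attack risk: sigma = {sigma_direct:.3f}")

    # Illustrative comparison: calibrating to an epsilon budget instead
    # (epsilon and delta values are arbitrary examples).
    sigma_eps = noise_for_epsilon(epsilon=1.0, delta=1e-5)
    print(f"noise calibrated to epsilon = 1.0:        sigma = {sigma_eps:.3f}")
```

In this toy single-release setting the directly calibrated noise is
substantially smaller than the $\varepsilon$-calibrated one for a comparable
operational risk target, which is the qualitative effect the abstract
describes: lower noise for the same attack risk, and hence higher utility.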