In real-world settings involving consequential decision-making, the
deployment of machine learning systems generally requires both reliable
uncertainty quantification and protection of individuals' privacy. We present a
framework that treats these two desiderata jointly. Our framework is based on
conformal prediction, a methodology that augments predictive models to return
prediction sets that provide uncertainty quantification: they provably cover
the true response with a user-specified probability, such as 90%. One might
hope that, when used with privately trained models, conformal prediction would
yield privacy guarantees for the resulting prediction sets; unfortunately, this
is not the case. To remedy this problem, we develop a method that takes any
pre-trained predictive model and outputs differentially private prediction
sets. Our method follows the general approach of split conformal prediction; we
use holdout data to calibrate the size of the prediction sets but preserve
privacy by using a privatized quantile subroutine. This subroutine compensates
for the noise introduced to preserve privacy in order to guarantee correct
coverage. We evaluate the method on large-scale computer vision datasets.
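
The calibration step described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes classification scores of the form 1 - p(true label), swaps in a generic exponential mechanism over a fixed grid as the private quantile, and uses a made-up safety margin on the quantile level in place of the noise correction that the abstract refers to. The names `conformal_quantile`, `private_quantile`, `prediction_set`, and all parameter values are hypothetical.

```python
import numpy as np


def conformal_quantile(scores, alpha):
    """Non-private split conformal threshold computed from holdout scores."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank with finite-sample correction
    return np.inf if k > n else scores[k - 1]


def private_quantile(scores, level, epsilon, grid, rng):
    """Exponential-mechanism quantile over a fixed, data-independent grid.

    Assumption: a generic mechanism chosen for illustration, not
    necessarily the subroutine used in the paper.
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    grid = np.asarray(grid, dtype=float)
    # Number of holdout scores at or below each candidate threshold.
    counts = np.searchsorted(scores, grid, side="right")
    # Utility moves by at most 1 when a single record is replaced.
    utility = -np.abs(counts - level * len(scores))
    logits = epsilon * utility / 2.0
    probs = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return rng.choice(grid, p=probs)


def prediction_set(class_probs, threshold):
    """Labels whose score 1 - p(label) does not exceed the threshold."""
    return np.flatnonzero(1.0 - np.asarray(class_probs) <= threshold)


rng = np.random.default_rng(0)
holdout_scores = rng.uniform(size=1000)  # synthetic stand-in for real holdout scores
alpha = 0.1
# Hypothetical margin: the paper derives its own correction for the privacy noise.
level = min(1.0, (1 - alpha) + 0.02)
threshold = private_quantile(
    holdout_scores, level, epsilon=1.0, grid=np.linspace(0, 1, 101), rng=rng
)
print(prediction_set([0.7, 0.2, 0.1], threshold))
```

Restricting the candidates to a grid fixed in advance keeps the output space independent of the holdout data, which is what lets the exponential mechanism apply directly. Targeting a level slightly above 1 - alpha reflects the idea that the noisy quantile should err toward larger sets; the margin that actually guarantees coverage is not given in this text, so the value here should not be relied on.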
External datasets
CIFAR-10
ImageNet
CoronaHack