Abstract
This paper describes a differentially private post-processing algorithm for
learning fair regressors satisfying statistical parity, addressing privacy
concerns about machine learning models trained on sensitive data as well as
fairness concerns about their potential to propagate historical biases. Our
algorithm can be applied to post-process any given regressor to improve
fairness by remapping its outputs. It consists of three steps: first, the
output distributions are estimated privately via histogram density estimation
and the Laplace mechanism, then their Wasserstein barycenter is computed, and
the optimal transports to the barycenter are used for post-processing to
satisfy fairness. We analyze the sample complexity of our algorithm and
provide a fairness guarantee, revealing a trade-off between the statistical
bias and variance induced by the choice of the number of bins in the
histogram: using fewer bins always favors fairness at the expense of error.
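The three steps admit a compact one-dimensional illustration. Below is a minimal sketch in Python, under assumptions not stated in the abstract: regressor outputs bounded in [0, 1], a pure ε-DP budget per group, and function names of our own choosing rather than the paper's.

```python
import numpy as np

def private_histogram(scores, n_bins=20, eps=1.0, rng=None):
    # Step 1: histogram density estimate of outputs in [0, 1], privatized
    # with the Laplace mechanism (adding/removing one record changes one
    # bin count by 1, so per-bin noise with scale 1/eps gives eps-DP).
    rng = np.random.default_rng() if rng is None else rng
    counts, _ = np.histogram(scores, bins=n_bins, range=(0.0, 1.0))
    noisy = np.clip(counts + rng.laplace(scale=1.0 / eps, size=n_bins), 0, None)
    return noisy / noisy.sum()

def barycenter_quantiles(group_hists, weights, grid=1000):
    # Step 2: the Wasserstein barycenter of one-dimensional distributions
    # is obtained by averaging the groups' quantile functions on a grid.
    qs = np.linspace(0.0, 1.0, grid)
    edges = np.linspace(0.0, 1.0, len(group_hists[0]) + 1)
    curves = [np.interp(qs, np.concatenate([[0.0], np.cumsum(h)]), edges)
              for h in group_hists]
    return qs, np.average(curves, axis=0, weights=weights)

def remap(score, group_hist, qs, bary_q):
    # Step 3: the 1-D optimal transport map is the monotone rearrangement
    # T(x) = Q_bary(F_group(x)); after remapping, every group shares the
    # barycenter's output distribution, i.e., statistical parity holds.
    edges = np.linspace(0.0, 1.0, len(group_hist) + 1)
    cdf = np.concatenate([[0.0], np.cumsum(group_hist)])
    return np.interp(np.interp(score, edges, cdf), qs, bary_q)
```

With two groups, one would call private_histogram once per group, pass the results to barycenter_quantiles (weighted, e.g., by group sizes), and push each group's scores through remap. Coarser histograms (fewer bins) add statistical bias but, per the abstract, favor fairness.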
External Datasets
Communities & Crime
Law School
References
Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security
Deep Learning with Differential Privacy
Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang
Published: 2016
Proceedings of the 32nd International Conference on Algorithmic Learning Theory
On the Sample Complexity of Privately Learning Unbounded High-Dimensional Gaussians
Ishaq Aden-Ali, Hassan Ashtiani, Gautam Kamath
Published: 2021
International Conference on Machine Learning
A Reductions Approach to Fair Classification
Alekh Agarwal, Alina Beygelzimer, Miroslav Dudik, John Langford, Hanna Wallach
Published: 2018
Proceedings of the 36th International Conference on Machine Learning
Fair Regression: Quantitative Definitions and Reduction-Based Algorithms
Alekh Agarwal, Miroslav Dudík, Zhiwei Steven Wu
Published: 2019
IJCAI 2020 Workshop on AI for Social Good
Trade-Offs between Fairness and Privacy in Machine Learning
Theory of Cryptography Conference
Calibrating Noise to Sensitivity in Private Data Analysis
Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith
Published: 2006
Proceedings of the 3rd Innovations in Theoretical Computer Science Conference
Fairness Through Awareness
Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, Richard Zemel
Published: 2012
ACM Computing Surveys
Decision Tree Classification with Differential Privacy: A Survey
Sam Fletcher, Md. Zahidul Islam
Published: 2019
Proceedings of the Forty-Second ACM Symposium on Theory of Computing
On the Geometry of Differential Privacy
Moritz Hardt, Kunal Talwar
Published: 2010
Advances in Neural Information Processing Systems (NIPS)
Equality of Opportunity in Supervised Learning
Moritz Hardt, Eric Price, Nathan Srebro
Published: 2016
We propose a criterion for discrimination against a specified sensitive
attribute in supervised learning, where the goal is to predict some target
based on available features. Assuming data about the predictor, target, and
membership in the protected group are available, we show how to optimally
adjust any learned predictor so as to remove discrimination according to our
definition. Our framework also improves incentives by shifting the cost of poor
classification from disadvantaged groups to the decision maker, who can respond
by improving the classification accuracy.
In line with other studies, our notion is oblivious: it depends only on the
joint statistics of the predictor, the target and the protected attribute, but
not on interpretation of individual features. We study the inherent limits of
defining and identifying biases based on such oblivious measures, outlining
what can and cannot be inferred from different oblivious tests.
We illustrate our notion using a case study of FICO credit scores.
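Because the criterion is oblivious, it can be audited from joint statistics alone. As a minimal illustration (the equal-opportunity special case for a binary target), a sketch in Python with array names of our own choosing:

```python
import numpy as np

def true_positive_rates(y_true, y_pred, group):
    # Equal opportunity requires P(Yhat = 1 | Y = 1, A = a) to be equal
    # across groups; returns the per-group true positive rates to compare.
    return {a: float(y_pred[(group == a) & (y_true == 1)].mean())
            for a in np.unique(group)}
```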
Proceedings of the 37th International Conference on Machine Learning
Fair Learning with Private Demographic Data
Hussein Mozannar, Mesrob I. Ohannessian, Nathan Srebro
Published: 2020
The Tenth International Conference on Learning Representations
Hyperparameter Tuning with Renyi Differential Privacy
Nicolas Papernot, Thomas Steinke
Published: 2022
International Conference on Learning Representations (ICLR)
Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data
Nicolas Papernot, Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar
Published: 2016
Some machine learning applications involve training data that is sensitive,
such as the medical histories of patients in a clinical trial. A model may
inadvertently and implicitly store some of its training data; careful analysis
of the model may therefore reveal sensitive information.
To address this problem, we demonstrate a generally applicable approach to
providing strong privacy guarantees for training data: Private Aggregation of
Teacher Ensembles (PATE). The approach combines, in a black-box fashion,
multiple models trained with disjoint datasets, such as records from different
subsets of users. Because they rely directly on sensitive data, these models
are not published, but instead used as "teachers" for a "student" model. The
student learns to predict an output chosen by noisy voting among all of the
teachers, and cannot directly access an individual teacher or the underlying
data or parameters. The student's privacy properties can be understood both
intuitively (since no single teacher and thus no single dataset dictates the
student's training) and formally, in terms of differential privacy. These
properties hold even if an adversary can not only query the student but also
inspect its internal workings.
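The aggregation step admits a short sketch. A minimal version in Python, assuming the Laplace-noised plurality vote described above; the function name and the noise parameter `gamma` are our choices:

```python
import numpy as np

def noisy_vote(teacher_preds, n_classes, gamma=0.05, rng=None):
    # Count each class's votes among the teachers, perturb the counts with
    # Laplace noise, and return the noisy argmax; the student only ever
    # sees this label, never an individual teacher's prediction.
    rng = np.random.default_rng() if rng is None else rng
    votes = np.bincount(np.asarray(teacher_preds), minlength=n_classes)
    return int(np.argmax(votes + rng.laplace(scale=1.0 / gamma, size=n_classes)))
```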
Compared with previous work, the approach imposes only weak assumptions on
how teachers are trained: it applies to any model, including non-convex models
like DNNs. We achieve state-of-the-art privacy/utility trade-offs on MNIST and
SVHN thanks to an improved privacy analysis and semi-supervised learning.