We propose a criterion for discrimination against a specified sensitive
attribute in supervised learning, where the goal is to predict some target
based on available features. Assuming data about the predictor, target, and
membership in the protected group are available, we show how to optimally
adjust any learned predictor so as to remove discrimination according to our
definition. Our framework also improves incentives by shifting the cost of poor
classification from disadvantaged groups to the decision maker, who can respond
by improving the classification accuracy.
In line with other studies, our notion is oblivious: it depends only on the
joint statistics of the predictor, the target and the protected attribute, but
not on interpretation of individualfeatures. We study the inherent limits of
defining and identifying biases based on such oblivious measures, outlining
what can and cannot be inferred from different oblivious tests.
We illustrate our notion using a case study of FICO credit scores.