Crowdsourced data used in machine learning services may carry sensitive
information about attributes that users do not want to share. Various methods
have been proposed to minimize the potential leakage of sensitive attributes
while maximizing task accuracy, yet little is known about the theory behind
these methods. In light of this gap, we develop a novel theoretical framework
for attribute obfuscation. Under our framework, we propose a minimax
optimization formulation to protect the given attribute and analyze its
inference guarantees against worst-case adversaries.
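As a schematic sketch only (the notation here is ours for illustration: an
encoder f producing a representation Z = f(X), a task predictor g, an
adversary h, and a trade-off parameter lambda, none of which are fixed by this
abstract), such a minimax formulation typically takes the form:

    % Schematic: g predicts the task label Y from Z = f(X); the adversary h
    % tries to infer the sensitive attribute A from Z; \lambda \ge 0 trades
    % task accuracy against attribute leakage.
    \min_{f,\, g} \; \max_{h} \;
        \mathbb{E}\big[\ell\big(g(f(X)),\, Y\big)\big]
        \;-\; \lambda \, \mathbb{E}\big[\ell\big(h(f(X)),\, A\big)\big]

Minimizing over f and g while maximizing over h pits the representation
against the strongest attribute-inference attack, which is what makes
worst-case guarantees meaningful.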
Meanwhile, there is in general a tension between minimizing information
leakage and maximizing task accuracy. To understand this tension, we prove an
information-theoretic lower bound that precisely characterizes the fundamental
trade-off between accuracy and information leakage.
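For intuition only (a standard data-processing and Fano-style argument in our
own notation, not the exact bound proved in the paper), consider a binary
attribute A and a representation Z = f(X):

    % Data processing: any prediction \hat{Y} = g(Z) computed from Z can
    % reveal at most as much about A as Z itself does.
    I(\hat{Y};\, A) \;\le\; I(Z;\, A)
    % Fano's inequality (binary A): any adversary's error P_e in predicting A
    % from Z satisfies h_b(P_e) \ge H(A) - I(Z; A), with h_b the binary
    % entropy. Driving the leakage I(Z; A) to zero thus protects A, but by
    % the first inequality it also caps the achievable accuracy on any task
    % label Y that is correlated with A.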
We conduct experiments on two real-world datasets to corroborate the inference
guarantees and validate this trade-off. Our results indicate that, among
several alternatives, the adversarial learning approach achieves the best
trade-off between attribute obfuscation and accuracy maximization.