Users in various web and mobile applications are vulnerable to attribute
inference attacks, in which an attacker leverages a machine learning classifier
to infer a target user's private attributes (e.g., location, sexual
orientation, political views) from their public data (e.g., rating scores, page
likes). Existing defenses leverage game theory or heuristics based on
correlations between the public data and attributes. These defenses are not
practical. Specifically, game-theoretic defenses require solving intractable
optimization problems, while correlation-based defenses incur large utility
loss of users' public data.
In this paper, we present AttriGuard, a practical defense against attribute
inference attacks. AttriGuard is computationally tractable and has small
utility loss. AttriGuard works in two phases. Suppose we aim to protect a
user's private attribute. In Phase I, for each value of the attribute, we find
a minimum noise such that, if we add the noise to the user's public data, the
attacker's classifier is very likely to infer that attribute value for the
user. We find the minimum noise by adapting existing evasion attacks from
adversarial machine learning.
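To make Phase I concrete, the sketch below finds such a noise vector for a toy
linear classifier by iteratively stepping along the gradient that raises the
target value's score until the classifier predicts that value. The classifier,
step size, and greedy loop are illustrative assumptions, not the paper's
actual attack, which adapts existing evasion attacks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy attacker classifier: linear scores W x + b over the attribute values.
n_features, n_values = 20, 3
W = rng.normal(size=(n_values, n_features))
b = rng.normal(size=n_values)

def predict(x):
    """Attribute value the attacker's classifier infers from public data x."""
    return int(np.argmax(W @ x + b))

def find_min_noise(x, target, step=0.01, max_iter=1000):
    """Grow a small noise vector until the classifier predicts `target`."""
    noise = np.zeros_like(x)
    for _ in range(max_iter):
        current = predict(x + noise)
        if current == target:
            return noise  # classifier now infers the desired attribute value
        # For a linear classifier, the gradient of (target score - current
        # score) with respect to the input is simply W[target] - W[current].
        noise += step * (W[target] - W[current])
    return noise

x = rng.normal(size=n_features)                 # a user's public data vector
noises = {v: find_min_noise(x, v) for v in range(n_values)}  # one per value
```

The greedy loop returns a small noise rather than a provably minimal one; in
practice, the minimality of the noise would depend on the evasion attack being
adapted.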
In Phase II, we sample one attribute value according to a probability
distribution and add the corresponding noise found in Phase I to the user's
public data. We formulate finding this probability distribution as a
constrained convex optimization problem.
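As one illustration of Phase II, the sketch below picks a distribution q over
attribute values that minimizes the KL divergence to a uniform target subject
to a bound on the expected noise magnitude. The KL objective, uniform target,
toy costs, and budget are assumptions made for illustration; the abstract
states only that the problem is a constrained convex optimization.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed inputs: the norm of each Phase I noise vector (the utility cost of
# steering the classifier to each attribute value) and a utility-loss budget.
costs = np.array([0.6, 1.4, 0.9])                # ||noise_i|| per value (toy)
target = np.full(len(costs), 1.0 / len(costs))   # assumed uniform target
budget = 0.9                                     # assumed utility-loss budget

def kl_to_target(q, eps=1e-12):
    """KL divergence from q to the target distribution (convex in q)."""
    q = np.clip(q, eps, None)
    return float(np.sum(q * np.log(q / target)))

res = minimize(
    kl_to_target,
    x0=target,
    method="SLSQP",
    bounds=[(0.0, 1.0)] * len(costs),
    constraints=[
        {"type": "eq",   "fun": lambda q: q.sum() - 1.0},       # simplex
        {"type": "ineq", "fun": lambda q: budget - costs @ q},  # E[cost] <= budget
    ],
)
q = res.x / res.x.sum()
picked = np.random.default_rng(1).choice(len(q), p=q)  # sampled attribute value
# The defender then adds the Phase I noise for `picked` to the public data.
```

Keeping q close to a fixed target makes the attacker's inference close to a
guess from that target distribution, while the budget constraint caps the
expected utility loss.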
We extensively evaluate AttriGuard and compare it with existing methods on a
real-world dataset. Our results show that AttriGuard substantially outperforms
these methods. Our work is the first to show that evasion attacks can be used
as defensive techniques for privacy protection.