Abstract
Unlearnable example attacks are data poisoning attacks that aim to degrade the
clean test accuracy of deep learning models by adding imperceptible
perturbations to the training samples; such attacks can be formulated as a
bi-level optimization problem. However, directly solving this optimization
problem is intractable for deep neural networks. In this paper, we investigate
unlearnable example attacks from a game-theoretic perspective, formulating the
attack as a nonzero-sum Stackelberg game.
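In schematic form (the notation below is ours and not taken verbatim from the
paper: $\delta_i$ is the perturbation added to training sample $x_i$ under a
budget $\epsilon$, $\theta$ denotes the victim's parameters, and $\mathcal{L}$
is the loss), one common way to write this bi-level problem is

$$
\begin{aligned}
\max_{\{\delta_i :\, \|\delta_i\|_\infty \le \epsilon\}} \quad & \mathbb{E}_{(x,y)\sim \mathcal{D}_{\mathrm{test}}}\, \mathcal{L}\bigl(f_{\theta^*(\delta)}(x),\, y\bigr) \\
\text{s.t.} \quad & \theta^*(\delta) \in \arg\min_{\theta} \sum_{i} \mathcal{L}\bigl(f_{\theta}(x_i + \delta_i),\, y_i\bigr),
\end{aligned}
$$

where the attacker (leader) commits to the perturbations $\delta$ and the
victim (follower) responds by training on the poisoned data.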
First, the existence of game equilibria is proved under both the normal
training setting and the adversarial training setting. It is shown that, for
certain loss functions, the game equilibrium gives the most powerful poisoning
attack, in the sense that the victim model attains the lowest test accuracy
among all networks within the same hypothesis space. Second, we propose a
novel attack method, called the Game Unlearnable Example (GUE) attack, which
has three main ingredients. (1) The poisons are obtained by directly solving
the equilibrium of the Stackelberg game with a first-order algorithm (a
schematic sketch is given after this abstract). (2) We employ an
autoencoder-like generative
network model as the poison attacker. (3) A novel payoff function is introduced
to evaluate the performance of the poison. Comprehensive experiments
demonstrate that GUE effectively poisons victim models in a variety of
scenarios.
Furthermore, GUE remains effective even when only a relatively small fraction
of the training data is used to train the generator, and the poison generator
generalizes well to unseen data. Our implementation code can be found at
https://github.com/hong-xian/gue.
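To make ingredients (1)-(3) concrete, below is a minimal, hypothetical
PyTorch-style sketch of a first-order alternating loop for generator-based
poisoning. It is not the paper's exact GUE algorithm: the generator
architecture, the perturbation budget, the optimizers, and in particular the
payoff (here a simple error-minimizing surrogate) are illustrative stand-ins
for the paper's own choices.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PoisonGenerator(nn.Module):
    """Autoencoder-like network mapping an image to a bounded perturbation.

    The architecture and budget are illustrative, not the paper's."""
    def __init__(self, channels=3, epsilon=8 / 255):
        super().__init__()
        self.epsilon = epsilon
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x):
        # tanh keeps the perturbation inside the L-infinity ball of radius epsilon
        return self.epsilon * torch.tanh(self.net(x))

def stackelberg_step(generator, victim, gen_opt, victim_opt, x, y, inner_steps=5):
    """One first-order leader/follower update on a batch (x, y)."""
    # Follower (victim): minimize the training loss on the poisoned batch.
    # detach() stops the follower's updates from flowing into the generator.
    for _ in range(inner_steps):
        victim_opt.zero_grad()
        follower_loss = F.cross_entropy(victim(x + generator(x).detach()), y)
        follower_loss.backward()
        victim_opt.step()
    # Leader (attacker): update the generator through the victim's loss.
    # Placeholder payoff: error-minimizing noise; the paper introduces its
    # own payoff function in place of this surrogate.
    gen_opt.zero_grad()
    leader_loss = F.cross_entropy(victim(x + generator(x)), y)
    leader_loss.backward()
    gen_opt.step()

# Example wiring (victim can be any image classifier returning logits):
#   gen = PoisonGenerator()
#   victim = torchvision.models.resnet18(num_classes=10)
#   gen_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
#   victim_opt = torch.optim.SGD(victim.parameters(), lr=0.1)

Training would iterate stackelberg_step over the training set; the generator's
outputs, added to the clean images, then form the unlearnable training set.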