Adaptive Generation of Unrestricted Adversarial Inputs

AIにより推定されたラベル
Abstract

Neural networks are vulnerable to adversarially-constructed perturbations of their inputs. Most research so far has considered perturbations of a fixed magnitude under some lp norm. Although studying these attacks is valuable, there has been increasing interest in the construction of (and robustness to) unrestricted attacks, which are not constrained to a small and rather artificial subset of all possible adversarial inputs. We introduce a novel algorithm for generating such unrestricted adversarial inputs which, unlike prior work, is adaptive: it is able to tune its attacks to the classifier being targeted. It also offers a 400-2,000x speedup over the existing state of the art. We demonstrate our approach by generating unrestricted adversarial inputs that fool classifiers robust to perturbation-based attacks. We also show that, by virtue of being adaptive and unrestricted, our attack is able to defeat adversarial training against it.

タイトルとURLをコピーしました