Abstract
In adversarial machine learning, the popular ℓ∞ threat model has been the focus of much previous work. While this mathematical definition of imperceptibility successfully captures an infinite set of additive image transformations that a model should be robust to, it covers only a subset of all transformations that leave the semantic label of an image unchanged. Indeed, previous work also considered robustness to spatial attacks as well as other semantic transformations; however, designing defense methods against the composition of spatial and ℓ∞ perturbations remains relatively underexplored. In the following, we improve the understanding of this seldom-investigated compositional setting. We prove theoretically that no linear classifier can achieve more than trivial accuracy against a composite adversary in a simple statistical setting, illustrating its difficulty. We then investigate how state-of-the-art ℓ∞ defenses can be adapted to this novel threat model and study their performance against compositional attacks. We find that our newly proposed TRADESAll strategy performs best. Analyzing the Lipschitz constant of its logits with respect to RT transformations of different magnitudes, we find that TRADESAll remains stable over a wide range of RT transformations, both with and without ℓ∞ perturbations.
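
To make the compositional threat model concrete, below is a minimal sketch of evaluating a classifier against a composite adversary that combines a spatial transformation with an ℓ∞ perturbation. It assumes RT denotes rotation-translation, that the model is a PyTorch image classifier in eval mode with inputs in [0, 1], and that the attack searches a small RT grid and runs PGD on each transformed input; all names, grid sizes, and parameter values (epsilon, step size, number of steps) are illustrative and not taken from the paper.

```python
import math
import torch
import torch.nn.functional as F


def rt_transform(x, angle_deg, tx, ty):
    """Rotate x by angle_deg degrees and translate by (tx, ty) pixels.

    Uses an inverse-warp affine grid in normalized [-1, 1] coordinates,
    so the rotation is about the image center.
    """
    b, _, h, w = x.shape
    theta = math.radians(angle_deg)
    mat = torch.tensor([[math.cos(theta), -math.sin(theta), 2.0 * tx / w],
                        [math.sin(theta),  math.cos(theta), 2.0 * ty / h]],
                       dtype=x.dtype, device=x.device)
    grid = F.affine_grid(mat.unsqueeze(0).expand(b, -1, -1),
                         list(x.shape), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)


def linf_pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard l_inf PGD applied on top of an (already transformed) input."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = ((delta + alpha * grad.sign())
                 .clamp(-eps, eps).detach().requires_grad_(True))
    return (x + delta).clamp(0, 1).detach()


def composite_attack(model, x, y, angles=(-10, 0, 10), shifts=(-2, 0, 2)):
    """Worst case over an RT grid, each candidate composed with l_inf PGD."""
    worst = x.clone()
    worst_loss = torch.full((x.size(0),), -float("inf"), device=x.device)
    for a in angles:
        for tx in shifts:
            for ty in shifts:
                x_adv = linf_pgd(model, rt_transform(x, a, tx, ty), y)
                loss = F.cross_entropy(model(x_adv), y, reduction="none")
                better = loss > worst_loss
                worst[better] = x_adv[better]
                worst_loss = torch.maximum(worst_loss, loss)
    return worst
```

Per-sample bookkeeping of the worst loss keeps, for every image in the batch, the RT-plus-ℓ∞ candidate that most increases the loss; robust accuracy under the composite threat model is then the model's accuracy on the returned batch.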