Abstract
Deep neural networks perform well on real-world data but are prone to adversarial perturbations: small changes to the input easily lead to misclassification. In this work, we propose an attack methodology not only for cases where the perturbations are measured by ℓp norms, but for any adversarial dissimilarity metric with a closed proximal form. This includes, but is not limited to, ℓ1, ℓ2, and ℓ∞ perturbations; the ℓ0 counting “norm” (i.e. true sparseness); and the total variation seminorm, which is a (non-ℓp) convolutional dissimilarity measuring local pixel changes. Our approach is a natural extension of a recent adversarial attack method, and eliminates the differentiability requirement on the metric. We demonstrate our algorithm, ProxLogBarrier, on the MNIST, CIFAR10, and ImageNet-1k datasets. We consider both undefended and defended models, and show that our algorithm transfers easily across datasets. We observe that ProxLogBarrier outperforms a host of modern adversarial attacks specialized for the ℓ0 case. Moreover, by altering images in the total variation seminorm, we shed light on a new class of perturbations that exploit neighboring pixel information.
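The key requirement above is that the dissimilarity metric admits a closed-form proximal operator. As a rough illustration only (not the paper's ProxLogBarrier implementation), the NumPy sketch below shows closed-form proximal operators for three of the metrics mentioned in the abstract: ℓ1 (soft-thresholding), ℓ0 (hard-thresholding), and ℓ∞ (via Moreau decomposition and projection onto the ℓ1 ball). The function names and the projection helper are hypothetical, introduced here purely for exposition.

```python
import numpy as np

def prox_l1(v, t):
    # prox of t*||.||_1: soft-thresholding, shrinks each entry toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_l0(v, t):
    # prox of t*||.||_0: hard-thresholding, zeroes entries with v_i^2 < 2t.
    out = v.copy()
    out[v ** 2 < 2.0 * t] = 0.0
    return out

def project_l1_ball(v, radius=1.0):
    # Euclidean projection onto the l1 ball of the given radius.
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    cssv = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (cssv - radius))[0][-1]
    theta = (cssv[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def prox_linf(v, t):
    # prox of t*||.||_inf via Moreau decomposition:
    # prox_{t f}(v) = v - t * prox_{f*}(v / t), where f* is the indicator
    # of the unit l1 ball, whose prox is the projection onto that ball.
    return v - t * project_l1_ball(v / t, radius=1.0)

if __name__ == "__main__":
    v = np.array([0.8, -0.1, 0.05, -2.0])
    print(prox_l1(v, 0.2))    # small entries vanish, large ones shrink by 0.2
    print(prox_l0(v, 0.02))   # entries with |v_i| < 0.2 are set exactly to zero
    print(prox_linf(v, 0.5))  # largest entries are pulled toward a common magnitude
```

In a proximal-gradient-style attack, operators like these would replace the gradient of the dissimilarity term, which is how the differentiability requirement on the metric can be dropped; the exact update rule used by ProxLogBarrier is given in the paper itself.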