Regional Image Perturbation Reduces $L_p$ Norms of Adversarial Examples While Maintaining Model-to-model Transferability

Source

https://arxiv.org/abs/2007.03198

PDF

https://arxiv.org/pdf/2007.03198

Paper Information

Authors: Utku Ozbulak, Jonathan Peck, Wesley De Neve, Bart Goossens, Yvan Saeys, Arnout Van Messem
Published: 7-7-2020
Updated: 7-18-2020
Affiliation: Department of Electronics and Information Systems, Ghent University
Country: Belgium
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Adversarial Learning, Adversarial Example, Attack Pattern Extraction

These labels were automatically added by AI and may be inaccurate.

Abstract

Regional adversarial attacks often rely on complicated methods for generating adversarial perturbations, making it hard to compare their efficacy against well-known attacks. In this study, we show that effective regional perturbations can be generated without resorting to complex methods. We develop a very simple regional adversarial perturbation attack method using cross-entropy sign, one of the most commonly used losses in adversarial machine learning. Our experiments on ImageNet with multiple models reveal that, on average, $76\%$ of the generated adversarial examples maintain model-to-model transferability when the perturbation is applied to local image regions. Depending on the selected region, these localized adversarial examples require significantly less $L_p$ norm distortion (for $p \in \{0, 2, \infty\}$) compared to their non-local counterparts. These localized attacks therefore have the potential to undermine defenses that claim robustness under the aforementioned norms.
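To make the idea of a regional sign-gradient perturbation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy two-class linear softmax classifier on a flattened 4x4 "image", a single perturbation step, and a hypothetical centre-square mask; the names `cross_entropy_grad`, `regional_sign_step`, `lp_norms`, and all numeric values are illustrative assumptions. It applies the sign of the cross-entropy gradient only where the mask is 1 and compares the resulting $L_0$, $L_2$, and $L_\infty$ distortions with those of a full-image perturbation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_grad(weights, x, label):
    # Gradient of the cross-entropy loss w.r.t. the input x for a toy
    # linear model with logits = W x (an assumption made for illustration).
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    probs = softmax(logits)
    residual = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    return [sum(residual[k] * weights[k][i] for k in range(len(weights)))
            for i in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def regional_sign_step(x, grad, mask, eps):
    # One sign-gradient step, applied only where mask == 1,
    # with pixel values clipped to the valid range [0, 1].
    return [min(1.0, max(0.0, xi + eps * mi * sign(gi)))
            for xi, gi, mi in zip(x, grad, mask)]

def lp_norms(x, x_adv):
    # L_0 (changed pixels), L_2, and L_inf norms of the perturbation.
    delta = [a - b for a, b in zip(x_adv, x)]
    l0 = sum(1 for d in delta if d != 0.0)
    l2 = math.sqrt(sum(d * d for d in delta))
    linf = max(abs(d) for d in delta)
    return l0, l2, linf

# Hypothetical data: a flat grey 4x4 image and fixed two-class weights.
n = 16
x = [0.5] * n
weights = [[0.01 * (i + 1) for i in range(n)],
           [-0.01 * (i + 1) for i in range(n)]]
label = 0
eps = 0.1

grad = cross_entropy_grad(weights, x, label)

full_mask = [1] * n
# Centre 2x2 square of the 4x4 grid (row-major indices 5, 6, 9, 10).
local_mask = [1 if i in (5, 6, 9, 10) else 0 for i in range(n)]

full_norms = lp_norms(x, regional_sign_step(x, grad, full_mask, eps))
local_norms = lp_norms(x, regional_sign_step(x, grad, local_mask, eps))

print("full  (L0, L2, Linf):", full_norms)
print("local (L0, L2, Linf):", local_norms)
```

In this single-step toy setting the masked perturbation changes fewer pixels and has a smaller $L_2$ norm, while $L_\infty$ is the same in both cases because every perturbed pixel moves by the same step size. Whether such a localized perturbation still fools a model, or transfers to other models, depends on the model and the selected region, which this sketch does not measure.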