Diversity can be Transferred: Output Diversification for White- and Black-box Attacks

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2003.06878

PDF

https://arxiv.org/pdf/2003.06878

Paper Information

Authors: Yusuke Tashiro, Yang Song, Stefano Ermon
Published: 3-16-2020
Updated: 10-30-2020
Affiliation: Department of Computer Science, Stanford University
Country: United States of America
Conference: Conference on Neural Information Processing Systems (NeurIPS)

Labels Estimated by AI

Adversarial Attack Methods, Poisoning, Vulnerability Attack Method

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Adversarial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or generate update directions in black-box attacks. These simple perturbations, however, could be sub-optimal as they are agnostic to the model being attacked. To improve the efficiency of these attacks, we propose Output Diversified Sampling (ODS), a novel sampling strategy that attempts to maximize diversity in the target model's outputs among the generated samples. While ODS is a gradient-based strategy, the diversity offered by ODS is transferable and can be helpful for both white-box and black-box attacks via surrogate models. Empirically, we demonstrate that ODS significantly improves the performance of existing white-box and black-box attacks. In particular, ODS reduces the number of queries needed for state-of-the-art black-box attacks on ImageNet by a factor of two.
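The abstract does not spell out the sampling rule, so the following is only a minimal sketch of the idea under stated assumptions: the diversified direction is taken to be the normalized input gradient of a randomly weighted sum of the model's logits, with the weights w_d drawn uniformly from [-1, 1]^C. The two-layer toy network, its random weights, and the names `logits`, `ods_direction`, `ods_init`, `eps`, `step`, and `n_steps` are all hypothetical and chosen for illustration; they are not taken from the authors' code.

```python
import numpy as np

def logits(x, W1, W2):
    """Toy two-layer network standing in for the attacked model f."""
    return W2 @ np.tanh(W1 @ x)

def ods_direction(x, W1, W2, w_d):
    """Unit-norm gradient of w_d^T f(x) with respect to the input x."""
    h = np.tanh(W1 @ x)
    # Chain rule for f(x) = W2 tanh(W1 x).
    grad = W1.T @ ((1.0 - h ** 2) * (W2.T @ w_d))
    return grad / (np.linalg.norm(grad) + 1e-12)

def ods_init(x0, W1, W2, rng, eps=0.1, step=0.05, n_steps=2):
    """Assumed ODS-style initialization inside an L-infinity ball of radius eps."""
    # One random weighting of the C logits, drawn uniformly from [-1, 1]^C.
    w_d = rng.uniform(-1.0, 1.0, size=W2.shape[0])
    x = x0.copy()
    for _ in range(n_steps):
        v = ods_direction(x, W1, W2, w_d)
        # Signed step, then projection back onto the eps-ball around x0.
        x = np.clip(x + step * np.sign(v), x0 - eps, x0 + eps)
    return x

rng = np.random.default_rng(0)
W1 = 0.3 * rng.normal(size=(8, 5))   # hypothetical weights, not a trained model
W2 = rng.normal(size=(3, 8))
x0 = rng.normal(size=5)

# Each call draws a fresh w_d, so the starting points differ in output space.
starts = [ods_init(x0, W1, W2, rng) for _ in range(4)]
```

Because every call samples a new w_d, each starting point pushes the logits in a different direction, which is the output diversity the abstract describes. In the black-box setting mentioned above, the same gradient would presumably be computed on a surrogate model rather than on the target itself.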

External Datasets

MNIST

CIFAR-10

ImageNet