Note that this paper is superceded by "Black-Box Adversarial Attacks with
Limited Queries and Information."
Current neural network-based image classifiers are susceptible to adversarial
examples, even in the black-box setting, where the attacker is limited to query
access without access to gradients. Previous methods --- substitute networks
and coordinate-based finite-difference methods --- are either unreliable or
query-inefficient, making these methods impractical for certain problems.
We introduce a new method for reliably generating adversarial examples under
more restricted, practical black-box threat models. First, we apply natural
evolution strategies to perform black-box attacks using two to three orders of
magnitude fewer queries than previous methods. Second, we introduce a new
algorithm to perform targeted adversarial attacks in the partial-information
setting, where the attacker only has access to a limited number of target
classes. Using these techniques, we successfully perform the first targeted
adversarial attack against a commercially deployed machine learning system, the
Google Cloud Vision API, in the partial information setting.