The problem of adversarial examples, that is, evasion attacks on machine
learning classifiers, has proven extremely difficult to solve. This is true
even when, as in many practical settings, the classifier is hosted as a remote
service and the adversary therefore has no direct access to the model
parameters.
This paper argues that in such settings, defenders have a much larger space
of actions than has previously been explored. Specifically, we deviate from
the implicit assumption made by prior work that a defense must be a stateless
function operating on individual examples, and instead explore the
possibility of stateful defenses.
To begin, we develop a defense designed to detect the process of adversarial
example generation. By keeping a history of past queries, a defender can try
to identify when a sequence of queries appears to be aimed at generating an
adversarial example. We then introduce query blinding, a new class of attacks
designed to bypass such query-based detection defenses.
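The detector is described here only at a high level. As a minimal sketch of the idea, assuming queries are compared by Euclidean distance on raw inputs (a learned similarity encoder could be substituted) and that an unusually small mean distance to the k nearest past queries signals an attack in progress:

```python
import numpy as np
from collections import deque

class StatefulDetector:
    """Sketch of a stateful defense: flag query sequences whose mutual
    similarity suggests iterative adversarial-example generation.
    History size, k, and threshold are illustrative assumptions."""

    def __init__(self, history_size=1000, k=50, threshold=1.0):
        self.history = deque(maxlen=history_size)  # recent query vectors
        self.k = k                                 # neighbors to compare against
        self.threshold = threshold                 # distance below which we flag

    def observe(self, x):
        """Record one query; return True if it looks like part of an attack."""
        x = np.asarray(x, dtype=np.float64).ravel()
        suspicious = False
        if len(self.history) >= self.k:
            # Iterative black-box attacks issue many near-duplicate queries,
            # so the mean distance to the k nearest past queries is small.
            dists = np.linalg.norm(np.stack(list(self.history)) - x, axis=1)
            suspicious = np.sort(dists)[: self.k].mean() < self.threshold
        self.history.append(x)
        return suspicious
```

What a deployed service does on a positive detection, such as rejecting the query or banning the account, is a separate policy choice that this sketch leaves open.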
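The mechanism behind query blinding is likewise not spelled out here. The sketch below assumes one natural strategy: the attacker applies fresh random transformations to each query so that consecutive queries no longer look similar to a history-based detector, relying on the classifier being roughly invariant to the transformation so the response still carries information about the original input. The `model` callable and the brightness-shift transform are hypothetical, illustrative choices.

```python
import numpy as np

def blinded_query(model, x, rng, sigma=0.05):
    """Query `model` on a randomly transformed copy of x so successive
    queries appear dissimilar to a history-based detector. The brightness
    shift and sigma are illustrative assumptions; any transform the
    classifier is roughly invariant to would serve."""
    shift = rng.uniform(-sigma, sigma)            # fresh randomness per query
    x_blinded = np.clip(np.asarray(x) + shift, 0.0, 1.0)
    return model(x_blinded)                       # output still reflects x

# e.g.: y = blinded_query(model, x, np.random.default_rng(0))
```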
We believe that expanding the study of adversarial examples from stateless
classifiers to stateful systems is not only more realistic for many black-box
settings, but also gives the defender a much-needed advantage in responding to
the adversary.