Over recent years, devising classification algorithms that are robust to
adversarial perturbations has emerged as a challenging problem. In particular,
deep neural nets (DNNs) appear to be susceptible to small, imperceptible changes
to test instances. So far, however, the line of work on provable robustness has
focused on information-theoretic robustness, which rules out even the existence
of adversarial examples. In this work, we study whether there is hope of
benefiting from the algorithmic nature of an attacker that searches for
adversarial examples, and we ask whether there is any learning task for which
it is possible to design classifiers that are robust only against
polynomial-time adversaries. Indeed, numerous cryptographic tasks can be secure
only against computationally bounded adversaries and are provably impossible to
secure against computationally unbounded attackers. Thus, it is natural to ask
whether the same strategy could help robust learning.
We show that the computational limitations of attackers can indeed be useful in
robust learning by demonstrating a classifier for a learning task on which
computationally bounded and computationally unbounded adversaries with bounded
perturbations have very different power. Namely, while computationally
unbounded adversaries can successfully attack and find adversarial examples
with small perturbations, polynomial-time adversaries are unable to do so
unless they can break standard cryptographic hardness assumptions. Our results
therefore indicate that an approach similar to that of cryptography (relying on
computational hardness) holds promise for achieving computationally robust
machine learning. In the reverse direction, we also show that the existence of
such a learning task, in which computational robustness beats
information-theoretic robustness, itself requires computational hardness, as it
implies (average-case) hardness of NP.
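
To convey the flavor of such a separation, the following is a minimal,
hypothetical sketch in Python (a toy illustration, not the construction of
this work): a classifier whose acceptance decision is gated on verifying a
message authentication code, so that turning a rejected input into an accepted
one requires forging a tag. A computationally unbounded adversary can forge by
exhaustive search over keys, while a polynomial-time adversary cannot under
standard assumptions about the MAC. The key name and functions below are
illustrative only.

import hashlib
import hmac
import os

# Toy sketch: a classifier that accepts an input (x, tag) only if tag is a
# valid MAC of x under a secret key. Producing an accepted input without the
# key amounts to forging the MAC: easy for an unbounded adversary (enumerate
# keys consistent with observed pairs), infeasible for a polynomial-time one
# if the MAC is secure.

_SECRET_KEY = os.urandom(32)  # held by the party generating labeled data


def mac(x: bytes) -> bytes:
    """Tag an instance under the secret key."""
    return hmac.new(_SECRET_KEY, x, hashlib.sha256).digest()


def classify(x: bytes, tag: bytes) -> int:
    """Output 1 only if the tag verifies; otherwise output 0."""
    return 1 if hmac.compare_digest(tag, mac(x)) else 0


if __name__ == "__main__":
    x = b"a test instance"
    t = mac(x)
    assert classify(x, t) == 1          # correctly tagged input is accepted
    assert classify(x + b"?", t) == 0   # perturbing x invalidates the tag;
                                        # re-acceptance would need a forgery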