Recent work has shown extensively that randomized perturbations of neural
networks can improve robustness to adversarial attacks. The literature,
however, lacks a detailed comparison of the latest proposals that clarifies
which classes of perturbations work, when they work, and why they work. We
contribute a detailed evaluation that elucidates these questions and
benchmarks perturbation-based defenses in a consistent manner. In particular, we show
five main results: (1) all input perturbation defenses, whether random or
deterministic, are equivalent in their efficacy, (2) attacks transfer between
perturbation defenses, so an attacker need not know the specific type of
defense -- only that it involves perturbations, (3) a tuned sequence of noise
layers across a network provides the best empirical robustness, (4)
perturbation-based defenses offer almost no robustness to adaptive attacks
unless these perturbations are observed during training, and (5) adversarial
examples in a close neighborhood of the original inputs exhibit elevated
sensitivity to perturbations in both first- and second-order analyses.