Gradient-based adversarial attacks on neural networks can be crafted in a
variety of ways, by varying how the attack algorithm uses the
gradient, the network architecture against which the attack is crafted, or both. Most
recent work has focused on defending classifiers in settings where there is no
uncertainty about the attacker's behavior (i.e., the attacker is expected to
generate a specific attack using a specific network architecture). However, if
the attacker is not guaranteed to behave in a certain way, the literature lacks
methods for devising a strategic defense. We fill this gap by simulating the
attacker's noise with a variety of attack algorithms based on the
gradients of various classifiers. We perform our analysis using a
pre-processing Denoising Autoencoder (DAE) defense that is trained with the
simulated noise. We demonstrate significant improvements in post-attack
accuracy with our proposed ensemble-trained defense, compared to a baseline
where no effort is made to handle this uncertainty.
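
To make the ensemble training idea concrete, the following is a minimal sketch, assuming a PyTorch setup with MNIST-like 784-dimensional inputs in [0, 1] and one-step FGSM as the attack; the class and function names are hypothetical illustrations, not the implementation used in the paper:

    import torch
    import torch.nn as nn

    def fgsm_perturb(classifier, x, y, eps):
        # One-step FGSM: move x along the sign of the input gradient of the loss.
        x = x.clone().detach().requires_grad_(True)
        loss = nn.functional.cross_entropy(classifier(x), y)
        loss.backward()
        return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

    class DAE(nn.Module):
        # Simple fully connected denoising autoencoder for 784-dimensional inputs.
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
            self.dec = nn.Sequential(nn.Linear(128, 784), nn.Sigmoid())

        def forward(self, x):
            return self.dec(self.enc(x)).view_as(x)

    def train_ensemble_dae(dae, classifiers, loader, eps=0.1, epochs=10):
        # Ensemble training: denoise perturbations crafted against several classifiers.
        opt = torch.optim.Adam(dae.parameters(), lr=1e-3)
        for _ in range(epochs):
            for x, y in loader:
                for clf in classifiers:  # vary the network the attack is based on
                    x_adv = fgsm_perturb(clf, x, y, eps)
                    loss = nn.functional.mse_loss(dae(x_adv), x)  # reconstruct clean input
                    opt.zero_grad()
                    loss.backward()
                    opt.step()
        return dae

At test time, the trained DAE would simply be prepended to the defended classifier, so the prediction becomes classifier(dae(x)).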