Despite the vast success of Deep Neural Networks across numerous application
domains, such models have been shown to lack robustness, i.e., they are
vulnerable to small adversarial perturbations of the input. While extensive
work has examined why such perturbations arise and how to defend against
them, we still do not have a complete understanding of robustness.
robustness. In this work, we investigate the connection between robustness and
simplicity. We find that simpler classifiers, formed by reducing the number of
output classes, are less susceptible to adversarial perturbations.
Consequently, we demonstrate that decomposing a complex multiclass model into
an aggregation of binary models enhances robustness. This behavior is
consistent across different datasets and model architectures and can be
combined with known defense techniques such as adversarial training. Moreover,
we provide further evidence of a disconnect between standard and robust
learning regimes. In particular, we show that elaborate label information can
help standard accuracy but harm robustness.
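To make the decomposition concrete, below is a minimal sketch of one way to aggregate binary classifiers into a multiclass predictor, assuming a one-vs-rest scheme; the paper's exact aggregation rule may differ, and names such as `OneVsRestEnsemble` and `make_binary_net` are illustrative, not from the paper.

```python
import torch
import torch.nn as nn

class OneVsRestEnsemble(nn.Module):
    """Aggregates K binary models into a single multiclass classifier.

    Each sub-model emits one logit scoring "class k vs. the rest"; the
    ensemble predicts the class whose binary model is most confident.
    """

    def __init__(self, binary_models):
        super().__init__()
        self.binary_models = nn.ModuleList(binary_models)

    def forward(self, x):
        # Each sub-model maps (batch, ...) -> (batch, 1) logits.
        logits = [m(x) for m in self.binary_models]
        return torch.cat(logits, dim=1)  # shape: (batch, K)

# Hypothetical usage: train each binary model on relabeled data
# ("class k" -> 1, all other classes -> 0), then classify by argmax
# over the aggregated per-class logits.
#   ensemble = OneVsRestEnsemble([make_binary_net() for _ in range(K)])
#   preds = ensemble(x).argmax(dim=1)
```

Under this reading of the abstract, each binary sub-problem is "simpler" in the sense of having fewer output classes, which is exactly the property the paper links to reduced susceptibility to adversarial perturbations.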