Deep models, while being extremely versatile and accurate, are vulnerable to
adversarial attacks: slight perturbations that are imperceptible to humans can
completely flip the prediction of deep models. Many attack and defense
mechanisms have been proposed, although a satisfying solution still largely
remains elusive. In this work, we give strong evidence that during training,
deep models maximize the minimum margin in order to achieve high accuracy, but
at the same time decrease the \emph{average} margin hence hurting robustness.
Our empirical results highlight an intrinsic trade-off between accuracy and
robustness for current deep model training. To further address this issue, we
propose a new regularizer to explicitly promote average margin, and we verify
through extensive experiments that it does lead to better robustness. Our
regularized objective remains Fisher-consistent, hence asymptotically can still
recover the Bayes optimal classifier.