In this work we study input gradient regularization of deep neural networks,
and demonstrate that such regularization admits a proof of generalization and
leads to improved adversarial robustness. The generalization proof does not
overcome the curse of dimensionality, but it holds independently of the number
of layers in the network. The adversarial robustness regularization combines
adversarial training, which we show to be equivalent to Total Variation
regularization, with Lipschitz regularization. We demonstrate empirically that
the regularized models are more robust, and that the norm of the loss gradient
at an input image can be used for attack detection.
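
To make the regularizer concrete, the following is a minimal sketch in PyTorch, not the authors' reference implementation: the classifier `model`, the penalty weight `lam`, and the choice of cross-entropy loss are assumptions for illustration. The same per-example gradient norm doubles as a simple attack-detection score.

```python
# Minimal sketch of input gradient regularization (illustrative only;
# not the paper's reference code). Assumes a classifier `model`,
# cross-entropy loss, and a hypothetical penalty weight `lam`.
import torch
import torch.nn.functional as F

def loss_with_gradient_penalty(model, x, y, lam=0.1):
    """Cross-entropy loss plus a penalty on the input gradient norm."""
    x = x.detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # Gradient of the loss with respect to the input image;
    # create_graph=True lets the penalty itself be backpropagated.
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    grad_norm = grad.flatten(1).norm(dim=1)  # per-example ||grad_x loss||
    return loss + lam * grad_norm.mean()

def gradient_norm_score(model, x):
    """Per-example input gradient norm, usable as a detection score.
    Uses the predicted label, since the true label is unknown at test time."""
    x = x.detach().requires_grad_(True)
    logits = model(x)
    loss = F.cross_entropy(logits, logits.argmax(dim=1))
    (grad,) = torch.autograd.grad(loss, x)
    return grad.flatten(1).norm(dim=1)
```

In this sketch, detection would amount to flagging inputs whose score exceeds a threshold calibrated on clean validation data.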