Recent studies on the adversarial vulnerability of neural networks have shown
that models trained with the objective of minimizing an upper bound on the
worst-case loss over all possible adversarial perturbations are more robust
against adversarial attacks. Besides exploiting the adversarial training framework,
we show that constraining a Deep Neural Network (DNN) to be linear in the
transformed input and feature space improves robustness significantly. We also
demonstrate that augmenting the objective function with a Local Lipschitz
regularizer further boosts the robustness of the model. Our method outperforms most
sophisticated adversarial training methods and achieves state-of-the-art
adversarial accuracy on the MNIST, CIFAR10 and SVHN datasets. In this paper, we also
propose a novel adversarial image generation method that leverages Inverse
Representation Learning and the linearity of an adversarially trained deep
neural network classifier.