Deep neural networks have recently achieved remarkable performance in various
applications, including vision and speech processing tasks. However, alongside
their ability to perform these tasks with such high accuracy, it has been shown
that they are highly susceptible to adversarial attacks: a small change in the
input can cause the network to err with high confidence. This phenomenon
exposes an inherent weakness in these networks and in their ability to
generalize. For this reason, providing robustness to adversarial attacks is an
important challenge in network training, and it has prompted extensive research.
In this work, we propose a novel, theoretically inspired approach to improving
network robustness. Our method regularizes the network using the Frobenius
norm of its Jacobian and is applied as post-processing, after
regular training has finished. We demonstrate empirically that it leads to
enhanced robustness with minimal change in the original network's
accuracy.
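
To make the regularizer concrete, the following is a minimal sketch of a loss with a Jacobian Frobenius-norm penalty, not the implementation used in this work. It makes several assumptions that the text above does not state: the Jacobian is taken with respect to the input, the penalty is the squared Frobenius norm added to a cross-entropy loss with a weight `lam`, and the network is a hypothetical one-layer softmax classifier. The names `forward`, `jacobian_frobenius_sq`, and `regularized_loss` are illustrative only. The Jacobian is estimated by central differences so that the example needs nothing beyond the standard library; a practical implementation would more likely use automatic differentiation.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(W, x):
    """Toy stand-in for a trained network: softmax(W x), with W a list of rows."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return softmax(logits)


def jacobian_frobenius_sq(f, x, eps=1e-5):
    """Squared Frobenius norm of df/dx at x, estimated by central differences."""
    total = 0.0
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        # Column i of the Jacobian: sensitivity of every output to input i.
        total += sum(((a - b) / (2 * eps)) ** 2 for a, b in zip(f_plus, f_minus))
    return total


def regularized_loss(W, x, label, lam):
    """Cross-entropy plus lam times the squared Jacobian norm (assumed form)."""
    probs = forward(W, x)
    cross_entropy = -math.log(probs[label])
    penalty = jacobian_frobenius_sq(lambda v: forward(W, v), x)
    return cross_entropy + lam * penalty
```

In a post-processing setting, one would start from already trained weights and continue optimizing a loss of this kind for some additional iterations. With `lam = 0.0` the function reduces to plain cross-entropy, so the original training objective is recovered as a special case.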