Adversarial training, which aims to enhance robustness against adversarial
attacks, has received much attention because human-imperceptible perturbations
of data can easily be generated to deceive a given deep neural network. In this
paper, we propose a new adversarial training algorithm that is theoretically
well motivated and empirically superior to existing algorithms. A novel feature
of the proposed algorithm is that it applies more regularization to data
vulnerable to adversarial attacks than existing regularization algorithms do.
Theoretically, we show that our algorithm can be understood as minimizing a
regularized empirical risk motivated by a newly derived upper bound of the
robust risk. Numerical experiments illustrate that the proposed algorithm
improves generalization (accuracy on clean examples) and robustness (accuracy
under adversarial attacks) simultaneously, achieving state-of-the-art
performance.
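To illustrate the general idea of regularizing vulnerable examples more strongly, the sketch below weights a TRADES-style KL regularization term by a per-example score of how easily each example is fooled. This is a minimal sketch, not the exact algorithm of this paper; the weighting rule, the function name `vulnerability_weighted_loss`, and the hyperparameter values are hypothetical.

```python
# Minimal sketch of vulnerability-aware regularization, assuming a TRADES-style
# KL regularizer; the weighting rule and all names here are illustrative only.
import torch
import torch.nn.functional as F

def vulnerability_weighted_loss(model, x, y, x_adv, beta=6.0):
    logits_clean = model(x)       # predictions on clean inputs
    logits_adv = model(x_adv)     # predictions on adversarially perturbed inputs

    # Standard cross-entropy on the clean examples (generalization term).
    ce = F.cross_entropy(logits_clean, y)

    # Per-example KL divergence between adversarial and clean predictive
    # distributions (robustness regularizer, as in TRADES).
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_clean, dim=1),
                  reduction="none").sum(dim=1)

    # Hypothetical vulnerability score: examples whose adversarial prediction
    # assigns low probability to the true label receive a larger weight.
    with torch.no_grad():
        true_prob = F.softmax(logits_adv, dim=1).gather(1, y.unsqueeze(1)).squeeze(1)
        weight = 1.0 - true_prob                          # larger = more vulnerable
        weight = weight / weight.mean().clamp_min(1e-8)   # rescale to mean 1

    return ce + beta * (weight * kl).mean()
```

In such a scheme, `x_adv` would typically be produced beforehand by an inner maximization step (e.g., PGD) before the loss is evaluated and backpropagated.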