Recently, Kannan et al. [2018] proposed several logit regularization methods
to improve the adversarial robustness of classifiers. We show that the
computationally fast methods they propose, Clean Logit Pairing (CLP) and Logit
Squeezing (LSQ), merely make the gradient-based optimization problem of
crafting adversarial examples harder without providing actual robustness. We
find that
Adversarial Logit Pairing (ALP) may indeed provide robustness against
adversarial examples, especially when combined with adversarial training, and
we examine it in a variety of settings. However, the increase in adversarial
accuracy is much smaller than previously claimed. Finally, our results suggest
that the outcome of an evaluation against an iterative PGD attack depends
heavily on the hyperparameters of the attack and may lead to false conclusions
regarding the robustness of a model.
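For concreteness, below is a minimal PyTorch sketch of the three regularizers and the PGD attack named above. The loss forms follow the common formulation of Kannan et al. [2018], but the squared-L2 pairing distance, the batch-splitting scheme used for CLP, and all hyperparameter values (eps, alpha, steps) are illustrative assumptions rather than the exact configuration evaluated in the paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L_inf PGD: ascend the cross-entropy loss, projecting back into the
    eps-ball around x after every step. Step count and step size are the
    'parameters' the abstract warns an evaluation can be sensitive to."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto the eps-ball
        x_adv = x_adv.clamp(0, 1)                 # keep valid pixel range
    return x_adv.detach()

def logit_regularizers(model, x, x_adv=None):
    """The three regularizers discussed in the abstract:
    - LSQ penalizes the norm of the logits of clean examples,
    - CLP pairs logits of two clean examples (here: the two batch halves,
      one common implementation choice),
    - ALP pairs logits of clean inputs with their adversarial counterparts."""
    logits = model(x)
    lsq = logits.pow(2).sum(dim=1).mean()
    half = x.size(0) // 2
    clp = (logits[:half] - logits[half:2 * half]).pow(2).sum(dim=1).mean()
    alp = None
    if x_adv is not None:
        alp = (logits - model(x_adv)).pow(2).sum(dim=1).mean()
    return clp, lsq, alp
```

In this formulation, CLP and LSQ act only on clean inputs and so can shrink or flatten the logit landscape without changing the decision boundary, which is consistent with the abstract's claim that they hinder gradient-based attack optimization rather than confer robustness; ALP, by contrast, ties clean logits to those of PGD-crafted inputs.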