We identify three common cases that lead to overestimation of adversarial
accuracy against bounded first-order attacks, a metric widely used as a proxy
for adversarial robustness in empirical studies. For each case, we
propose compensation methods that either address sources of inaccurate gradient
computation, such as numerical instability near zero and non-differentiability,
or reduce the total number of back-propagations for iterative attacks by
approximating second-order information. These compensation methods can be
combined with existing attack methods to yield a more precise empirical
evaluation metric. We illustrate the impact of these three cases with examples of
practical interest, such as benchmarking model capacity and regularization
techniques for robustness. Overall, our work shows that overestimated
adversarial accuracy, which is not indicative of true robustness, is prevalent
even for conventionally trained deep neural networks, and it cautions against
relying on empirical evaluation without guaranteed bounds.
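
As a concrete illustration of the first source of inaccurate gradients named
above, numerical instability near zero, consider a cross-entropy loss computed
naively as log(softmax(z)): when the model is confident, the softmax output of
a non-maximal class underflows to exactly zero, and the attack's loss gradient
degenerates to nan, making the model look spuriously robust. The following is
a minimal sketch, not the paper's own compensation method; the framework
(PyTorch), the three-class setup, and the logit gap of 200 are assumptions
chosen only to trigger float32 underflow.

    import torch
    import torch.nn.functional as F

    # Hypothetical three-class logits for a confidently classified input;
    # the logit gap of 200 makes the non-maximal softmax outputs underflow
    # to exactly 0 in float32.
    logits = torch.tensor([[200.0, 0.0, 0.0]], requires_grad=True)
    target = torch.tensor([1])  # class the attack tries to promote

    # Naive loss: the softmax probability of class 1 underflows to 0, so
    # log(0) = -inf and back-propagation yields nan -- the attack stalls
    # even though the model is not actually robust at this input.
    naive_loss = -torch.log(F.softmax(logits, dim=1))[0, 1]
    g_naive, = torch.autograd.grad(naive_loss, logits)

    # Stable loss: F.cross_entropy works in log-space (log-sum-exp trick),
    # keeping both the loss and its gradient finite and informative.
    stable_loss = F.cross_entropy(logits, target)
    g_stable, = torch.autograd.grad(stable_loss, logits)

    print(g_naive)   # tensor([[nan, nan, nan]]) -- gradient destroyed
    print(g_stable)  # finite gradient, usable by a first-order attack

Under these assumptions, an attack driven by the naive loss makes no progress
on confidently classified inputs, inflating the measured adversarial accuracy;
computing the same loss in log-space removes this artifact without changing
the attack itself.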