A key problem in research on adversarial examples is that vulnerability to
adversarial examples is usually measured by running attack algorithms. Because
these algorithms are not optimal, they are prone to overestimating the size of
the perturbation needed to fool the target model. In other words, the
attack-based methodology provides an upper bound on the size
of a perturbation that will fool the model, but security guarantees require a
lower bound. CLEVER is a proposed scoring method to estimate a lower bound.
Unfortunately, an estimate of a bound is not a bound. In this report, we show
that gradient masking, a common problem that causes attack methodologies to
provide only a very loose upper bound, causes CLEVER to overestimate the size
of the perturbation needed to fool the model. In other words, CLEVER does not
resolve the key problem with the attack-based methodology, because it fails to
provide a lower bound.
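
To make the failure mode concrete, the following is a minimal, hypothetical
sketch, not the actual CLEVER implementation: it stands in for CLEVER by
estimating the local Lipschitz constant with the largest gradient norm observed
among random samples around an input (CLEVER instead fits a reverse Weibull
distribution to such maxima), then divides the class margin by that estimate to
get a candidate robustness radius. All names here (clever_style_score, the toy
weight vector w) are illustrative and assumed for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)

def clever_style_score(x0, margin_fn, grad_fn, radius=0.5, n_samples=1024):
    """Simplified CLEVER-style robustness estimate (illustrative sketch only).

    Samples points uniformly in an L2 ball around x0, records the gradient
    norm of the class margin at each sample, and uses the largest observed
    norm as a stand-in for the local Lipschitz constant. The extreme value
    fit used by the real CLEVER score is omitted here.
    """
    d = x0.size
    # Draw uniform samples in the L2 ball of the given radius around x0.
    directions = rng.normal(size=(n_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(n_samples, 1)) ** (1.0 / d)
    samples = x0 + radii * directions
    lipschitz_est = max(np.linalg.norm(grad_fn(x)) for x in samples)
    # Estimated lower bound on the perturbation needed to cross the boundary.
    return margin_fn(x0) / max(lipschitz_est, 1e-12)

# Toy binary "classifier": the decision boundary is the hyperplane w.x = 0.
w = np.array([1.0, -2.0, 0.5])
x0 = np.array([0.05, 0.0, 0.0])          # the true distance to the boundary is tiny

# Ordinary model: margin g(x) = w.x, with gradient w everywhere.
smooth_margin = lambda x: float(w @ x)
smooth_grad   = lambda x: w

# Gradient-masked model: same decision rule, but the margin is hard-thresholded
# (a step-function output), so its gradient is zero almost everywhere.
masked_margin = lambda x: float(np.sign(w @ x))
masked_grad   = lambda x: np.zeros_like(w)

print("smooth model estimate :", clever_style_score(x0, smooth_margin, smooth_grad))
print("masked model estimate :", clever_style_score(x0, masked_margin, masked_grad))
```

Because the masked model's sampled gradients are all zero, the estimated
Lipschitz constant is numerically zero and the estimated robustness radius
blows up, even though the decision boundary is unchanged and an adversarial
example still exists within a distance of roughly 0.02 of the input. This is
the sense in which gradient masking causes a CLEVER-style score to
overestimate the perturbation needed to fool the model.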