The robustness of neural networks to adversarial examples has received great
attention due to its security implications. Despite the variety of attack
approaches for crafting visually imperceptible adversarial examples, little
work has been done toward a comprehensive measure of robustness. In this
paper, we provide a
theoretical justification for converting robustness analysis into a local
Lipschitz constant estimation problem, and propose to use Extreme Value
Theory for efficient evaluation. Our analysis yields a novel robustness metric
called CLEVER, which is short for Cross Lipschitz Extreme Value for nEtwork
Robustness. The proposed CLEVER score is attack-agnostic and computationally
feasible for large neural networks. Experimental results on various networks,
including ResNet, Inception-v3 and MobileNet, show that (i) CLEVER is aligned
with the robustness indicated by the $\ell_2$ and $\ell_\infty$ norms of
adversarial perturbations found by powerful attacks, and (ii) networks
defended with defensive distillation or bounded ReLU indeed attain higher
CLEVER scores. To
the best of our knowledge, CLEVER is the first attack-independent robustness
metric that can be applied to any neural network classifier.
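As a rough illustration of the estimation procedure described above, the sketch below samples gradient norms of the classification margin in a ball around an input, fits a reverse Weibull distribution to the per-batch maxima (the Extreme Value Theory step), and divides the margin by the resulting local Lipschitz estimate. The toy two-layer model, the sampling radius, and all parameter values are hypothetical placeholders rather than the paper's actual experimental setup; only numpy and scipy are assumed.

```python
# Minimal sketch of a CLEVER-style score, assuming a toy model (not the
# paper's reference implementation). Requires numpy and scipy.
import numpy as np
from scipy.stats import weibull_max

rng = np.random.default_rng(0)

# Hypothetical toy two-class classifier f(x) = V @ tanh(W @ x); a real
# evaluation would replace g and grad_g with a neural network's margin
# function and its backpropagated gradient.
d, h = 10, 16
W = rng.normal(size=(h, d))
V = rng.normal(size=(2, h))
x0 = rng.normal(size=d)
logits0 = V @ np.tanh(W @ x0)
c, j = int(np.argmax(logits0)), int(np.argmin(logits0))  # predicted vs. target

def g(x):
    """Classification margin g(x) = f_c(x) - f_j(x)."""
    logits = V @ np.tanh(W @ x)
    return logits[c] - logits[j]

def grad_g(x):
    """Analytic gradient of the margin for the toy model."""
    a = np.tanh(W @ x)
    return W.T @ ((V[c] - V[j]) * (1.0 - a ** 2))

def sample_in_l2_ball(center, radius, n):
    """Draw n points uniformly from the l2 ball around center."""
    u = rng.normal(size=(n, center.size))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    r = radius * rng.random(n) ** (1.0 / center.size)
    return center + r[:, None] * u

def clever_l2(radius=0.5, n_batches=50, batch_size=100):
    # 1) Record the maximum gradient norm within each batch of samples.
    maxima = []
    for _ in range(n_batches):
        pts = sample_in_l2_ball(x0, radius, batch_size)
        maxima.append(max(np.linalg.norm(grad_g(p)) for p in pts))
    # 2) Fit a reverse Weibull distribution to the batch maxima; its
    #    location parameter estimates the local cross-Lipschitz constant.
    _, loc, _ = weibull_max.fit(np.asarray(maxima))
    # 3) The score lower-bounds the l2 distortion needed to flip the
    #    prediction from class c to class j, capped at the sampling radius.
    return min(g(x0) / loc, radius)

print("CLEVER l2 score (toy model):", clever_l2())
```

Here the location parameter of the fitted reverse Weibull plays the role of the local cross-Lipschitz constant, and a larger score suggests a larger region around the input in which no perturbation can flip the prediction; because the estimate uses only random samples and gradients, it requires no attack algorithm, which is what makes the metric attack-agnostic.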