This paper investigates recently proposed approaches for defending against
adversarial examples and for evaluating adversarial robustness. We motivate
'adversarial risk' as an objective for achieving models that are robust to
worst-case inputs. We then frame commonly used attacks and evaluation metrics
as defining a tractable surrogate objective for the true adversarial risk.
This suggests that models may optimize this surrogate rather than the true
adversarial risk.
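For concreteness, a standard formalization of these two objectives (the notation below is a common convention and an assumption on our part, not necessarily the paper's exact definitions): for a data distribution $\mathcal{D}$, loss $\ell$, model $f$, and perturbation budget $\epsilon$, the true adversarial risk maximizes the loss over the full $\epsilon$-ball, while any concrete attack $A$ that returns a single point in that ball defines a tractable surrogate that can only lower-bound it:

$$R_{\mathrm{adv}}(f) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\|x'-x\|\le\epsilon}\ell\big(f(x'),y\big)\Big], \qquad \hat{R}_{A}(f) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\ell\big(f(A(x,y)),y\big)\Big] \;\le\; R_{\mathrm{adv}}(f).$$

A defense can therefore drive $\hat{R}_{A}$ down, for instance by disrupting the gradients that $A$ depends on, without reducing $R_{\mathrm{adv}}$ at all.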
We formalize this notion as 'obscurity to an adversary,' and develop tools and
heuristics for identifying obscured models and designing transparent models.
We demonstrate that this is a significant problem in practice by repurposing
gradient-free optimization techniques into adversarial attacks, which we use
to decrease the accuracy of several recently proposed defenses to near zero.
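To make this construction concrete, below is a minimal sketch of one way to repurpose a gradient-free optimizer as an adversarial attack, using SPSA-style finite differences. SPSA is an illustrative choice here, and the function name spsa_attack and all hyperparameters (eps, lr, delta, steps) are assumptions for this sketch rather than details taken from the abstract.

    # Minimal sketch (illustrative, not the paper's exact attack): an
    # SPSA-style gradient-free attack that maximizes loss_fn(x_adv, y)
    # over the L-infinity ball of radius eps around x, using only
    # forward evaluations of the model (no backpropagated gradients).
    import numpy as np

    def spsa_attack(loss_fn, x, y, eps=0.05, lr=0.01, delta=0.01, steps=100):
        # loss_fn: callable (inputs, label) -> scalar loss; the model is
        # treated as a black box queried through this function.
        x = np.asarray(x, dtype=np.float64)
        x_adv = x.copy()
        for _ in range(steps):
            # Random +/-1 (Rademacher) perturbation direction; for such
            # directions the SPSA estimator's elementwise inverse of v
            # equals v itself.
            v = np.random.choice([-1.0, 1.0], size=x.shape)
            # Two-point finite-difference estimate of the loss gradient.
            g_hat = (loss_fn(x_adv + delta * v, y)
                     - loss_fn(x_adv - delta * v, y)) / (2.0 * delta) * v
            # Ascend the estimated gradient to increase the loss.
            x_adv = x_adv + lr * np.sign(g_hat)
            # Project back into the eps-ball and a valid pixel range.
            x_adv = np.clip(x_adv, x - eps, x + eps)
            x_adv = np.clip(x_adv, 0.0, 1.0)
        return x_adv

Because this sketch queries loss_fn only through forward evaluations, gradient masking in a defense does not blunt it; a stronger variant would average g_hat over a batch of random directions at each step.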
We hope that our formulations and results will help researchers develop more
powerful defenses.