We introduce a two-player contest for evaluating the safety and robustness of
machine learning systems, with a large prize pool. Unlike most prior work in ML
robustness, which studies norm-constrained adversaries, we shift our focus to
unconstrained adversaries. Defenders submit machine learning models and try to
achieve high accuracy and coverage on non-adversarial data while making no
confident mistakes on adversarial inputs. Attackers try to subvert defenses by
finding arbitrary unambiguous inputs where the model assigns an incorrect label
with high confidence. We propose a simple, unambiguous dataset
("bird-or-bicycle") to use as part of this contest. We hope this contest will
enable a more comprehensive evaluation of the worst-case adversarial risk of
machine learning models.
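
The defender's objectives described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a model either commits to a class index or abstains (represented here as `None`), and that accuracy is measured only over the inputs the model commits to. The function names and these conventions are hypothetical and are not taken from the contest rules.

```python
from typing import Optional, Sequence

# Hypothetical convention: a prediction is a class index, or None when the
# model abstains instead of committing to a confident label.
Prediction = Optional[int]


def coverage(preds: Sequence[Prediction]) -> float:
    """Fraction of inputs on which the model commits to a label."""
    return sum(p is not None for p in preds) / len(preds)


def accuracy_on_covered(preds: Sequence[Prediction], labels: Sequence[int]) -> float:
    """Accuracy over the inputs the model did not abstain on (an assumption)."""
    covered = [(p, y) for p, y in zip(preds, labels) if p is not None]
    if not covered:
        return 0.0
    return sum(p == y for p, y in covered) / len(covered)


def confident_mistakes(preds: Sequence[Prediction], labels: Sequence[int]) -> int:
    """Number of inputs given a committed but incorrect label."""
    return sum(p is not None and p != y for p, y in zip(preds, labels))


def defense_holds(adv_preds: Sequence[Prediction], adv_labels: Sequence[int]) -> bool:
    """A defense survives only if it makes no confident mistake at all."""
    return confident_mistakes(adv_preds, adv_labels) == 0


# Non-adversarial data: the model abstains once and is right otherwise.
clean_preds = [0, 1, None, 1]
clean_labels = [0, 1, 0, 1]

# Attacker-supplied inputs: one confident mistake (predicted 1, true label 0).
adv_preds = [None, 1, None]
adv_labels = [0, 0, 1]

print(coverage(clean_preds))
print(accuracy_on_covered(clean_preds, clean_labels))
print(defense_holds(adv_preds, adv_labels))
```

Under these assumptions, abstaining on an adversarial input costs the defender nothing, while abstaining on non-adversarial data lowers coverage, so a model cannot win simply by refusing to answer. A single confident mistake on any attacker-supplied input is enough for the attacker to succeed.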