Quantum machine learning models have the potential to offer speedups and
better predictive accuracy compared to their classical counterparts. However,
these quantum algorithms, like their classical counterparts, have also been
shown to be vulnerable to input perturbations, in particular for classification
problems. Such perturbations can arise either from noisy implementations or, as
a worst-case type of noise, from adversarial attacks. To develop defence
mechanisms and
to better understand the reliability of these algorithms, it is crucial to
characterize their robustness properties in the presence of natural noise
sources or adversarial manipulation. Starting from the observation that
measurements involved in
quantum classification algorithms are naturally probabilistic, we uncover and
formalize a fundamental link between binary quantum hypothesis testing and
provably robust quantum classification. This link leads to a tight robustness
condition that constrains the amount of noise a classifier can
tolerate, independent of whether the noise source is natural or adversarial.
Based on this result, we develop practical protocols to optimally certify
robustness. Finally, since this is a robustness condition against worst-case
types of noise, our result naturally extends to scenarios where the noise
source is known. Thus, we also provide a framework to study the reliability of
quantum classification protocols beyond adversarial, worst-case noise
scenarios.