Provable robustness against all adversarial $l_p$-perturbations for $p\geq 1$

Abstract

In recent years, several adversarial attacks and defenses have been proposed. Often, seemingly robust models turn out to be non-robust when more sophisticated attacks are used. One way out of this dilemma is provable robustness guarantees. While provably robust models for specific $l_p$-perturbation models have been developed, we show that they do not come with any guarantee against other $l_q$-perturbations. We propose a new regularization scheme, MMR-Universal, for ReLU networks which enforces robustness with respect to $l_1$- and $l_\infty$-perturbations, and show how this leads to the first provably robust models with respect to any $l_p$-norm for $p \geq 1$.
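The role of $l_1$ and $l_\infty$ as the two enforced perturbation models has a simple geometric intuition: for any fixed vector, the $l_p$-norm is non-increasing in $p$, so every $l_p$-norm with $p \geq 1$ lies between the $l_\infty$- and $l_1$-norms. A minimal numerical sketch of this ordering (not the paper's actual certification argument, which reasons about the convex hull of the $l_1$- and $l_\infty$-balls):

```python
import numpy as np

# Sample a random perturbation vector and compare its lp-norms.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)

norms = {p: np.linalg.norm(x, ord=p) for p in (1, 1.5, 2, 4, np.inf)}

# For any x, ||x||_inf <= ||x||_p <= ||x||_1 whenever p >= 1:
# the l1- and l_inf-norms bracket every intermediate lp-norm.
assert norms[np.inf] <= norms[4] <= norms[2] <= norms[1.5] <= norms[1]
print("norm ordering holds:", [round(norms[p], 2) for p in (1, 1.5, 2, 4, np.inf)])
```

This bracketing explains why guarantees at the two extremes are the natural starting point, although combining them into a nontrivial guarantee for intermediate $p$ requires the stronger convex-hull analysis developed in the paper.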
