These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Motivated by the recent discovery that the interpretation maps of CNNs could
easily be manipulated by adversarial attacks against network interpretability,
we study the problem of interpretation robustness from a new perspective of
\Renyi differential privacy (RDP). The advantages of our Renyi-Robust-Smooth
(RDP-based interpretation method) are three-folds. First, it can offer provable
and certifiable top-$k$ robustness. That is, the top-$k$ important attributions
of the interpretation map are provably robust under any input perturbation with
bounded $\ell_d$-norm (for any $d\geq 1$, including $d = \infty$). Second, our
proposed method offers $\sim10\%$ better experimental robustness than existing
approaches in terms of the top-$k$ attributions. Remarkably, the accuracy of
Renyi-Robust-Smooth also outperforms existing approaches. Third, our method can
provide a smooth tradeoff between robustness and computational efficiency.
Experimentally, its top-$k$ attributions are {\em twice} more robust than
existing approaches when the computational resources are highly constrained.