Department of Computer Science and Technology, Institute for AI, BNRist Center, THBI Lab, Tsinghua-Fuzhou Institute for Data Technology, Tsinghua University
Country of Affiliation
China
Conference
International Conference on Machine Learning (ICML)
Though deep neural networks have achieved significant progress on various
tasks, often enhanced by model ensembles, existing high-performance models can
be vulnerable to adversarial attacks. Many efforts have been devoted to
enhancing the robustness of individual networks and then constructing a
straightforward ensemble, e.g., by directly averaging the outputs, which
ignores the interaction among networks. This paper presents a new method that
explores the interaction among individual networks to improve robustness for
ensemble models. Technically, we define a new notion of ensemble diversity in
the adversarial setting as the diversity among non-maximal predictions of
individual members, and present an adaptive diversity promoting (ADP)
regularizer to encourage this diversity, which leads to globally better
robustness for the ensemble by making adversarial examples difficult to
transfer among individual members. Our method is computationally efficient and
compatible with the defense methods acting on individual networks. Empirical
results on various datasets verify that our method can improve adversarial
robustness while maintaining state-of-the-art accuracy on normal examples.
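The abstract defines ensemble diversity as the diversity among the non-maximal predictions of individual members, i.e., each member's predicted probabilities with the true-class entry removed. Below is a minimal numpy sketch of one way to measure such diversity, consistent with the ADP paper's construction: L2-normalize each member's non-maximal prediction vector and take the determinant of their Gram matrix (the squared volume these vectors span). Function and variable names are illustrative, not from the paper.

```python
import numpy as np

def ensemble_diversity(preds, true_label):
    """Diversity among non-maximal predictions of ensemble members.

    preds: (K, L) array of per-member softmax outputs for one example,
           with K ensemble members and L classes.
    true_label: index of the true class, whose entry is removed so only
                the "non-maximal" predictions remain.

    Each member's non-maximal vector is L2-normalized; diversity is the
    determinant of the Gram matrix of these unit vectors. It equals 1
    when the members' non-maximal predictions are mutually orthogonal
    and 0 when any two coincide.
    """
    num_members, num_classes = preds.shape
    # Drop the true-class entry, keeping the L-1 non-maximal scores.
    mask = np.arange(num_classes) != true_label
    tilde = preds[:, mask]
    # Normalize each member's non-maximal vector to unit length.
    tilde = tilde / np.linalg.norm(tilde, axis=1, keepdims=True)
    # Determinant of the Gram matrix measures the spanned volume.
    return np.linalg.det(tilde @ tilde.T)
```

A regularizer promoting this quantity (e.g., adding its log to the training loss) would push members to disagree on the wrong classes while still agreeing on the true class, which is the intuition behind making adversarial examples harder to transfer among members.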