Institute for Artificial Intelligence, Beijing National Research Center for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University
Country
China
Conference
International Conference on Machine Learning (ICML)
These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Although ordinary differential equations (ODEs) provide insights for
designing network architectures, its relationship with the non-residual
convolutional neural networks (CNNs) is still unclear. In this paper, we
present a novel ODE model by adding a damping term. It can be shown that the
proposed model can recover both a ResNet and a CNN by adjusting an
interpolation coefficient. Therefore, the damped ODE model provides a unified
framework for the interpretation of residual and non-residual networks. The
Lyapunov analysis reveals better stability of the proposed model, and thus
yields robustness improvement of the learned networks. Experiments on a number
of image classification benchmarks show that the proposed model substantially
improves the accuracy of ResNet and ResNeXt over the perturbed inputs from both
stochastic noise and adversarial attack methods. Moreover, the loss landscape
analysis demonstrates the improved robustness of our method along the attack
direction.