The success of deep learning relies on the availability of large-scale
annotated data sets, the acquisition of which can be costly, requiring expert
domain knowledge. Semi-supervised learning (SSL) mitigates this challenge by
exploiting the behavior of the neural function on large unlabeled data. A
commonly exploited assumption in SSL is the smoothness of the neural function.
A successful example is the adoption of the mixup strategy in SSL, which
enforces the global smoothness of the neural function by encouraging it to
behave linearly when interpolating between training examples. Despite its
empirical success, however, the theoretical underpinning of how mixup
regularizes the neural function remains incompletely understood. In this paper,
we offer a theoretically substantiated proposition that mixup improves the
smoothness of the neural function by bounding the Lipschitz constant of the
gradient function of the neural network. We then propose that this can be
strengthened by simultaneously constraining the Lipschitz constant of the
neural function itself through adversarial Lipschitz regularization,
encouraging the neural function to behave linearly while also constraining the
slope of this linear function. On three benchmark data sets and one real-world
biomedical data set, we demonstrate that this combined regularization results
in improved generalization performance of SSL when learning from a small amount
of labeled data. We further demonstrate the robustness of the presented method
against single-step adversarial attacks. Our code is available at
https://github.com/Prasanna1991/Mixup-LR.
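
As a concrete illustration of the interpolation that mixup enforces, the sketch below forms convex combinations of pairs of inputs and one-hot labels with a Beta-sampled mixing coefficient. This is a minimal NumPy sketch of the standard mixup operation; the function and variable names are illustrative and not taken from the paper's implementation.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.75, rng=None):
    """Convex-combine two (input, one-hot label) pairs with a
    Beta(alpha, alpha)-sampled mixing coefficient, as in standard mixup."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y

# Toy example: mix two two-dimensional inputs with their one-hot labels.
x1, y1 = np.array([1.0, 0.0]), np.array([1.0, 0.0])
x2, y2 = np.array([0.0, 1.0]), np.array([0.0, 1.0])
x_mix, y_mix = mixup(x1, y1, x2, y2)
# Training on (x_mix, y_mix) encourages the network f to satisfy
# f(lam*x1 + (1-lam)*x2) ≈ lam*f(x1) + (1-lam)*f(x2),
# i.e. to behave linearly between training examples.
```

Constraining the Lipschitz constant of f itself, as the paper proposes, additionally bounds the slope of this linear behavior.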