These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Large pretrained models can be privately fine-tuned to achieve performance
approaching that of non-private models. A common theme in these results is the
surprising observation that high-dimensional models can achieve favorable
privacy-utility trade-offs. This seemingly contradicts known results on the
model-size dependence of differentially private convex learning and raises the
following research question: When does the performance of differentially
private learning not degrade with increasing model size? We identify that the
magnitudes of gradients projected onto subspaces is a key factor that
determines performance. To precisely characterize this for private convex
learning, we introduce a condition on the objective that we term
\emph{restricted Lipschitz continuity} and derive improved bounds for the
excess empirical and population risks that are dimension-independent under
additional conditions. We empirically show that in private fine-tuning of large
language models, gradients obtained during fine-tuning are mostly controlled by
a few principal components. This behavior is similar to conditions under which
we obtain dimension-independent bounds in convex settings. Our theoretical and
empirical results together provide a possible explanation for recent successes
in large-scale private fine-tuning. Code to reproduce our results can be found
at
\url{https://github.com/lxuechen/private-transformers/tree/main/examples/classification/spectral_analysis}.