Abstract
We study the problem of computing the privacy parameters for DP machine
learning when using privacy amplification via random batching and noise
correlated across rounds via a correlation matrix $\textbf{C}$ (i.e., the
matrix mechanism). Past work on this problem either only applied to banded
$\textbf{C}$, or gave loose privacy parameters. In this work, we give a
framework for computing near-exact privacy parameters for any lower-triangular,
non-negative $\textbf{C}$. Our framework allows us to optimize the correlation
matrix $\textbf{C}$ while accounting for amplification, whereas past work could
not. Empirically, we show this lets us achieve smaller RMSE on prefix sums than
the previous state-of-the-art (SOTA). We also show improvements over SOTA
performance on deep learning tasks. Our two main technical tools are (i)
using Monte Carlo accounting to bypass composition, which was the main
technical challenge for past work, and (ii) a "balls-in-bins" batching scheme
that enables easy privacy analysis and is closer to practical random batching
than Poisson sampling.
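As a hedged illustration of the batching scheme the abstract names, the sketch below assigns each example ("ball") independently and uniformly to one of the rounds ("bins"), so every example is used exactly once per pass over the data; this contrasts with Poisson sampling, where an example may land in zero or several batches. The function name and interface are illustrative, not from the paper.

```python
import numpy as np

def balls_in_bins_batches(n_examples, n_rounds, rng=None):
    """Illustrative balls-in-bins batching: each example is assigned
    independently and uniformly at random to one of n_rounds batches.

    Unlike Poisson sampling, every example appears in exactly one batch,
    which is closer to how random batching is done in practice.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One uniform draw in {0, ..., n_rounds-1} per example.
    assignment = rng.integers(0, n_rounds, size=n_examples)
    # Batch t collects the indices of all examples assigned to round t.
    return [np.flatnonzero(assignment == t) for t in range(n_rounds)]
```

Note that batch sizes here are random (multinomially distributed), but each example's membership is in exactly one batch, which is what makes the privacy analysis tractable without per-round composition.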