Abstract
Machine learning models can be trained with formal privacy guarantees via
differentially private optimizers such as DP-SGD. In this work, we focus on a
threat model where the adversary has access only to the final model, with no
visibility into intermediate updates. In the literature, this hidden state
threat model exhibits a significant gap between the lower bound from empirical
privacy auditing and the theoretical upper bound provided by privacy
accounting. To challenge this gap, we propose to audit this threat model with
adversaries that craft a gradient sequence designed to maximize the privacy
loss of the final model without relying on intermediate updates. Our
experiments show that this approach consistently outperforms previous attempts
at auditing the hidden state model. Furthermore, our results advance the
understanding of achievable privacy guarantees within this threat model.
Specifically, when the crafted gradient is inserted at every optimization step,
we show that concealing the intermediate model updates in DP-SGD does not
enhance the privacy guarantees. The situation is more complex when the crafted
gradient is not inserted at every step: our auditing lower bound matches the
privacy upper bound only for an adversarially chosen loss landscape and a
sufficiently large batch size. This suggests that existing privacy upper bounds
can be improved in certain regimes.
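To make the auditing setup concrete, below is a minimal sketch (not from the paper itself) of hidden-state auditing with a crafted gradient inserted at every step: DP-SGD on a toy one-dimensional quadratic loss, a max-norm canary gradient added in half the runs, and a threshold distinguisher applied only to the final iterate. All names (`dp_sgd_final_model`, `empirical_epsilon`) and hyperparameters are illustrative assumptions, and the estimator omits confidence intervals, so it yields a point estimate rather than a rigorous lower bound.

```python
import numpy as np

def dp_sgd_final_model(insert_canary, steps=200, lr=0.1, clip=1.0,
                       sigma=2.0, rng=None):
    """Run DP-SGD on the toy loss 0.5 * theta**2 and return only the
    final iterate, mirroring the hidden state threat model in which
    intermediate updates are never revealed."""
    rng = np.random.default_rng() if rng is None else rng
    theta = 0.0
    for _ in range(steps):
        g = np.clip(theta, -clip, clip)          # clipped "honest" gradient
        canary = clip if insert_canary else 0.0  # crafted max-norm gradient
        noise = rng.normal(0.0, sigma * clip)    # Gaussian mechanism noise
        theta -= lr * (g + canary + noise)
    return theta                                 # adversary observes only this

def empirical_epsilon(scores_in, scores_out, threshold, delta=1e-5):
    """Point estimate of the auditing lower bound
    eps >= log((TPR - delta) / FPR) for a threshold distinguisher."""
    tpr = np.mean(scores_in >= threshold)
    fpr = np.mean(scores_out >= threshold)
    return np.log(max(tpr - delta, 1e-12) / max(fpr, 1e-12))

rng = np.random.default_rng(0)
# The canary pushes theta in a fixed direction, so -theta serves as a score.
s_in = np.array([-dp_sgd_final_model(True, rng=rng) for _ in range(2000)])
s_out = np.array([-dp_sgd_final_model(False, rng=rng) for _ in range(2000)])
thr = np.quantile(s_out, 0.99)  # threshold chosen on the null runs
print(f"empirical epsilon estimate: {empirical_epsilon(s_in, s_out, thr):.3f}")
```

Inserting the canary at every step, as this sketch does, corresponds to the regime in which the abstract reports that concealing intermediate updates does not enhance the privacy guarantees; the partial-insertion regime would replace the every-step canary with a subsampled one.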