Reconstructing Training Data from Model Gradient, Provably

Authors: Zihan Wang, Jason D. Lee, Qi Lei | Published: 2022-12-07 | Updated: 2023-06-10

Source: https://arxiv.org/abs/2212.03714

PDF: https://arxiv.org/pdf/2212.03714

Labels Predicted by AI

Algorithm Design, Privacy Risk Management, Reconstruction Durability

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.

Abstract

Understanding when and how much a model gradient leaks information about the training sample is an important question in privacy. In this paper, we present a surprising result: even without training or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value. We prove the identifiability of the training data under mild conditions: with shallow or deep neural networks and a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on tensor decomposition to reconstruct the training data. As a provable attack that reveals sensitive training data, our findings suggest potential severe threats to privacy, especially in federated learning.
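
To make the kind of leakage described in the abstract concrete, here is a minimal sketch of a much simpler, well-known special case. It is not the paper's tensor-decomposition algorithm. It assumes a single training sample, a two-layer tanh network with bias terms, and squared loss; all names (`single_sample_gradients`, `reconstruct_input`, the sizes `d` and `m`) are hypothetical and chosen only for illustration. In this setting the gradient with respect to a hidden neuron's weights equals the gradient with respect to its bias multiplied by the input, so one gradient query at random parameters reveals the input by a simple division.

```python
import math
import random


def single_sample_gradients(x, y, W, b, a):
    """Gradients of 0.5 * (f(x) - y)^2 for f(x) = sum_j a[j] * tanh(W[j].x + b[j]).

    Returns (grad_W, grad_b), the gradients w.r.t. the hidden weights and biases.
    """
    z = [sum(w * xk for w, xk in zip(Wj, x)) + bj for Wj, bj in zip(W, b)]
    f = sum(aj * math.tanh(zj) for aj, zj in zip(a, z))
    r = f - y  # residual of the squared loss
    # dL/db_j = r * a_j * tanh'(z_j), with tanh'(z) = 1 - tanh(z)^2
    grad_b = [r * aj * (1.0 - math.tanh(zj) ** 2) for aj, zj in zip(a, z)]
    # dL/dW_j = dL/db_j * x, so each row is a scaled copy of the input
    grad_W = [[gj * xk for xk in x] for gj in grad_b]
    return grad_W, grad_b


def reconstruct_input(grad_W, grad_b):
    """Recover x by dividing one weight-gradient row by its bias gradient."""
    # Use the neuron with the largest bias gradient for numerical stability.
    j = max(range(len(grad_b)), key=lambda i: abs(grad_b[i]))
    return [gw / grad_b[j] for gw in grad_W[j]]


# Assumed toy setup: random parameters, no training performed.
rng = random.Random(0)
d, m = 5, 4  # input dimension and hidden width (illustrative values)
x_true = [rng.gauss(0, 1) for _ in range(d)]
y_true = 1.0
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]
b = [rng.gauss(0, 1) for _ in range(m)]
a = [rng.gauss(0, 1) for _ in range(m)]

grad_W, grad_b = single_sample_gradients(x_true, y_true, W, b, a)
x_rec = reconstruct_input(grad_W, grad_b)
max_err = max(abs(u - v) for u, v in zip(x_true, x_rec))
print("max reconstruction error:", max_err)
```

With several samples the gradient is a sum of such terms and this division no longer isolates any one input. That harder multi-sample setting is the one the abstract says the paper addresses with tensor decomposition.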