Labels Predicted by AI
Adversarial Learning Watermark Evaluation Reconstruction Attack
Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.
Abstract
Publicly releasing the specification of a model with its trained parameters means an adversary can attempt to reconstruct information about the training data via training data reconstruction attacks, a major vulnerability of modern machine learning methods. This paper makes three primary contributions: establishing a mathematical framework to express the problem, characterising the features of the training data that are vulnerable via a maximum mean discrepancy equivalance and outlining a score matching framework for reconstructing data in both Bayesian and non-Bayesian models, the former is a first in the literature.