These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
The machine learning problem of extracting neural network parameters has been
proposed for nearly three decades. Functionally equivalent extraction is a
crucial goal for research on this problem. When the adversary has access to the
raw output of neural networks, various attacks, including those presented at
CRYPTO 2020 and EUROCRYPT 2024, have successfully achieved this goal. However,
this goal is not achieved when neural networks operate under a hard-label
setting where the raw output is inaccessible.
In this paper, we propose the first attack that theoretically achieves
functionally equivalent extraction under the hard-label setting, which applies
to ReLU neural networks. The effectiveness of our attack is validated through
practical experiments on a wide range of ReLU neural networks, including neural
networks trained on two real benchmarking datasets (MNIST, CIFAR10) widely used
in computer vision. For a neural network consisting of $10^5$ parameters, our
attack only requires several hours on a single core.