Abstract
Deep learning models are usually black boxes when deployed on machine
learning platforms. Prior works have shown that the attributes (e.g., the
number of convolutional layers) of a target black-box neural network can be
exposed through a sequence of queries. These works share a crucial limitation:
they assume that the dataset used to train the target model is known beforehand
and leverage this dataset for the model attribute attack. In practice, however,
it is difficult to access the training dataset of a target black-box model. It
is therefore unclear whether the attributes of a target black-box model can
still be revealed in this case. In this paper, we investigate a new
problem of Domain-agnostic Reverse Engineering the Attributes of a black-box
target Model, called DREAM, without requiring the availability of the target
model's training dataset, and put forward a general and principled framework by
casting this problem as an out-of-distribution (OOD) generalization problem. In
this way, we can learn a domain-agnostic model to inversely infer the
attributes of a target black-box model with unknown training data. This allows
our method to be applied gracefully to an arbitrary domain for model attribute
reverse engineering, with strong generalization ability.
Extensive experimental studies are conducted and the results validate the
superiority of our proposed method over the baselines.