Abstract
Transfer learning has become a common solution to address training data
scarcity in practice. It trains a specified student model by reusing or
fine-tuning early layers of a well-trained teacher model that is usually
publicly available. However, besides improving utility, the transferred
public knowledge also threatens model confidentiality and may raise
further security and privacy issues.
In this paper, we present the first comprehensive investigation of the
teacher model exposure threat in the transfer learning context, aiming to gain
a deeper insight into the tension between public knowledge and model
confidentiality. To this end, we propose a teacher model fingerprinting attack
to infer the origin of a student model, i.e., the teacher model it transfers
from. Specifically, we propose a novel optimization-based method that
carefully generates queries for probing the student model. Unlike
existing model reverse engineering approaches, our proposed fingerprinting
method relies on neither fine-grained model outputs, e.g., posteriors, nor
auxiliary information about the model architecture or training dataset. We
systematically evaluate the effectiveness of our proposed attack. The empirical
results demonstrate that our attack can accurately identify the model origin
with few probing queries. Moreover, we show that the proposed attack can serve
as a stepping stone for other attacks against machine learning models, such as
model stealing.
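
To make the idea of optimization-based probing with label-only access concrete, the following is a minimal toy sketch, not the paper's implementation. The abstract does not state the optimization objective, so the sketch assumes one plausible reading: for each candidate teacher, craft a query whose teacher features match those of a probe input, then check whether the student assigns both inputs the same top-1 label. The one-layer tanh "teacher", the random linear student head, and all names (`features`, `craft_query`, `matching_rate`, `run_demo`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def features(W, x):
    """Toy stand-in for a teacher's frozen early layers: one tanh layer."""
    return np.tanh(W @ x)

def craft_query(W, x_probe, x_init, steps=1000, lr=1.0):
    """Gradient descent on 0.5 * ||f(x') - f(x_probe)||^2 with respect to x'."""
    target = features(W, x_probe)
    x = x_init.copy()
    for _ in range(steps):
        h = np.tanh(W @ x)
        # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
        grad = W.T @ ((h - target) * (1.0 - h ** 2))
        x -= lr * grad
    return x

def matching_rate(student_label, W, probes, inits):
    """Fraction of (probe, crafted query) pairs that get the same top-1 label."""
    hits = 0
    for x_probe, x_init in zip(probes, inits):
        x_fake = craft_query(W, x_probe, x_init)
        hits += int(student_label(x_probe) == student_label(x_fake))
    return hits / len(probes)

def run_demo(true_idx=1, seed=0, d=64, k=8, classes=5, n_probes=30):
    rng = np.random.default_rng(seed)
    # Three hypothetical candidate teachers (random feature extractors).
    teachers = [rng.normal(size=(k, d)) * 0.5 / np.sqrt(d) for _ in range(3)]
    # Assumption: the student reuses one teacher's features under a new head.
    head = rng.normal(size=(classes, k))
    W_true = teachers[true_idx]

    def student_label(x):
        # The attacker only observes this top-1 label, never the posteriors.
        return int(np.argmax(head @ features(W_true, x)))

    probes = rng.normal(size=(n_probes, d))
    inits = rng.normal(size=(n_probes, d))
    rates = [matching_rate(student_label, W, probes, inits) for W in teachers]
    # Infer the origin as the candidate with the highest label-matching rate.
    return rates, int(np.argmax(rates))
```

Calling `run_demo(true_idx=1)` returns the per-candidate matching rates and the index of the inferred teacher. In this toy setting the true teacher should stand out because queries crafted against it reproduce the student's internal features, whereas queries crafted against unrelated candidates match labels only by chance.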