Although machine learning (ML) models are deployed as core components in many
critical applications, they are vulnerable to various security and privacy
attacks. One major privacy attack in this domain is membership inference, where
an adversary aims to determine whether a target data sample is part of the
training set of a target ML model. So far, most membership inference attacks
have been evaluated against ML models trained from scratch.
However, real-world ML models are typically trained following the transfer
learning paradigm, where a model owner takes a model pretrained on a different
dataset, called the teacher model, and trains her own student model by
fine-tuning the teacher model on her own data.
In this paper, we perform the first systematic evaluation of membership
inference attacks against transfer learning models. We adopt the strategy of
shadow model training to derive the data for training our membership inference
classifier. Extensive experiments on four real-world image datasets show that
membership inference is effective against such models. For instance, on the
CIFAR100 classifier transferred from ResNet20 (pretrained with Caltech101), our
membership inference achieves an attack AUC of $95\%$. Moreover, we show that
membership inference remains effective when the architecture of the target model
is unknown. Our results shed light on the severity of membership privacy risks
posed by machine learning models in practice.
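
To make the shadow-training strategy and the attack AUC metric concrete, the following is a minimal sketch and not the implementation evaluated in this paper. The names `build_attack_dataset`, `attack_auc`, and `toy_train_fn` are hypothetical. The toy model simply memorizes its training points and stands in for a fine-tuned student model, sorting the posteriors is a common choice that we assume here, and the top-1 confidence is used directly as the attack score in place of a trained membership inference classifier.

```python
import random


def build_attack_dataset(shadow_data, train_fn, seed=0):
    """Return (features, label) rows: label 1 = shadow-training member, 0 = non-member.

    shadow_data: list of (x, y) pairs controlled by the adversary.
    train_fn:    hypothetical callable; takes (x, y) pairs and returns
                 predict(x) -> list of class posteriors.
    """
    data = list(shadow_data)
    random.Random(seed).shuffle(data)
    half = len(data) // 2
    members, nonmembers = data[:half], data[half:]
    predict = train_fn(members)  # the shadow model only ever sees `members`
    rows = []
    for label, split in ((1, members), (0, nonmembers)):
        for x, _ in split:
            # Sort posteriors so the features do not depend on class order (assumption).
            rows.append((sorted(predict(x), reverse=True), label))
    return rows


def attack_auc(scores, labels):
    """AUC as the probability that a member outscores a non-member (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def toy_train_fn(train_pairs, num_classes=3):
    """Toy stand-in for a fine-tuned student model: memorizes its training points."""
    seen = {x: y for x, y in train_pairs}

    def predict(x):
        if x in seen:  # overconfident on training data
            return [0.9 if c == seen[x] else 0.1 / (num_classes - 1)
                    for c in range(num_classes)]
        return [1.0 / num_classes] * num_classes  # uninformative on unseen data

    return predict


shadow_data = [(i, i % 3) for i in range(40)]
rows = build_attack_dataset(shadow_data, toy_train_fn)
scores = [features[0] for features, _ in rows]  # top-1 confidence as the attack score
labels = [label for _, label in rows]
print(attack_auc(scores, labels))
```

Because the toy model is maximally overfit, members and non-members are perfectly separable in this example. A real student model leaks less, and in practice the labeled rows would be used to train the attack classifier, which is then applied to posteriors returned by the target model.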