The collection and availability of big data, combined with advances in
pre-trained models (e.g., BERT and XLNet), have revolutionized the predictive
performance of modern natural language processing tasks, ranging from text
classification to text generation. This allows corporations to provide machine
learning as a service (MLaaS) by encapsulating fine-tuned BERT-based models as
APIs. However, BERT-based APIs have exhibited a series of security and privacy
vulnerabilities. For example, prior work has exploited the security issues of
BERT-based APIs through adversarial examples crafted with an extracted model,
whereas the privacy leakage of BERT-based APIs through an extracted model has
not been well studied. Meanwhile, owing to the high capacity of BERT-based
models, a fine-tuned model is prone to overlearning, yet what kind of
information can be leaked from the extracted model remains unknown. In this
work, we bridge this gap by first presenting an effective
model extraction attack, where an adversary can practically steal a BERT-based
API (the target/victim model) with only a limited number of queries. We further
develop an effective attribute inference attack that can infer sensitive
attributes of the training data used by the BERT-based APIs. Our
extensive experiments on benchmark datasets under various realistic settings
validate the potential vulnerabilities of BERT-based APIs. Moreover, we
demonstrate that two promising defense methods become ineffective against our
attacks, which calls for the design of more effective defenses.
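
To make the threat model concrete, the sketch below illustrates the query-based
extraction setting in Python: the adversary labels its own texts by querying the
victim API, then fine-tunes a local BERT copy on those labels. This is a minimal
illustration under stated assumptions, not the paper's exact attack;
`query_victim`, the probe texts, and the binary label space are hypothetical
placeholders.

```python
# A minimal sketch of query-based model extraction. `query_victim` is a
# hypothetical stand-in for the black-box MLaaS endpoint; this illustrates
# the general technique, not the paper's exact attack.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def query_victim(texts):
    # In a real attack this would issue API calls and collect the victim's
    # predicted labels; dummy labels are returned here for illustration.
    return [0 for _ in texts]

# Step 1: label attacker-chosen (e.g., public, unannotated) texts by
# querying the victim a limited number of times.
attacker_texts = ["an example probe sentence", "another probe sentence"]
victim_labels = query_victim(attacker_texts)

# Step 2: fine-tune a local BERT copy on the (query, victim label) pairs.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

enc = tokenizer(attacker_texts, padding=True, truncation=True,
                return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"],
                        torch.tensor(victim_labels))
loader = DataLoader(dataset, batch_size=8, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few epochs over the victim-labeled queries
    for input_ids, attention_mask, labels in loader:
        optimizer.zero_grad()
        out = model(input_ids=input_ids, attention_mask=attention_mask,
                    labels=labels)
        out.loss.backward()
        optimizer.step()

# `model` now approximates the victim and can be probed offline, e.g. as a
# starting point for attribute inference on the victim's training data.
```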