Membership inference attacks seek to infer whether individual instances were
used to train a model to which an adversary has black-box access through a
machine learning-as-a-service API. In providing an in-depth characterization of
membership privacy risks against machine learning models, this paper presents a
comprehensive study towards demystifying membership inference attacks from two
complementary perspectives. First, we provide a generalized formulation of the
development of a black-box membership inference attack model. Second, we
characterize the importance of model choice on model vulnerability through a
systematic evaluation of a variety of machine learning models and model
combinations using multiple datasets. Through formal analysis and empirical
evidence from extensive experimentation, we characterize under what conditions
a model may be vulnerable to such black-box membership inference attacks. We
show that membership inference vulnerability is data-driven and corresponding
attack models are largely transferable. Vulnerability to membership inference
varies across model types, and it varies across datasets as well.
Our empirical results additionally show that (1) using the type of target model
under attack within the attack model may not increase attack effectiveness and
(2) collaborative learning exposes vulnerabilities to membership inference
risks when the adversary is a participant. We also discuss countermeasures and
mitigation strategies.