Membership inference attacks are a key means of evaluating privacy leakage in
machine learning (ML) models. These attacks aim to distinguish members of a
model's training set from non-members by exploiting the differential behavior
of the model on member and non-member inputs. The goal of this work is to
train ML models that have
and non-member inputs. The goal of this work is to train ML models that have
high membership privacy while largely preserving their utility; we therefore
aim for an empirical membership privacy guarantee as opposed to the provable
privacy guarantees provided by techniques like differential privacy, as such
techniques have been shown to degrade model utility. Specifically, we propose
a new framework for training privacy-preserving models that induces similar
behavior on member and non-member inputs in order to mitigate membership
inference attacks. Our
framework, called SELENA, has two major components. The first component and the
core of our defense is a novel ensemble architecture for training. This
architecture, which we call Split-AI, splits the training data into random
subsets and trains a separate model on each subset. We use an adaptive
inference strategy at test time: our ensemble architecture aggregates the
outputs of only those models whose training data did not contain the input
sample. We prove that our Split-AI architecture defends against a large
family of membership inference attacks; however, it is susceptible to new
adaptive attacks. Therefore, we use a second component in our framework called
Self-Distillation to protect against such stronger attacks. The
Self-Distillation component (self-)distills the training dataset through our
Split-AI ensemble, without using any external public datasets; minimal code
sketches of both components follow this abstract. Through extensive
experiments on major benchmark datasets, we show that SELENA presents
a superior trade-off between membership privacy and utility compared to the
state of the art.
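
To make the training and inference procedure concrete, here is a minimal sketch of the Split-AI component in Python/NumPy. The sub-model count K = 8, the exclusion count L = 2, the `make_model` factory, and the sklearn-style `fit`/`predict_proba` interface are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

K = 8  # number of sub-models in the Split-AI ensemble (assumed value)
L = 2  # sub-models whose training data must exclude each sample (assumed value)

def train_split_ai(X, y, make_model, seed=0):
    """Split-AI training: for every training sample, pick L sub-models that
    must never see it, then train each of the K sub-models on the samples
    it is allowed to see."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # excluded[i] lists the L sub-models that never train on sample i.
    excluded = np.array([rng.choice(K, size=L, replace=False)
                         for _ in range(n)])
    models = []
    for k in range(K):
        idx = [i for i in range(n) if k not in excluded[i]]
        model = make_model()
        model.fit(X[idx], y[idx])
        models.append(model)
    return models, excluded

def split_ai_predict(models, excluded, i, x):
    """Adaptive inference for training sample i: average the outputs of only
    those sub-models that did not train on sample i, so the ensemble treats
    a member the same way it would treat a non-member."""
    probs = [models[k].predict_proba(x[None, :])[0] for k in excluded[i]]
    return np.mean(probs, axis=0)
```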
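
Under the same assumptions, the Self-Distillation step can be sketched as re-labeling the training set with Split-AI's adaptive-inference outputs and fitting one final model on these soft labels; the final `fit` call assumes a model that accepts soft probability targets (e.g., trained with cross-entropy against probability vectors).

```python
def self_distill(models, excluded, X, make_model):
    """Self-Distillation: query the Split-AI ensemble (via its adaptive
    inference) on the original training set to obtain soft labels, then
    train a single final model on those soft labels. Only this final model
    is released, so attackers cannot probe Split-AI's member-aware routing
    directly."""
    soft_labels = np.vstack([split_ai_predict(models, excluded, i, X[i])
                             for i in range(len(X))])
    final = make_model()
    final.fit(X, soft_labels)  # assumes the model accepts soft-label targets
    return final
```

Because each member is scored only by sub-models that never saw it, a member's output distribution matches a non-member's by construction, and releasing only the distilled model removes the query interface that adaptive attacks against Split-AI would otherwise exploit.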