In this paper, we present a novel insider attack called Matryoshka, which
employs an irrelevant, scheduled-to-publish DNN model as a carrier for the
covert transmission of multiple secret models that memorize the functionality
of private ML data stored in local data centers. Instead of treating the
parameters of the carrier model as bit strings and applying conventional
steganography, we devise a novel parameter-sharing approach that exploits the
learning capacity of the carrier model for information hiding. Matryoshka
simultaneously achieves: (i) High Capacity -- with almost no utility loss to
the carrier model, Matryoshka can hide a 26x larger secret model or 8 secret
models of diverse architectures spanning different application domains inside
the carrier model, neither of which is possible with existing steganography
techniques; (ii) Decoding Efficiency -- once the published carrier model is
downloaded, an outside colluder can exclusively decode the hidden models using
only several integer secrets and knowledge of the hidden models' architectures;
(iii) Effectiveness -- almost all the recovered models perform comparably to
models trained independently on the private data; (iv) Robustness --
information redundancy is naturally built in, making the hidden models
resilient to common post-processing techniques applied to the carrier before
publication; (v) Covertness -- a model inspector with varying levels of prior
knowledge can hardly distinguish a carrier model from a normal model.
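
To make the decoding step in (ii) concrete, the sketch below illustrates one
way a seeded parameter-sharing decoder could work: the shared integer secret
seeds a pseudorandom selection of carrier positions, and the colluder reshapes
the selected values into the hidden model's tensors. The function name, the
NumPy-based selection scheme, and all parameters are illustrative assumptions
for exposition, not the paper's exact construction.

```python
import numpy as np

def decode_hidden_model(carrier_params, hidden_shapes, seed):
    """Recover one hidden model's weights from the carrier's parameters.

    carrier_params: 1-D array of the carrier model's flattened weights.
    hidden_shapes:  shapes of the hidden model's tensors (the architecture
                    knowledge the colluder is assumed to hold).
    seed:           the shared integer secret that deterministically selects
                    which carrier positions the hidden weights live in.
    """
    rng = np.random.default_rng(seed)
    n_hidden = sum(int(np.prod(s)) for s in hidden_shapes)
    # The integer secret reproduces the pseudorandom position selection
    # used when the hidden model was trained into the carrier.
    positions = rng.choice(carrier_params.size, size=n_hidden, replace=False)
    flat = carrier_params[positions]
    # Reshape the shared values back into the hidden model's tensors.
    tensors, offset = [], 0
    for shape in hidden_shapes:
        k = int(np.prod(shape))
        tensors.append(flat[offset:offset + k].reshape(shape))
        offset += k
    return tensors

# Hypothetical usage: a colluder holding the seed and the hidden
# architecture recovers the weights from the downloaded carrier.
carrier = np.random.randn(1_000_000)  # stand-in for flattened carrier weights
hidden = decode_hidden_model(carrier, [(64, 32), (32,)], seed=2022)
```

Because the selection is fully determined by the integer secret, no side
channel beyond the published carrier is needed, which matches the decoding
efficiency property claimed in (ii).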